Introducing a New How Heroku Works

Humans, in their quest for knowledge, have always wanted to know how things work.

We sit in our bedrooms, kitchens and garages pulling things apart with eager hands, examining the bits with a glimmer in our eye as our fingers turn them around and around, wondering what they do, how they do what they do – hoping that everything still works without that pretty residual part that no longer seems to fit.

Introducing How Heroku Works

How Heroku Works follows this well-trodden path. It dissects the platform, laying its innards bare upon the table, letting us gather around and look at what's inside.

Look here, and see the muscular router pumping packets to and fro. Look there, and see the dyno manager in all its glory, effortlessly orchestrating the platform.

Look yonder and see the database's WAL-E continuously archiving data bits.

And there, behold the dynos: like mitochondria, they are the powerhouse of the platform, running your applications.

History

Like Galen's contributions to an early understanding of the human circulatory system, Heroku's venerable How it Works diagrams have been instrumental in advancing our understanding of Heroku.

[image: Heroku's original "How it Works" diagram]

This monument to progress provided a pioneering map of the Heroku platform. Etched with a surgeon's eye, its stylish and sleek lines drew praise from around the world, while its descriptive text provided some solace to those wanting to know more, wanting to understand how it worked.

Going forward

But we were left wanting. How, really, is the foot bone connected to the leg bone? How, really, is my code transformed from a git push into a slug into a release into something that executes, making use of config vars and third-party add-ons whilst unifying logging via Logplex, all running on a set of dynos, controlled by a dyno manager?

How Heroku Works is intended to answer these questions.

The article provides a high-level, accurate, technical overview of the Heroku platform.

Describing the platform required a certain balance. Too detailed, and you'll get lost in the mire of minutiae. Too broad, and you'll just have a caricature.

We hope you appreciate this struggle, and the resulting text – which is generous in its linking to other documents that provide deeper material.

Two views: static and dynamic

It's difficult to describe an organism. Do you describe the body parts, and how they fit together (the static view, the deploy-time view), or do you describe the journey of blood and electricity (the dynamic view, the run-time view)?

How Heroku Works describes both – using words – in a story that takes you through a journey of the major components of the platform. A sequential reading is necessary (some components are intertwined with others), but in the end you should have a pretty solid understanding of both the run-time and deploy-time views.

You should be rewarded with a better understanding of how it all fits together – how you go from code to executing bits.

Design choices

This description of the Heroku platform is radically different from its predecessor. Here are some of the design choices that went into its creation:

  • Audience: we're assuming a much more technically savvy audience in this description. You're tinkerers, makers. You want to know how stuff works.
  • Timing: this article is an optional read, but a really great read after deploying your first couple of apps.
  • Language: while describing the platform we found a few terms that were a little too nebulous, so we changed them. The "Routing Mesh" is now simply called "routers". The "Dyno Manifold" is now called the "dyno manager". Both concisely describe the components, and don't require you to look up additional descriptions.
  • Words: you will have to read – instead of gaze at awesome pictures. We'd love to iterate on this, and move it towards something that has a little more visual allure – but hope you instead enjoy an accuracy and detail difficult to depict in pretty pictures.

Please use the feedback box at the bottom of the article to send me any feedback.

Thank you!
Jon

The Heroku Changelog

The Heroku Changelog is a feed of all public-facing changes to the Heroku runtime platform. While we announce all major new features via the Heroku blog, we’re making small improvements all the time. When any of those improvements have any user-visible impact, you’ll find them in the changelog.

Some recent examples of posts to the changelog include new versions of the Heroku CLI, a new error code, and changes to logging.

To get the latest on changes like these, visit the Heroku Changelog, or subscribe via feed or Twitter.

Let a human test your app, not (just) unit tests


I’m a big believer in unit testing. We unit test our Rails apps extensively, and we’ve done so for years. On some projects, we do both unit testing and integration testing using Cucumber. I preach unit testing to everyone I can. I’d probably turn down a project if the client wouldn’t let us write tests (though this has never come up, and I don’t think it would be a hard sell).

But for a long time, that’s all I did on my projects. Our clients and users would find the bugs that got past the developers. They were, in effect, our QA testers. (I think a lot of small/agile teams are the same way; based on my experience, I’d be surprised if more than 20% of Rails projects were comprehensively tested by a human.)

This is not right. A good QA tester is worth the surprisingly modest expense.

If I unit test, do I really need to hire a QA tester?

Keep on writing unit tests. But unit tests and human testing are two completely different things. They both aim to increase code quality and decrease bugs, but they do this in different ways.

Developer (unit) testing has three benefits. It:

  • Makes refactoring possible. Don’t even try to refactor a large app without a test suite.
  • Speeds up development. I know there are some haters who deny this, but they’ve either never really given unit testing a chance, or their experience has been 180° different than mine.
  • Eliminates some bugs. Not all, but some.

Human testing has related, but somewhat different, benefits. It:

  • Eliminates other bugs. Unit tests are great for certain categories of bugs, but not for others. When a human walks through an application with the express purpose of making things break, they’re going to find things that developer-written unit tests won’t find.
  • Acts as a “practice run”. Before letting a client, boss, or user see a change, let a QA tester see it. You’d be surprised how many 500 errors and IE incompatibilities you can avoid.
  • Gives you confidence before you deploy. After working with good QA testers, I can’t imagine deploying an app to production without having a QA tester walk through it.
  • Saves you time. If you don’t have a QA role on your project, your developers will be de facto testers. They probably won’t do a good job at this, since they’ll be hoping things succeed (rather than making them fail). And their time is probably more expensive than a good tester’s time.

How to use a QA tester in an agile project

Agile testers should do four things.

First, they should verify or reject each story that is completed. Every time a developer indicates that a feature or bug is completed, whether you use a story tracker or index cards, a QA tester should verify this. Don’t deploy to production until the tester gives it a thumbs-up.

Second, they should do exploratory testing after every deploy. A few minutes clicking around in production can sniff out a lot of potential errors.

Third, they should test edge cases. What happens if a user types in a username that is 300 characters long? What if they try to delete an item that is still processing? What if they upload a PDF file as an avatar? Testers are great at this sort of thing.
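When a tester does turn up one of these edge cases, it's worth pinning it down as a regression test so it stays fixed. A minimal sketch (the Username class and its 40-character limit are made up for illustration; Minitest stands in for whatever test framework you use):

```ruby
require "minitest/autorun"

# Hypothetical model code: a username validator with a length limit.
class Username
  MAX_LENGTH = 40

  def self.valid?(name)
    !name.strip.empty? && name.length <= MAX_LENGTH
  end
end

# Regression tests for edge cases a human tester might have found.
class UsernameEdgeCaseTest < Minitest::Test
  def test_rejects_a_300_character_username
    refute Username.valid?("a" * 300)
  end

  def test_rejects_a_blank_username
    refute Username.valid?("   ")
  end

  def test_accepts_a_normal_username
    assert Username.valid?("jon")
  end
end
```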

Fourth, they should test integrations. Unit tests can’t (and shouldn’t) test multi-step processes. Integration testing tools like Cucumber are OK, but don’t catch everything. Identify the main multi-step processes on your site, and have a human verify them every time they change.

Expect a tester to increase your development costs by 5%-10%. We find that 1 hour of testing for every 6 hours of developer time is a reasonable estimate. Our testers cost about 40% less than our developers. So on a typical invoice, testing services are about 10% of development services.
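That 10% figure is just arithmetic, and easy to sanity-check. A quick sketch, using the estimates from this post (not universal constants):

```ruby
# Back-of-the-envelope check of the estimate above.
dev_hours  = 6.0  # developer hours per chunk of work
test_hours = 1.0  # testing hours per 6 developer hours
rate_ratio = 0.6  # testers bill ~40% less than developers

testing_share = (test_hours * rate_ratio) / dev_hours
# testing_share ≈ 0.1 — testing is about 10% of development services
```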

Bill separately for testing. Don’t just roll it into your developer rate. Clients are more likely to object to a 10% increase in your main hourly rate than a separate, lower testing line item.

Finding a good tester

There are two main ways to find a tester.

First, you can train one. Tech-savvy folks who aren’t programmers are a good option. They understand enough to fit in with your development process, but are happy testing and not coding. If you find the right person, they can be testing in no time, and won’t cost a ton of money.

Second, find one that understands agile development. There are plenty of professional testers out there, but most of them do waterfall testing: spend 3 weeks writing test cases, get release from developers, and spend 3 weeks testing. I can say, without hyperbole, that this is how exactly 0% of Rails development projects work. Look for the small number of testers that actually have experience with iterative development, flexible scope, and rapid turnaround. You can sometimes find these people at agile events (conferences or user groups). Otherwise, ask other developers. I found one via referral, and I’ve since referred him to others. This second category will probably be more expensive than the first, but if you want to ship the best code you can, go with this route. Just make sure you avoid a Zompire Dracularius.


Building a Video Delivery Network in 48 hours

Last weekend, I participated in my first Rails Rumble. Rails Rumble is a 48-hour app building contest. We started from scratch Friday evening – you can have concepts and notes on paper, but no code or digital UI assets – and stopped Sunday evening, after 48 hours. You can use open-source code and public web services, and we made liberal use of both.

Our team consisted of myself and three of the Sevenwire crew: @fowlduck, @brandonarbini, and @steveheffernan. That’s two developers (Nate and myself), one developer/UI combo (Brandon), and one UI guy (Steve). All in all, a really good mix for the app. We’re also the team behind two video encoding services: Zencoder and FlixCloud.

Check out our app (and the 21 other great finalists) and vote at http://r09.railsrumble.com/entries. Voting ends this weekend, so do it soon.

The App

Our project was ZenVDN, a video distribution network. In other words, a place to upload video that you want to publish, e.g. via your blog or website. Upload one or more videos, and they’re transcoded into web and mobile formats, and sent to a Content Delivery Network for distribution.

After that, you’re given a page to manage each video, with HTML embed code to plug the video directly into your blog or website. You can also link directly to the videos, if you want to use your own player. And finally, each video has a public page on the ZenVDN site if you want to share the video directly.

So it’s a complete start-to-finish video publishing platform. Let’s say you’re Ryan Bates of RailsCasts. You can compress, upload, and host your own video files manually, or you can use a service like ZenVDN to do that for you. (I emailed Ryan about this, by the way, and he prefers the manual route. 😉)

Another way to look at it: a better YouTube for video publishers. YouTube and its peers were designed for wide-scale video sharing, not for video producers and content owners. If you don’t mind YouTube’s quality and watermark, and you don’t mind your video being shared publicly on YouTube, ZenVDN probably isn’t for you. But if you want better quality and to own distribution of your videos, check us out.

What’s cool? A multi-file uploader with progress; direct uploads to our CDN, for speed and scalability; video watermarking; video thumbnails; wide input video support; a Flash Player integrated into the embed code; and detailed statistics (by video, by date, by format).

What’s missing? Again, it’s a working end-to-end product, but we’d like to do a lot more. Examples: Ogg support (for HTML 5), an RSS feed for videos, more public/sales information, and better privacy controls.

And, of course, paid subscriptions. We hoped to get e-commerce done during the Rumble, probably using Spreedly, but we ran out of time. Maybe in a 72-hour Rumble. In the meantime, our Free level limits the number of videos you can upload, and the amount of video you can stream. A paid level would increase these limits and let you use your own watermark (instead of the ZenVDN watermark on free accounts).

All in all, we’re really happy with where we ended up. I’m proud to say that Obie briefly questioned whether we could build the whole thing in a weekend. That’s praise.

The experience

The Rumble was way more fun than I expected. I had just worked a hard week, and a part of me was dreading the prospect of a long weekend of work. But it was actually a blast.

Why? Development flow, I think. Development flow is a really fun experience. It’s probably why most of us are developers, after all. We like to build things; we like to solve problems; and we like to work effectively. I’ll sometimes go days, or even weeks, without experiencing concentrated development flow like I did during the Rumble. (Stupid meetings.) So the Rumble was a really great experience.

Our team really clicked. We had a great mix of skills: across the four of us, we had one designer, two front-end coders, and three back-end coders. Besides the initial design concepts, every task could have been handled by more than one person, so tasks rarely sat in the queue for long.

We tried really hard to avoid rushing at the end. We stopped development with 3 hours to go, and two of us started testing, while two others recorded the screencast for the homepage. But it didn’t work out quite so smoothly. The screencast wasn’t done until about T-30 minutes, and we were checking in fixes and refinements until about 6:45. Then a minor Git snafu, and panic ensued. Our final submission came down to the wire.

Finally: sleep and breaks. Call me weak, but I like to sleep. I got 8 hours/night during the Rumble, which definitely improved my experience (and the quality of my code). We ate lunch at our desks, but took a 90-minute dinner break on Saturday, and stopped several times for a game of darts.

Lessons learned

1. Blitzes can be fun and effective. I’m inclined to try a Rumble-like iteration every few months, to avoid project monotony, and to ship stuff quickly when necessary. I did ~3-4 days of work during the Rumble, so I figure a 3-day Rumble plus 2 days of vacation evens out to about a week of work.

2. Focus is essential. If I had three 30-minute meetings during the Rumble, my contribution would have been cut in half. Good reminder of makers’ schedules.

3. Don’t rush at the end. We left three hours for testing and padding. We should have left six.

4. Prioritize well. If we had tackled e-commerce on Day 2 (as I almost did), we wouldn’t have finished our core product. Build the Minimum Viable Product first, and then move on to concentric circles of improvement.

5. Small projects can work. I have a bias against small projects; 3 month gigs feel so much more comfortable to me as a consultant than 3 week gigs. But done properly, shorter projects can work fine. We did ~$15,000 worth of work over the course of the weekend. No reason that experience couldn’t translate into a client project.

Next steps

So what’s next for ZenVDN? We’d really like to get a few video publishers using it. (Talk to me if you want to be a beta customer.)

And we want to monetize the site, of course.

We think it complements our suite of video-related products well – Zencoder is the core software; FlixCloud makes it an easy web service; and ZenVDN brings video publishing one step closer to the producers.

We have some other ideas for ZenVDN. But if you have an interest in online video, or are a publisher/producer yourself, we’d love to talk more!


ActiveRecord referential integrity is broken. Let’s fix it!

ActiveRecord supports cascading deletes to preserve referential integrity:

class User < ActiveRecord::Base
  has_many :posts, :dependent => :destroy
end

But you really only want cascading deletes about half the time. The other half, you want to actually restrict deletion of a record with dependencies. ActiveRecord doesn’t support this.

Think of an e-commerce system where a user has many orders. Once an order has gone through, you shouldn’t be able to delete the user who placed the order. You need a record of the order and the user who placed it.

Or even more obvious, think of a lookup table. An Order might have several of these dependencies: OrderStatus, Currency, DiscountLevel, etc. In all of these cases, you want ON DELETE RESTRICT, not ON DELETE CASCADE. But Rails doesn’t support this. That’s dumb.

If you agree, head on over to the Rails UserVoice site and make your opinion known! There is a ticket for this already. Vote it up if you think Rails should implement this.

The solution to the problem is really pretty simple. ActiveRecord just needs something like this:

class User < ActiveRecord::Base
  has_many :posts, :dependent => :restrict
end

In this case, if you try to destroy a user that has one or more posts, Rails should complain. You’ve told the app: “Don’t let me delete users who have posts!” The easiest way to do this is to have Rails throw an exception, and have your controller capture the exception and print a flash message. Other approaches could work too.
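Stripped of ActiveRecord, the behavior being proposed is small enough to sketch in plain Ruby (the exception name and the User/posts shape here are illustrative, not an existing Rails API):

```ruby
# Plain-Ruby sketch of the proposed :dependent => :restrict behavior.
# DeleteRestrictionError is a made-up name for illustration.
class DeleteRestrictionError < StandardError; end

class User
  attr_reader :posts

  def initialize(posts = [])
    @posts = posts
  end

  def destroy
    unless posts.empty?
      raise DeleteRestrictionError, "Cannot delete User: dependent posts exist"
    end
    :destroyed
  end
end
```

A controller would then rescue that exception and turn it into a flash message, as described above.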

So why is this important?

1. It’s common. Every project should maintain referential integrity in some way, and :dependent => :destroy isn’t always appropriate. Who wants to do a cascading delete from roles to users, or manufacturers to products, or order_statuses to orders? I don’t think I’ve ever worked on a project where cascading deletes were always appropriate. Any lookup table, at minimum, needs this feature. (I personally prefer to maintain referential integrity with foreign keys, but even still, I’d love to have an application-level check first, which would be easier to rescue. And some projects don’t use foreign keys.)

2. It fits with the Rails philosophy. Rails says “Let your application handle referential integrity, not the database”. But without :dependent => :restrict, one of the most important pieces of referential integrity is missing.

3. It’s easy. 9 lines of code to add this to has_many. Check out this gist: http://gist.github.com/170059.

Someone wrote a plugin for this, but it has the distinct disadvantage of not working anymore. This should really be a core feature anyway, at least as long as :dependent => :destroy is a core feature.

The UserVoice suggestion for this is at http://rails.uservoice.com/pages/10012-rails/suggestions/103508-support-dependent-restrict-and-dependent-nullify.

ActiveRecord referential integrity is broken. Let’s fix it!


Rails 2.3.3 upgrade notes: rack, mocha, and _ids

I upgraded two apps to Rails 2.3.3 today. It’s a minor release, and there’s not much to report. But I did run into three minor problems.

Mocha

Mocha 0.9.5 started throwing an exception:

NameError: uninitialized constant Mocha::Mockery::ImpersonatingAnyInstanceName

A quick update to Mocha 0.9.7 cleared this up.

Array parameters in tests

In functional tests with Test::Unit, passing an array to a parameter stopped working. Previously, I had something like this:


post :create, :user => {:role_ids => [1,2,3]}

This would post the following parameters:


"role_ids"=>["1", "2", "3"]

But after the 2.3.3 update, I started seeing an error:

NoMethodError: undefined method `each' for 1:Fixnum

I’m not sure why this stopped working. (Anyone know?) Changing the integers to strings clears up the error:


post :create, :user => {:role_ids => ["1","2","3"]}

Or


post :create, :user => {:role_ids => [1.to_s,2.to_s,3.to_s]}
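If many tests pass arrays this way, a small helper can stringify every value in a nested params hash instead of editing each call site. This helper is hypothetical (not part of Rails or Test::Unit); it just mimics what a real HTTP request would post.

```ruby
# Hypothetical helper: recursively convert param values to strings,
# mimicking what a real HTTP request would post.
def stringify_params(value)
  case value
  when Array then value.map { |v| stringify_params(v) }
  when Hash  then value.each_with_object({}) { |(k, v), h| h[k] = stringify_params(v) }
  else value.to_s
  end
end
```

Then `post :create, stringify_params(:user => {:role_ids => [1, 2, 3]})` behaves like the hand-stringified version.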

Rack

Rack apparently no longer comes bundled with Rails. Or at least, deployment failed on cap deploy with: RubyGem version error: rack(0.4.0 not ~> 1.0.0).

The solution was simple: install (or vendor) Rack 1.0.0.


config.gem 'rack', :version => '>= 1.0.0'


Estimating software: a rule of thumb

Estimating software is hard, but most of us have to do it – whether we’re estimating an entire project for a client, or a new feature for a boss, or a change to one of our own projects.

I’ve found the following rule helpful when estimating software. This comes from about four years of estimating Rails projects to consulting clients, and moving from bad – dramatically underestimating fixed-bid projects – to pretty good – usually overestimating time & materials projects slightly. (And more importantly, knowing when I can’t estimate, because the scope is too vague or too large.)

Jon’s Law of Estimates

Software difficulty is primarily determined by volume, logic, and integration.

Jon’s Law of Estimates, explained

1. Volume is easy to understand. If you’re building software that does more, it will require more work. So if you’re estimating a project that stores recipes, and you’re estimating another project that stores recipes AND shopping lists, you can expect that the second one will take more work (if everything else is equal).

2. Logic refers to the rules or business logic behind a feature. The more rules there are, the more work there is. Imagine that our recipe system requires that recipes from some users are manually approved by an administrator, and checks to see that each ingredient in the recipe is present in the step-by-step instructions, and only allows a user to post 3 recipes per hour, and lets users propose alternative versions of a recipe, and lets an alternative version replace the regular version if it achieves a certain rating, etc. That’s more work than a recipe system that just lets users create and rate recipes, even though the volume of features may not be any larger.

Interestingly, a technology can make some logic trivial and some logic hard. Nested forms are a great example of this. Before Rails 2.3, Rails made it trivial to do CRUD on a single table at a time, but difficult to handle multiple tables. Now it is (almost) trivial to do CRUD on multiple tables at a time.

3. Integration points are usually deserving of special consideration in an estimate. This includes talking to a web services API, another local software system, a data feed, a complex library, etc. Not only do integration points often take time to get right, but they can become sinkholes of time when the documentation is inadequate or incorrect, the other system doesn’t play nice, or you can’t easily test the integration. And your estimate depends on something out of your control: the other system.

External factors

These rules only apply to the difficulty of the software. Several external factors are important as well. These include, most notably, the client and the team. The client can make a project easy, or they can make a project difficult. Similarly, the right team might be able to blaze through a project quickly, while the wrong team may never finish at all.

The other side of estimating

Here’s the thing about these rules: they’re relative, not absolute. There is no rule that says “Features take 5 days, and integration points take 10”. So estimating requires comparisons. This means that if you’ve never built a Rails app before, you’ll have trouble estimating a Rails project. But once you’ve built a few, you can compare the volume, logic, and integration points of a new project to the volume, logic, and integration points of the previous ones.

So estimating requires intuition and experience as well as analysis (e.g. Jon’s Law of Estimates). The key to estimating is to combine analysis and intuition, and to let each side refine the other.

Music and programming: interviews with Chad Fowler and Dave Thomas

I’ll be speaking at RailsConf 2009 this year on music and software development (Five musical patterns for programmers). The basic premise is that software development and music actually have quite a bit in common. This may be surprising to some people, who see programming as a cold, rational left-brain sort of thing, like science. But we programmers know that this is not really the case at all.

So as a prelude to my talk, I decided to interview two programmer-musicians on the subject: Chad Fowler and Dave Thomas. Both compose and perform music, and both are noted programmers. Here is the interview.

Rail Spikes: Tell us a little about your background with both programming and music.

Chad Fowler: I started my professional life as a saxophonist in Memphis. I played the Beale street clubs and all the typical Memphis professional musician stuff. Among others, I played for a while with Ann Peebles and her husband Don Bryant with the rhythm section from all the old Hi Records recordings. I did mostly R&B and jazz professionally but I was probably most well known in the Memphis community for making “strange” music. Before playing music professionally, I played guitar in punk bands in high school. I was a fan of punk, heavy metal, hip hop, pop, (new) classical and pretty much everything else. As I immersed myself in the world of jazz, it became quickly clear that the jazz community doesn’t like punk and other less “serious” types of music and has an almost religious negative reaction to jazz musicians who do.

It was almost as if any deviation from the “normal” world of jazz made you a traitor. So I did the natural thing: started a group called The Jazz Traitors, which played music that 1) we loved and 2) offended the jazz community (not necessarily in that order).

I was also very interested in composing “classical” music. I studied with a composer named Kamran Ince, who is still my favorite such composer.

As for programming, I’ve been interested in programming since I was a young child using my Commodore 64. I wasn’t really that good at it as a kid but I played around a lot. I didn’t get serious until I picked up programming again as a hobby while I was a professional musician. After a late night gig at a bar, it was relaxing to go home and unwind to some C programming tutorials. I didn’t have a need to program, nor did I have a project in mind (except that I have always loved video games and wanted to learn how they worked). But I got so into it, that I ended up getting a job in computer support because a friend filled out an application for me.

Being the gamer I am, as soon as I started in computer support, I naturally wanted to “level up”. That meant becoming a network administrator. Then a system administrator. Then a programmer, then a designer, then an architect, then a CTO, etc. Now here I am. It’s been fun.

Dave Thomas: There was always a lot of music in our house. My father liked to play the piano and the organ (I learned to solder as he built a Heathkit organ from a kit in the late 60s). My mother liked Broadway musicals. So we’d often experience alternating hours of Chopin and South Pacific. My brother was also musical. I wasn’t particularly, but I enjoyed noodling on the piano, and spent hours just playing with chords and progressions.

I’ve been programming since I was 15 or so.

Rail Spikes: Some developers – yourself included – have suggested a similarity between programming and music composition or performance. How exactly are music and programming similar?

Dave Thomas: I’m not sure, but I think it might be something to do with the discovery of patterns. Both music and code consist of nested sets of variations and repetitions. There’s a rhythm to executing code, in the same way there’s a rhythm to music. It is never exact, but it’s there. After a while, I found I could imagine the rhythm and structure of my programs as they run, in the same way you can pick apart the structure of a piece of music as you listen to it. And, just as with music, it takes experience to be able to feel the deeper structures and notice the more extreme variations. But being able to spot them in programs makes coding simpler and more interesting. The basic coding structures—loops, method calls, and so on—provide the framework for composing in the same way that staff and bar lines do for music. Algorithms are like the progressions, and data becomes the notes. And in the same way that good music takes all these things and then surprises you, good code does the same thing. It isn’t mechanical and repetitive: instead it uses the constraints to build something bigger and more interesting.

Chad Fowler: It’s hard for me to put my finger on. There’s something similar in the way I think when I do each.

I think it all boils down to language, though. In all of these cases (including learning actual language), you take a bunch of tokens (notes, sounds, grunts, functions, classes) and combine them into a grammar which you use to express ideas. The way you do that is totally up to you as long as the intended ideas are communicated. With computer programs, they have to do what they’re meant to do. With music, they express or evoke emotions, paint pictures, cause anxiety or whatever.

Some computer programs evoke emotions and cause anxiety as well.

Rail Spikes: Is Ruby development more like improvised jazz or composed classical music?

Chad Fowler: I think it’s both. And I don’t think Ruby is any different in this than other languages. Much of the discussion about the relationship between programming and music focuses on the more obvious idea of programming as composition. It makes sense, since programmers tend to sit and type their ideas into an editor and then eventually execute it. The programs can be checked, tested, refactored, etc. before the actual performance. This is how classical composition works as well.

But the less obvious angle is that in many situations, programming is like performance. In fact, even in music, improvisation is really just real time composition. You don’t get a chance to refactor because your “code” is executed as you write it.

I’ve had this same feeling while debugging production problems, hacking new features on a tight deadline, or sometimes during the initial creation of an application. The same synapses are firing as when I was trying to play Cherokee at 200 beats per minute. Mistakes can’t be erased, so they have to be nuanced into (worst case) insignificant events or (best case) important drivers behind the work.

From a purely development-oriented perspective, TDD is more like improvisation than composition. I think that’s what I like about it. It’s motivating and creative in an exciting, time-sensitive way. You take small steps and see where they lead you. Sure, you can always revert your changes if you paint yourself into a corner but part of the fun and challenge is to not paint yourself into a corner.

One thing jazz musicians like to say is that every wrong note is just a half step away from a right note. TDD is like that. You might take a slightly wrong turn. It’s fun to see if you can course-correct without starting over.

Rail Spikes: Do developers need to be musically inclined? Does it help?

Chad Fowler: Obviously not. Some of the best programmers I know are not musicians. I can’t tell if it helps, but I would guess that developers who are also musicians are different than developers who aren’t. I don’t think that’s because being a musician changes people, though. I think it’s because the people who are both are the kind of people who need to do both.

This usually means they’re “right brain” people. This leads to a way of thinking that changes how they approach programming problems.

I think learning music (or another right brain discipline) is a good way to exercise your mind. So I wouldn’t be surprised if learning music helps people exercise their thought processes in ways that will benefit their work as programmers (or authors, or lawyers, or doctors or whatever).

I also think, though, that if we were all musicians at heart, we wouldn’t get much done. I rely heavily on my less artsy colleagues to ground me and be sometimes more pragmatic than I am. So I don’t think we all need to be a “right brain” programmer. It would be disastrous if we were.

Dave Thomas: Do they need to be? No. But many of the good ones I know are. I’d guess that density of musicians in software development is many times the population norm. But that means you could also ask the question “Do musicians have to know software development?”

I think the more interesting question is to ask “how can people best express what they enjoy doing?” because both music and software development are outlets for this.

Rail Spikes: What sort of music do you listen to? Any recommendations for Ruby developers looking to expand their musical horizons?

Chad Fowler: As I mentioned earlier, I like all kinds of music (with a few exceptions). Lately I’ve been listening to a lot of instrumental hip hop, such as DJ Qbert and Mixmaster Mike. I’ve also been getting into a genre of electronic music called “electro”, which sounds like the bleeps and bloops that are the soundtrack of my dreams (if a computer is going to generate music I always like it to sound like a computer generated it).

As for recommendations, here are a few ideas for things that most developers probably haven’t listened to:

  • Kamran Ince – He was my composition teacher and, I think, an accessible introduction to the world of “new music”, which is what we call new composed “classical” music. The term “classical” is a widespread misnomer. It actually refers to music written in the late 18th and early 19th centuries, but most people use it to mean high brow music written for instruments like violins. So whatever you call it, Kamran Ince writes some beautiful instances of it. Specifically check out his chamber music, such as Domes and Arches.
  • Charlie Wood – I have had the pleasure of playing with Charlie on a few occasions. He is a R&B singer/organist/composer from Memphis and writes some of the most intelligent songs you’ll hear. My favorite album of his is “Who I Am”.
  • John Zorn – Zorn has been around for a long time and is a leader in the world of Avant Garde music. He’s also one of the most amazing saxophonists ever. If you’re new to this kind of thing, his Masada quartet (“radical Jewish music”) produces some great stuff that’s accessible to first time listeners. If you’re looking for something to shock your aural taste buds, try Painkiller (metal-tinged noise) or Naked City.

Dave Thomas: I listen to just about anything that’s interesting. My playlist here is very varied, and I try to add new stuff to it fairly regularly. I know people who are trained as musicians, and I tend to ask them what they’re listening to. Sometimes that leads to challenges: my ear isn’t as developed as their ears. But often it leads to whole new areas of cool stuff. So I’d recommend everyone should find a friend who knows more than you do about music and ask them to surprise and challenge you. (That advice probably applies to just about everything, thinking about it.) It’s easy to find music that stimulates your lizard brain. Get into the habit of looking for the stuff that engages at a higher level too. And, like everything, have fun with it.

Anonymize sensitive data with rake

When troubleshooting a nasty bug, it’s often useful to take a look at actual production or staging data, or even pull it down into your development database. But this is a huge potential privacy and security concern. Your local environment likely isn’t as secure as your production environment, and you might not want to access this sensitive data (or give it to another team member).

Similarly, you might want to replicate your production data on a staging or QA environment to see how new code will interact with real data. Also a privacy concern.

Simple solution: anonymize the data!

In my current project, I put together an anonymize.rake task to deal with this. The most sensitive data in our app is name and phone number. Without that, private information can’t really be linked back to someone. So I pulled the 200 most common first names and 1000 most common last names (in the United States) and put them into an Anonymizer class. Call Anonymizer.random_name for a random, but realistic, name. The class also includes a simple phone number and email anonymizer.

class Anonymizer
  def self.random_name
    "#{random_first_name} #{random_last_name}"
  end

  def self.random_first_name
    FIRSTNAMES[rand(FIRSTNAMES.size)]
  end

  def self.random_last_name
    LASTNAMES[rand(LASTNAMES.size)]
  end

  def self.random_phone
    "612-555-#{rand(8000) + 1000}"
  end

  FIRSTNAMES = %w(James John Robert Michael) # etc. (200 names in all)
  LASTNAMES = %w(Smith Johnson Williams Brown) # etc. (1000 names in all)
end

The rake task is simple:

namespace :db do
  namespace :data do
    desc "Anonymize sensitive information"
    task :anonymize => :environment do
      if RAILS_ENV == 'production'
        puts "Refusing to anonymize production data. You don't really want to do that."
      else
        puts "Anonymizing all name and email records in the #{RAILS_ENV} database."

        # User.find(:all).each do |user|
        #   user.name = Anonymizer.random_name
        #   user.email = Anonymizer.random_email(user.name)
        #   puts "Saving #{user.name} (#{user.email})"
        #   user.save!
        # end
      end
    end
  end
end

You’ll need to do the actual implementation yourself (see the sample User.all.each {} block). It would be easy enough to extend this to work with social security numbers, addresses, etc. Run with:

rake db:data:anonymize

Code: anonymize.rake
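The random_email helper called from the rake task isn’t shown in the post; only the method name appears there. A plausible sketch (the implementation here is a guess) derives a fake address from the already-anonymized name, on a reserved example domain:

```ruby
# Hypothetical implementation of Anonymizer.random_email: turn the
# anonymized name into a local part and use a reserved example domain.
class Anonymizer
  def self.random_email(name)
    local = name.downcase.gsub(/[^a-z]+/, ".")
    "#{local}@example.com"
  end
end
```

Using example.com keeps any accidental mail sends harmless, since the domain is reserved for documentation.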

Benchmarking your Rails tests (updated)

Update: stubbing a single integration point shaved 22 seconds off of my unit tests, reducing test time from 35 seconds to 13. See below.

The first step to faster tests is knowing what is slow. Fortunately, this is dead simple with the test_benchmark plugin by Tim Connor, originally built by Geoffrey Grosenbach. Install the plugin, and when you run your tests via Rake, you’ll see handy output showing you the slowest tests, and the slowest test classes.

Step 1: Install the plugin.

script/plugin install git://github.com/timocratic/test_benchmark.git

Step 2: Run your tests

rake test

Here is a bit of output when I run the unit tests for FanChatter:

Finished in 34.838173 seconds.

Test Benchmark Times: Suite Totals:
25.393 MailReceiverTest
4.520 PhotoTest
1.429 REXMLTest
0.961 TeamTest
0.846 MessageTest

Pretty useful information. Almost 75% of our unit testing time is taken up in the MailReceiverTest. So if we want to speed up our tests, we need to make our MMS testing faster. Looking at that code, I see this line over and over:


MailReceiver.receive(fixture_mms(:fixture_name))

This method reads a test email message from the filesystem, and runs it through our mail parsing method. This is basically an integration test, hitting at least two integration points. So if we can remove these bottlenecks, we can reasonably expect a fairly large improvement in our unit test speed.

I think we could realistically reduce our unit testing time from 34 seconds to less than 15.

Other options

The test_benchmark plugin fires whenever you run your tests with rake. Tim recently patched the plugin to not fire when run with autotest, which is great. Personally, though, I don’t want to see this benchmark information every time I run my tests. So I added the following line to my test.rb environment file:

ENV['BENCHMARK'] ||= 'none'

Now, the benchmarks don’t run by default. If I want to see them, I call:

rake test BENCHMARK=true

And to see the full benchmarks, showing the time it takes to run every test in the system, call:

rake test BENCHMARK=full

That’s it. You still have to speed up your tests, and there are many ways to do that (from mocking to simply reducing the number of calls to expensive methods), but knowing what’s slow is half the battle.

The stirring conclusion (update)

I spent a few minutes optimizing these slow tests today. First, I tried rearranging the tests to reduce unnecessary calls to the slow method (MailReceiver.receive(message)). I was able to speed MailReceiverTest from about 25 seconds to 17. Not bad, but still slow.

The real problem is that this method saves a photo. It creates a Photo record that includes a file, treated sort of like an upload, like this:

photo.uploaded_data = mms.file

This is what was slow. But my unit tests don’t actually deal with the file being saved to the filesystem; they test other things, like the right records being created, confirmation emails being sent, etc.

So I decided to try bypassing this file save/upload by stubbing the uploaded_data= method. I put the following at the top of my test class:

def setup
  Photo.any_instance.stubs(:uploaded_data=)
end

And voilà! MailReceiverTest went from 25 seconds to 17 seconds to 3 seconds.

Slow tests are a bug

I’ve been doing TDD for about three years now. Once I figured out how to do it right, it became a natural part of how I program, and I can’t really imagine doing development without it. This isn’t to say that TDD is the only approach to writing quality software or that unit testing is the only kind of testing that matters. But it sure is useful.

The Ruby world talks a lot about TDD, moreso than many other developer communities. We have not one, not two, but at least half a dozen testing libraries that are actively being used and developed. For most Ruby developers, the question isn’t “Do you test?” but “BDD or TDD?” or even “RSpec, Shoulda, or Bacon?” We often use at least 2-3 layers of automated testing, and sometimes use different tools for each layer. Most Ruby conferences devote at least a few talks each day to testing-related topics. We’re test fanboys and -girls, for better or for worse.

But in spite of this, we rarely talk about test speed. Sure, there are purists who believe that unit tests shouldn’t touch the database because anything that touches the DB is actually an integration test. But few Ruby testers actually take this long and lonely road, and I personally prefer tests that talk to a database, at least some of the time.

And it’s true that others have written libraries to distribute their tests across multiple machines. But that’s the exception that proves the rule – the only reason to distribute your tests is that they’re too slow to begin with.

Most Rails projects I’ve worked on have ended up at around 3,000-15,000 lines of code, with roughly as many lines of test code, and most have test suites that take a minute or more to run. Our test suite for Tumblon, for instance, churns along for 2.5 minutes. This is too slow. And slow tests are a problem for at least two reasons: they slow down your development and decrease code quality.

1. Slow tests slow down development. If you’re practicing TDD, you want to see a test fail before you make it succeed. Two minutes is far too long for this feedback loop to be effective. Of course, you can (and should) just run the test classes that correspond to your code as you program – no need to run your entire test suite every time you write your failing tests. But even still, the test time bar should ideally be set quite low. Frequent 5-10 second delays are enough to break my concentration, and I find myself cmd-tabbing over to other programs if I have to wait more than a few seconds for a test to run. I don’t know of any hard-and-fast rules, but I know that as soon as my test suite runs longer than 30-45 seconds, and individual test classes take longer than 2-3 seconds, I’m less happy and less productive.

2. Slow tests decrease code quality. There are two simple reasons for this. First, if slow tests break your flow, you’re not only going to write code more slowly: you’re also going to write worse code. Second, if your tests are too slow, you’re not going to wait for them to finish before you move on to the next task. Or worse, you’re not going to run them at all.

So, how can I speed up my tests?

Fortunately, this problem can be addressed. There are plenty of ways to speed up tests. On a current project, we’ve managed to cut our test time substantially – a recent test refactoring cut test time from 129.45 seconds to 31.04 seconds, without removing any tests. That’s a 76% speedup. But we still have room for improvement.

Really quickly, here are at least five ways to speed up your test suite. I hope to post more on each of these over the next month or two.

1. Use a test database instead of fixtures/factories/etc.

2. Only touch the database when necessary

3. Organize your tests to avoid duplicate execution

4. Separate slow tests out into a lazier testing layer

5. Run a Rails test server

I’d love to see the Rails community devote more of its enthusiasm for testing to the question of test speed. There’s nothing wrong with improving our test frameworks, and let’s keep doing that. But let’s also make these frameworks fast.

Rescuing autotest from a conflicting plugin

For the longest time, I wasn’t able to run autotest on one of my projects. That was OK; I was intrigued by autotest, but had never really committed to it. The problem: whenever I would try to run autotest, I’d get the following error:


loading autotest/rails_rspec
Autotest style autotest/rails_rspec doesn't seem to exist. Aborting.

I’m running Shoulda, not RSpec, so I had no idea why this was happening. I tried installing (and uninstalling) RSpec in various configurations, to no avail. Nothing worked.

Then I started a new project. Autotest worked just fine on it. After a few days, I got used to autotest, and a few days later, I came to really like it. It helps me get into a TDD “flow” – all tests pass; write failing tests; write code; all tests pass.

So when I came back to my previous project where autotest didn’t work, I decided to dig deeper. Eventually I found the plugin causing the problem: acts-as-taggable-on. The plugin was written to allow autotesting, as explained in a blog post. Supposedly, this runs as a separate autotest instance from your app’s main instance, but it wasn’t working that way for me.

The fix? Delete lib/discover.rb from the acts-as-taggable-on plugin. That’s it – autotest works now.

In the end, I maybe could have solved the problem by getting RSpec configured properly, but just running the gem locally didn’t do the trick for me, and I don’t want to add any code to my app to support autotesting of a plugin that I never want to test.

So should plugins even ship with test code? Yes, they should. Not for normal use; I never run plugin tests, assuming instead that the plugin is tested by the author. But if an open source plugin ships without tests, it’s that much harder for other developers to fork/fix/improve the plugin. But really, that’s about the only reason for plugin/gem tests. And they should never touch application tests.