Counters Everywhere, Part 2

In Counters Everywhere, I talked about how to handle counting lots of things using single documents in Mongo. In this post, I am going to cover the flip side—counting things when there are an unlimited number of variations.

Force the Data into a Document Using Ranges

Recently, we added window and browser dimensions to Gaug.es. Screen width has far fewer variations, as there are only so many screens out there. However, browser width and height can vary wildly, as everyone out there has their browser open just a wee bit differently.

I knew that storing all widths or heights in a single document wouldn’t work because the number of variations was too high. That said, we pride ourselves at Ordered List on thinking through things so our users don’t have to.

Does anyone really care if someone visited their site with a browser open exactly 746 pixels wide? No. Instead, what matters is what ranges of widths are visiting their site. Knowing this, we plotted out what we considered were the most important ranges of widths (320, 480, 800, 1024, 1280, 1440, 1600, > 2000) and heights (480, 600, 768, 900, > 1024).

Instead of storing each exact pixel width, we figure out which range the width is in and do an increment on that. This allows us to receive a lot of varying widths and heights, but keep them all in one single document.

{
  "sx" => {
    "320"  => 237,
    "480"  => 367,
    "800"  => 258,
    "1024" => 2273,
    "1280" => 10885,
    "1440" => 6144,
    "1600" => 13607,
    "2000" => 2154,
  },
  "bx" => {
    "320"  => 121,
    "480"  => 390,
    "800"  => 3424,
    "1024" => 9790,
    "1280" => 11125,
    "1440" => 3989,
    "1600" => 6757,
    "2000" => 301,
  },
  "by" => {
    "480"  => 3940,
    "600"  => 13496,
    "768"  => 8184,
    "900"  => 6718,
    "1024" => 3516
  },
}
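A minimal sketch of how that bucketing might work in Ruby. The boundaries are the ones listed above, but the exact bucketing rule and all names here are my assumptions, not necessarily how Gaug.es does it:

```ruby
# Hypothetical sketch: snap an exact pixel size to the largest range
# boundary at or below it, so the boundary can serve as a counter key.
WIDTH_RANGES = [320, 480, 800, 1024, 1280, 1440, 1600, 2000]

def bucket(size, ranges)
  # Anything below the smallest boundary falls into the first bucket;
  # anything at or above the last boundary lands in the last one.
  ranges.select { |r| r <= size }.max || ranges.first
end

bucket(746, WIDTH_RANGES) # => 480

# The write is then a single upsert that increments the bucketed key,
# e.g. with the mongo driver (collection and field names illustrative):
# coll.update({ "sid" => site_id, "d" => date },
#             { "$inc" => { "bx.#{bucket(width, WIDTH_RANGES)}" => 1 } },
#             :upsert => true)
```

That way a browser opened 746 pixels wide and one opened 799 pixels wide both land on the same "480" counter, keeping the document's key count fixed.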

I would call this first method for storing a large number of variations cheating, but in this instance, cheating works great.

When You Can’t Cheat

Where the single document model falls down is when you do not know the number of variations, or at least know that it could grow past 500-1000. Seeing how efficient the single document model was, I initially tried to store content and referrers in the same way.

I created one document per day per site and it had a key for each unique piece of content or referring url with a value that was an incrementing number of how many times it was hit.

It worked great. Insanely small storage and no secondary indexes were needed, so really light on RAM. Then, a few larger sites signed up that were getting 100k views a day and had 5-10k unique pieces of content a day. This hurt for a few reasons.

First, wildly varying document sizes. Mongo pads documents a bit, so they can be modified without moving on disk. If a document grows larger than the padding, it has to be moved. Obviously, the more you hit the disk the slower things are, just as the more you go across the network the slower things are. Having some documents with 100 keys and others with 10k made it hard for Mongo to learn the correct padding size, because there was no correct size.

Second, when you have all the content for a day in one doc and have to send 10k urls plus page titles across the wire just to show the top fifteen, you end up with some slowness. One site consistently had documents that were over a MB in size. I quickly realized this was not going to work long term.

In our case, we always write data in one way and always read data in one way. This meant I needed an index I could use for writes and one that I could use for reads. I’ll get this out of the way right now: if I had it to do over again, I would definitely do it differently. I’m doing some stupid stuff, but we’ll talk more about that later.

The keys for each piece of content are the site_id (sid), path (p), views (v), date (d), title (t), and hash (h). Most of those should be obvious, save hash. Hash is a crc32 of the path. Paths are quite varying in length, so indexing something of consistent size is nice.

For writes, the index is [[‘sid’, 1], [‘d’, -1], [‘h’, 1]] and for reads the index is [[‘sid’, 1], [‘d’, -1], [‘v’, -1]]. This allows me to upsert based on site, date and hash for writes, and then read the data by site, date and views descending, which is exactly what it looks like when we show content to the user.
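Sketched with the Ruby mongo driver, the write path might look roughly like the following. The method names, the exact update document, and the field layout beyond what’s described above are my assumptions:

```ruby
require 'zlib'

# Selector for the upsert write path: site, date, and a crc32 hash of
# the path, matching the write index [sid, d, h] described above.
def content_selector(site_id, date, path)
  { 'sid' => site_id, 'd' => date, 'h' => Zlib.crc32(path) }
end

# The update bumps the view counter and (re)sets the page title.
def content_update(title)
  { '$inc' => { 'v' => 1 }, '$set' => { 't' => title } }
end

# With a live connection, the index creation and write would be along
# these lines (coll is a Mongo::Collection):
# coll.ensure_index([['sid', 1], ['d', -1], ['h', 1]])   # write index
# coll.ensure_index([['sid', 1], ['d', -1], ['v', -1]])  # read index
# coll.update(content_selector(sid, date, path),
#             content_update(title), :upsert => true)
```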

As mentioned in the previous post, I do a bit of range based partitioning as well, keeping a collection per month. Overall, this is working great for content, referrers and search terms on Gaug.es.

Learning from Mistakes

So what would I do differently if given a clean slate? Each piece of content and referring url has an _id key that I did not mention. It is never used in any way, but _id is automatically indexed. Having millions of documents each month, each with an _id that is never used, starts to add up. Obviously, it isn’t really hurting us now, but I see it as wasteful.

Also, each document has a date. Remember that the collection is already partitioned by month (i.e.: c.2011.7 for July), yet hilariously, I store the full date with each document like so: yyyy-mm-dd. 90% of that string is completely useless. I could more easily store the day as an integer and ignore the year and month.

Having learned my lesson on content and referrers, I switched things up a bit for search terms. Search terms are stored per month, which means we don’t need the day. Instead of having a shorter but meaningless _id, I opted to use something that I knew would be unique, even though it was a bit longer.

The _id I chose was “site_id:hash” where hash is a crc32 of the search term. This is conveniently the same as the fields that are upserted on, which combined with the fact that _id is always indexed means that we no longer need a secondary index for writes.

I still store the site_id in the document so that I can have a compound secondary index on site_id (sid) and views (v) for reads. Remember that the collection is scoped by month, and that we always show the user search terms for a given month, so all we really need is which terms were viewed the most for the given site, thus the index is [[‘sid’, 1], [‘v’, -1]].
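A sketch of that _id scheme. The source only specifies the “site_id:hash” format and the crc32, so the method and field names here are my own:

```ruby
require 'zlib'

# The _id doubles as the write key: "site_id:hash", where hash is a
# crc32 of the search term.
def term_id(site_id, term)
  "#{site_id}:#{Zlib.crc32(term)}"
end

# Upserting a hit then needs no secondary write index, because _id is
# always indexed. sid is still stored for the [sid, v] read index:
# coll.update({ '_id' => term_id(sid, term) },
#             { '$inc' => { 'v' => 1 }, '$set' => { 'sid' => sid } },
#             :upsert => true)

term_id(42, 'mongodb')
```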

Hope that all makes sense. The gist is rather than have an _id that is never used, I moved the write index to _id, since it will always be unique anyway, which means one less secondary index and no wasted RAM.

Interesting Finding

The only other interesting thing about all this is our memory usage. Our index size is now ~1.6GB, but the server is only using around ~120MB of RAM. How can that be, you ask? We’ve all heard that you need at least as much RAM as your index size, right?

The cool thing is you don’t. You only need as much RAM as your active set of data. Gaug.es is very write heavy, but people pretty much only care about recent data. Very rarely do they page back in time.

What this means is that our active set is what is currently being written and read, which in our case is almost the exact same thing. The really fun part is that I can actually get this number to go up and down just by adjusting the number of results we show per page for content, referrers and search terms.

If we show 100 per page, we use more memory than 50 per page. The reason is that people click on top content often to see what is doing well, which continually loads in the top 100 or 50, but they rarely click back in time. This means that the active set is the first 100 or 50, depending on what the per page is. Those documents stay in RAM, but older pages get pushed out for new writes and are never really requested again.

I literally have a graph that shows our memory usage drop in half when we moved pagination from the client-side to the server-side. I thought it was interesting, so figured I would mention it.

As always, if you aren’t using Gaug.es yet, be sure to give the free trial a spin!

In The Jungle, The Mighty Jungle

A few quick notes about Lion, mostly first impressions and things I haven’t necessarily seen a ton of coverage on.

The install itself

This is, I think, my third major OS upgrade since I started using OS X all the time. It’s by far the easiest install. The biggest problem was the time it took to download the installer. I also had a problem where the Xcode installer finished but didn’t register itself as finished, so that it appeared to hang. That took a little effort to track down, but wasn’t actually damaging.

Beyond that, though, nearly everything just worked. I had to reinstall exactly one Unix-y binary (ImageMagick, ever the outlier when dealing with annoying installations). I was afraid that I’d need to fuss with things like MySQL or my Ruby install, but by and large, I didn’t.

That Darn Scrolling

The most obvious change as you go to Lion is the new scrolling – it’s such a big deal that Apple even gives you a dialog box on your first Lion boot reminding you that things have changed.

So, is the change irritating or merely annoying? Okay, for the first hour or so, it was completely unmanageable, to the point where I could feel the tension in my wrists from straining to remember which way to push the track pad.

At the risk of being obvious, what’s happening here is a metaphor shift. Rather than imagining that scrolling is the act of moving a window over your document – moving the pointer down moves the window down and shows you a lower part of the document – you now imagine that you are moving the document itself, so moving the pointer down drags the document down and shows you a higher part of the document. As pretty much everybody has noted, this is exactly how scrolling works on iOS devices, where the metaphor of dragging the document itself is much more concrete. (Weirdly, I used iOS devices for quite a while before I consciously started to think about how iOS was backward relative to the Mac.)

After a few days with the new scrolling I’ve basically got it. I find that if I don’t look at the scrollbars when I scroll, it’s much easier to imagine that I’m dragging the document. Also, for some reason, I took to the new scrolling most quickly in minimal apps or apps that are very similar to their iOS versions, such as Reeder. And I can’t seem to get it right in iTunes for some reason.

Is this change a good thing? Dunno. It’s clearly a thing. It’s a little weird to have something as fundamental as scroll direction be subject to user whim – I expect it’ll make pair programming interesting if a significant minority of users ends up not switching to the new behavior. There’s one problem with the new system that I think is unambiguously bad – since the scroll bars fade into the background when they aren’t used, it’s much harder to see at a quick glance how large a document is and where you are in it. That’s a loss of information that doesn’t seem to be counterbalanced by anything. It’s also weird that you can still grab the scroller itself and move it in the traditional direction (although since the scroller is now on top of the view in many apps, it’s sometimes hard to actually grab something at the edge of a document). My overall feeling is that this would make total sense if we had been doing it for fifteen years, but right now it’s going to feel weird for a while.

Another thing is that if the document is scrollable in two directions, it seems to be much harder to keep a pure vertical scroll without it drifting into a slight horizontal scroll. Also, I can’t imagine this working if you were using a mouse instead of a trackpad.

Overall look and feel

Broadly, it seems like there are three overlapping mandates for the look and feel changes in Lion – make interface elements less prominent (with the glaring exception of iCal), incorporate successful features from iOS, and animate anything that’s not nailed down. So scrollbars and other basic interface elements have become more muted across the board. Those changes are not dramatic, but I like them, they do tend to keep focus where it belongs.

I really like Mission Control as a re-imagining of Spaces/Dashboard/Expose. The Mission Control screen is very nice, easy to see, and it’s very easy to manipulate spaces – this is one case where the gestures really work. (I never was able to stick to using spaces before, but I have been using them a bit in Lion). The new full-screen mode doesn’t work for me, mostly because I’m often in a dual monitor situation, and the second monitor is ignored in full-screen, which seems a waste, but I can see how making each full-screen app its own space makes dealing with a bunch of full-screen apps much easier. Launchpad seems to be something that I don’t need, and it feels like it would be hard to manage.

The animations don’t bother me as much as they seem to bother other people – though ask me again about Mail.app in a few weeks. I’ve seen some complaints about the speed of the animation between spaces, but it seems reasonable to me.

That said, the iCal redesign doesn’t do much for me, but I’m not a heavy iCal user. Address book I like better, though I still think it’s a little hard to use. One feature that I do like is that, if you use iPhoto’s faces feature, Address Book can easily search iPhoto for pictures matching the name of the contact to use as the avatar for that contact.

Auto-Save and Versions

One of the biggest functional features of Lion is the auto-save and versioning. Lion-native apps auto save when idle or at a timed interval, and automatically save when the app is closed. They also automatically restore state when the app reopens. Apps have a Time Machine like interface to view old versions of the same document. Points:

  • Basically, this is awesome.
  • I think it’s going to be much harder to break my typing tic of pressing command-s at the end of every sentence than it is to adjust to the scrolling thing.
  • I also think it’s going to take some time to get used to the new “Save a Version/Duplicate/Revert To Saved” wording in the file menu.

One thing I haven’t seen commented on is that there seem to be two different kinds of version support in Lion – which may mean that I’m getting this wrong. But it appears that there is a subtle difference between apps like Pages and Keynote on the one hand and other applications on the other. In Pages and Keynote, you have access to every save point over the history of the document.

For other applications, if you are connected to your Time Machine drive, you have all Time Machine snapshots. If you aren’t, you seem to have access to maybe the most recent Time Machine snapshot. I’m not 100% sure exactly what’s going on, and I’m not sure yet if it’s an app thing or a document type thing – for example, it seems like Apple’s TextEdit can create multiple versions of a text document, but Byword can’t. But Byword is Lion-compatible, in that it has the new-style File menu. Ultimately, as cool as this is in theory, it’s a little confusing in implementation.

New Apps

I’ve started playing with Mail.app, which I stopped using about two years ago on the grounds that it was really irritating. It’s a lot better now, with a more useful three-column layout (that can become a two-column layout), conversation threading, a really, really nice search feature and a bunch of animations that straddle the line between charming and annoying. For the record, the popout animation for replying I find a bit much, but the way sent messages fly up off the top of the screen kind of makes me smile. (And if you liked the animation from the App Store where the icon flies into the dock, note that Safari uses something similar for downloads, and Mail uses it for replies.)

One nice touch that I haven’t seen called out much is that in Mail and iChat, and I’m not sure where else, inline URLs have a little arrow after them, which triggers a Quick Look preview of the web page, similar to the way Google’s quick preview works. That’s nice.

Anything else I can think of

Lion has also added a system-wide autocorrect clearly based on the iOS version. I thought this was going to bug me, but actually I kind of like it. It appears to work a little better than the iOS version at identifying and correcting actual typos, the UI is a nice combination of unobtrusive while still making you aware that a change has been made, and it’s much more responsive than TextExpander (which I love for deliberate macros, but which has always felt a little sluggish when correcting typos). Also, the autocorrect has fixed like five typos just in this paragraph, and only got one of them wrong. I’ll actually take those odds.

So

Too many words to say this: I like Lion so far, although some of the specific choices puzzle me cough iCal cough. It’s taken me less time to get used to the changes than I thought, and I’m finding some of the changes making definite improvements in my normal workflow.

Filed under: Mac

How do I test my code with Minitest?

This guest post is by Steve Klabnik, who is a software craftsman, writer, and former startup CTO. Steve tries to keep his Ruby consulting hours down so that he can focus on maintaining Hackety Hack and being a core member of Team Shoes, as well as writing regularly for multiple blogs.

Programming is an interesting activity. Everyone has their favorite metaphor that really explains what programming means to them. Well, I have a few, but here’s one: programming is all about automation. You’re really just getting the computer to automatically do work that you know how to do, but don’t want to do over and over again.

When I realized this, it made me look for other things that I do that could be automated. I don’t like repeating myself over and over and over again. That’s boring! Well, there’s one particular task that’s related to programming that’s easily made automatic, and that’s testing that your software works!

Does this story sound familiar? You run your program, try a few different inputs, check the outputs, and see that they’re right. Then, you make some changes in your code, and you’d like to see if they work or not, so you fire up Ruby and try those inputs again. That repetition should stick out. There has to be a better way.

Luckily, there is! Ruby has fantastic tools that let you set up tests for your code that you can run automatically. You can save yourself tons of time and effort by letting the computer run thousands of tests every time you make a change to your code. And it’ll never get tired and accidentally type in a 2 when you mean to type 3… Many people take this one step farther. They find testing so important and so helpful that they actually write the tests before they write the code! I won’t expound on the virtues of “test driven development” in this post, but it’s actually easier to write the tests first, once you get some practice at it. So, let’s pick a tiny bit of code to work on, and I’ll show you how to test it using Ruby’s built-in testing library, minitest.

For this exercise, let’s do something simple, so we can focus on the tests. We’ll make a Ruby class called CashRegister. It’ll have a bunch of features, but here are the first requirements we’ll tackle:

  1. The register will have a scan method that takes in a price, and records it.
  2. The register will have a total method that shows the current total of all the prices that have been scanned so far.
  3. If no prices have been scanned, the total should be zero.
  4. The register will have a clear method that clears the register of all scanned items. The total should go back to zero again.

Seems simple, right? You might even know how to code this already. Sometimes, intermediate programmers practice coding problems that are easy, just to focus on how to write good tests, or to work on getting the perfect design. We call these kinds of problems ‘kata.’ It’s a martial arts thing.

Anyway, enough about all of this! Let’s dig in to minitest. It already comes with Ruby 1.9, but if you’re still using 1.8, you can install it with ‘gem install minitest.’ After doing so, open up a new file, register.rb, and put this in it:

require 'minitest/autorun'

class TestRegister < MiniTest::Unit::TestCase
  def setup
    @register = CashRegister.new
  end
  def test_default_is_zero
    assert_equal 0, @register.total
  end
end

Okay! There’s a lot going on here. Let’s take it line by line. On the first line, we have a ‘require.’ The autorun part of minitest includes everything you need to run your tests, automatically. All we need to do to run our tests is to type ruby register.rb, and they’ll run and check our code. But let’s look at the rest of the file before we do that. The next thing we do is set up a class that inherits from one of minitest’s base classes. That’s how minitest works, by running a series of TestCases. It also lets you group similar tests together, and split different ones up into multiple files.

Anyway, enough organizational stuff. In this class, we have two methods: the first is the setup method. This runs before each test, and allows us to prepare for the test we want to run. In this case, we want a new CashRegister each time, and we’ll store it in a variable. Now we don’t have to repeat our setup over and over again… it’s just automatic!

Finally, we get down to business, with the test_default_is_zero method. Minitest will run any method that starts with test_ as a test. In that method, we use the assert_equal method with two arguments. assert_equal is where it all happens, by comparing 0 to our register’s total, and it will complain if they’re not equal.

Okay, so we have our first test. Rock! You might be tempted to start implementing our CashRegister class, but wait! Let’s try running the tests first. We know they’ll fail, because we don’t even have a CashRegister yet! But if we run the tests before writing code, the error messages will tell us what we need to do next. The tests will guide us through the implementation of our class. So, as I mentioned earlier, we can run the tests by doing this:

$ ruby register.rb

We get this as output:

Loaded suite register
Started
E
Finished in 0.000853 seconds.

1) Error:
test_default_is_zero(TestRegister):
NameError: uninitialized constant TestRegister::CashRegister
register.rb:5:in `setup'

1 tests, 0 assertions, 0 failures, 1 errors, 0 skips

Test run options: --seed 36463

Whoah! Okay, so you can see that we had one test, one error. Since we know classes are constants in Ruby, we know that the uninitialized constant error means we haven’t defined our class yet! So let’s do that. Go ahead and stick in an empty class at the bottom:

class CashRegister
end

And run the tests again. You should see this:

1) Error:
test_default_is_zero(TestRegister):
NoMethodError: undefined method `total' for #<CashRegister:0x00000101032a80>
register.rb:9:in `test_default_is_zero'

Progress! Now it says we don’t have a total method. So let’s define an empty one. Modify the class like this:

class CashRegister
  def total
  end
end

And run the tests again. Another failure:

1) Failure:
test_default_is_zero(TestRegister) [register.rb:9]:
Expected 0, not nil.

Okay! No more syntax errors, just the wrong result. Let’s keep it as simple as possible, and fill out a nice and easy total method:

def total
  0
end

Now, you may be saying, “Steve, that doesn’t calculate a total!” Well, you’re right. It doesn’t. But our tests aren’t yet asking to calculate a total, they’re just asking for a default. If we want a total, we should write a test that actually demonstrates adding it up. But we have fulfilled objective #3, so we’re doing good! Now, let’s work on objective #2, since we sorta feel like the total method is lying about what it’s supposed to do. In order to add up the items that were scanned, we need to scan them in the first place! Objective #1 says that this method should be called scan, so let’s write a test. Put it in your test class with the test_default_is_zero method:

def test_total_calculation
  @register.scan 1
  @register.scan 2
  assert_equal 3, @register.total
end

Make sense? We want to scan two things in, and then check that the total is correct. Let’s run our tests!

Loaded suite register
Started
.E
Finished in 0.000921 seconds.

1) Error:
test_total_calculation(TestRegister):
NoMethodError: undefined method `scan' for #<CashRegister:0x00000101031838>
register.rb:13:in `test_total_calculation'

2 tests, 1 assertions, 0 failures, 1 errors, 0 skips

Test run options: --seed 54501

Okay! See that ‘.E’ up there? That graphically shows that we had one test passing, and one test with an error. Our first test still works, but our second is failing because we don’t have a scan method. Add an empty one to our CashRegister class, and run again:

1) Error:
test_total_calculation(TestRegister):
ArgumentError: wrong number of arguments (1 for 0)
register.rb:24:in `scan'
register.rb:13:in `test_total_calculation'

Whoops! It takes an argument. Let’s add that: def scan(price). Run the tests!

1) Failure:
test_total_calculation(TestRegister) [register.rb:15]:
Expected 3, not 0.

Okay! This sounds more like what we expected. Our total method just returns zero all the time! Let’s think about this for a minute. We need to have scan add the price to a list of scanned prices. So we’d better have it do that:

def scan(price)
  @items << price
end

But if you run the tests, you’ll see this:

1) Error:
test_total_calculation(TestRegister):
NoMethodError: undefined method `<<' for nil:NilClass
register.rb:25:in `scan'
register.rb:13:in `test_total_calculation'

Oops! @items is undefined. Let’s make it be an empty array, when we create our register:

def initialize
  @items = []
end

And run the tests:

1) Failure:
test_total_calculation(TestRegister) [register.rb:15]:
Expected 3, not 0.

Okay! We’re back to our original failure. But we’ve made some progress: now that we have an actual list of items, we’re in a position to make our total method work. Also, at each step, even though one test was failing, the other was still passing, so we know that we didn’t break our default functionality while we were working on getting a real total going.

Now, we’re in a better place to calculate the total:

def total
  @items.inject(0) { |sum, item| sum + item }
end

Or, if you want to make it even shorter:

def total
  @items.inject(0, :+)
end

If you’re not familiar with Enumerable#inject, it takes a list of somethings and turns it into a single something by means of a function, in a block. So in this case, we can keep a running sum of all items, and then add the price of each one to the sum. Done! Run your tests:

Started
..
Finished in 0.000762 seconds.

2 tests, 2 assertions, 0 failures, 0 errors, 0 skips

Woo hoo! We’re done! Our total can now be calculated. Great job!

Now, here’s a challenge, to see if you’ve really learned this stuff: write a test for a new method, clear, that clears the total. That’s objective #4 we talked about above.
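Try it yourself first! If you then want to check your answer, here is one way the clear method and its test could look. This is my sketch of a solution, not the only correct one:

```ruby
# The full class, with clear simply resetting the scanned items so
# the total drops back to zero.
class CashRegister
  def initialize
    @items = []
  end

  def scan(price)
    @items << price
  end

  def total
    @items.inject(0) { |sum, item| sum + item }
  end

  def clear
    @items = []
  end
end

# And the test to add to the test class from above:
# def test_clear
#   @register.scan 1
#   @register.scan 2
#   @register.clear
#   assert_equal 0, @register.total
# end
```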

Other parts of minitest

This has been a mini intro to minitest and using it to test your code. There are other methods in the assert family, too, like assert_match, which takes a regular expression and tries to match it against something. There’s the refute family of tests, which are the opposite of assert:

assert true #=> pass
refute true #=> fail

There are also other tools that make minitest useful, like mocks, benchmark tests, and the RSpec-style ‘spec’ syntax. Those will have to wait until later! If you’d like to learn about them now, check out the source code on GitHub.

Happy testing!

I hope you found this article valuable. Feel free to ask questions and give feedback in the comments section of this post. Also, do check out Steve’s other article: “How do I keep multiple Ruby projects separate?” on RubyLearning. Thanks!

Slim, rack-webconsole, concurrency in JRuby

In this episode, Peter and Jason bring you Slim, a new templating engine, rack-webconsole, and a great round up of high quality blog posts and links for the week.

Rails now tested on Travis CI

Setting up continuous integration for Rails has been a complicated undertaking in the past.

Rails needs to be tested against different Ruby versions and various modes (such as running test cases in isolation/non-isolation, running ActiveRecord with identity map enabled/disabled). This made the test suite run for an insanely long time (up to 2 hours on 1.9.2 alone) and required regular maintenance by the Rails core team.

Over the past weeks the folks at Travis CI have been working hard to provide a better experience for Rails continuous integration, and today we can happily announce that Rails is now tested on Travis CI!

Travis CI is doing a great job in providing multi-ruby testing capabilities and it is dead-simple to use. There’s some great potential to this project and it might change the way we see open-source development and testing quite a bit.

So, if you are publishing any kind of open-source code, library or web application, we recommend you have a look at it. And if you have a spare hour once in a while, consider jumping on board to help improve the code base.

Travis CI is using a separate physical worker server (and a quite beefy one!) for running workers dedicated to Rails builds. This server has kindly been sponsored by the great folks over at Enterprise Rails.

[Guest post by Josh Kalderimis & Sven Fuchs]

How Can We Develop For Tomorrow’s Needs?

This guest post is by James M. Schorr, who has been in IT for over 14 years and has been developing software for over 11 years. He is the owner of an IT consulting company, Tech Rescue, LLC, which he started along with his lovely wife, Tara, in 2002. They live in Concord, NC with their three children Jacob, Theresa and Elizabeth. James spends a lot of time writing code in many languages and has a passion for Ruby on Rails in particular. He loves to play chess online at FICS (his handle is kerudzo) and to take his family on nature hikes. His professional profile is on LinkedIn and you can read more of his writings on his blog.

The average developer is often forced to get code out the door as quickly as possible, primarily due to unrealistic deadlines and budgets. As a result, the quality and future expandability of software is greatly harmed. Software is now used in medical machinery, our vehicles, power plants, stock markets, aircraft, weapons, etc… As software becomes more and more critical in our lives, the need to think long-term is becoming increasingly critical.

Obviously, the quickest way is almost always not the best way. I hope to give some practical steps to those involved in software development that will help in the development of stable, long-lasting software. A proper strategy session involving the below steps can help save a lot of wasted time and money.

Quality, future-resilient software is tough to define, but reveals itself when it does what it’s supposed to without unpleasant surprises, handles unpredictable user input and system issues in gracious, non-devastating ways, and, in general, makes the user’s life easier. The tough part is that users’ needs and systems change. How do we engineer for tomorrow’s needs?

The keys to successfully developing long-term software are:

Establishing the Purpose: What is the point of the software? Do the needs it is anticipated to meet look as though they will be the same core needs in the foreseeable future? In other words, will the main needs be met by this software, and can we easily build out from there? If not, we need to keep the anticipated future needs in mind as we “scope” out the architecture of the project and provide “space” for them.

Choosing the “Stack”: (what technologies, languages, etc… will be used). The stack should be chosen carefully, based upon:

  • proven stability. For example, it may be “cool” but unwise to write the software in the latest-and-greatest language. I’ve seen instances where a language/framework is chosen strictly due to its current popularity. This is typically a recipe for disaster, as those who go (and enjoy) that route typically move on to the next greatest thing, leaving code behind for non-fad-following developers to handle.
  • current in-house knowledge. For instance, if our developers love and know Ruby, should we really force them to write an app in VB? Or if it is a Microsoft shop, are time and funds available to support learning non-MS technologies on the fly? I don’t believe that it is ever appropriate to write mission-critical software using a language/framework that is unfamiliar to the developers. There are times, however, when the software is so mission-critical, and matches a language’s abilities so well, that it makes sense to pull in new talent. It can be argued that software can be written in almost any language and that the language itself doesn’t matter much. But sometimes it really does, both in terms of expressiveness and developer satisfaction (note: I still contend that a happy developer is a good developer, or at least becoming one).
  • infrastructure requirements. Do we have the hardware and network necessary to decently support the software and its anticipated usage? Disk space, memory requirements, OS, network speed, etc… All of these matter, a lot. It’s best to always plan for 2-3x the anticipated usage. For instance, for a web app, if we anticipate 1k users, let’s build for 2-3k users, with built-in monitoring of the resources being used and a plan of how to scale up quickly when we hit a “soft” threshold.

Planning:

  • Architectural Drawing: I’m a big fan of having at least the “skeleton” of the project drawn out, particularly on a white-board (I’m a bit old-school, I know). It doesn’t have to be a fancy diagram or complicated UML diagram, just a simple drawing; the more understandable, the better. This high-level overview provides guidance when we’re deep into code, as we can look up and see if we’re on track (as it’s all too easy to go down a code “rabbit trail” if we’re not careful). It is counterproductive, however, to draw out every little detail, as this will stifle creativity and overwhelm us while we’re writing code (we just won’t look at the diagram then).
  • Establish Deadlines: we do need to know the deadlines. It’s best, in my opinion, to have several small deadlines with a semi-flexible final deadline. This helps us keep on track and measure our progress little by little. As we hit the small deadlines, our confidence builds, which then improves our productivity and, in general, our code quality.
  • Using Available Expertise Wisely: does it make sense to assign Bob, the awesome Python programmer, to doing CSS, and Bill, the great designer, to slinging code? Obviously not (though I have seen some managers try this); we may lose both team members or end up with copy-and-pasted Google code and animated GIFs in our project. Cross-training is a nice and potentially valuable concept, but it should be done outside of a software project with its accompanying deadlines. Future minor features might provide a better proving ground for cross-training. If Bob’s swamped, maybe we need to find him some decent help. :)
  • Determine the Deployment Strategy (for both during and after the Project):
    • code should be checked into our version control system prior to any deployments.
    • maybe we should only deploy code after business hours after alerting such-and-such a group. If our project has any possibility of negatively impacting others, notification is not only kind, but often necessary, especially for large changes.
    • a rollback strategy must always be in place. This strategy must be easily understandable, with simple steps, so that support staff need to do little, if any, interpretation during “heat of the moment” support calls. Even if our developers are top-notch, until code gets into Production, we cannot be 100% sure that it will not need to be rolled back. This is why major companies often have to release an update quickly after a major release. Some things just can’t be easily discovered until they’re released into “the wild”.

Building with Expansion in Mind:

  • One of the wonderful aspects of developing software is also its most dangerous aspect: flexibility. A feature or component can often be written in different ways, yet there are typically only one or two best ways, and they can be very difficult to identify unless one steps back from the project and thinks it through. Well-known software principles help a great deal here, but they come up short if they are not weighed against the anticipated needs of the future (in other words, if we don’t understand where we’re going, our code will still be awful even if we follow DRY, OOP, GoF, etc…). As much as possible, this evaluation should be done not by the developer but by someone outside the code, so to speak; perhaps a technical team lead.
  • When adding core features, we need to at least take a few moments to think through possible future implications of what we’re doing. For example, our Component A is currently parsing JSON from website B using C credentials, and Component D depends upon Component A’s data. Wouldn’t it make sense to keep those credentials in an encrypted settings field somewhere, to make them easy to change in the future? If Component A’s data were slightly different, would Component D “blow up”? Maybe we can abstract all of this a bit?
  • Avoiding Spaghetti-Code: proper design and a commitment to sticking to the design in the future helps to prevent our code from such entanglement. In other words, we need to commit to never, ever quickly throwing code into the project, as this leads to “spaghetti-code”. Of course, there may be exceedingly rare occasions where we need such a stop-gap measure due to an emergency, but we must then learn from our mistake and commit to re-engineering that portion of the code properly.
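One way to sketch the abstraction suggested above in Ruby (all class names and setting keys here are hypothetical, invented purely for illustration):

```ruby
require 'json'

# Hypothetical sketch: centralize the endpoint and credentials so that a
# future change (a new URL, rotated credentials) touches one place, and
# shield dependent components behind one parsing method so an upstream
# payload change only has to be absorbed here.
class FeedSettings
  def initialize(store)
    @store = store # in real life: an encrypted settings table or file
  end

  def endpoint
    @store.fetch('feed_endpoint')
  end

  def credentials
    @store.fetch('feed_credentials')
  end
end

class FeedClient
  def initialize(settings)
    @settings = settings
  end

  # Dependents get a stable structure; if the upstream payload shifts,
  # only this one method needs to change.
  def records(raw_json)
    JSON.parse(raw_json).fetch('records', [])
  end
end

settings = FeedSettings.new(
  'feed_endpoint'    => 'https://example.com/feed.json',
  'feed_credentials' => 'secret-token'
)
records = FeedClient.new(settings).records('{"records":[{"id":1}]}')
# records == [{"id"=>1}]
```

The point is not the particular classes but the seam: Component D never sees the raw payload or the credentials, so both can evolve behind the abstraction.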

Data Safety:

  • As we depend more and more upon data, it’s becoming increasingly important that we do our best to have automated backups, which are then checked frequently by a person. This cannot be emphasized strongly enough. All too often, properly designed backups stop working without anyone noticing until it is too late.
  • If encryption is used:
    • the encryption keys need to be stored off-site in at least 2 secure places. Imagine if we lost our server(s), our office burned down, our VPS provider goes offline, etc…- even if we had backups, could we get to the raw data if needed? No one wants to start over from scratch.
    • Does the encryption depend upon a certain cipher? If so, what is the game plan for when that cipher is cracked someday? How easy will it be for us to move to a new cipher?
  • Does our data depend upon a specific version? For instance, maybe database X version Y can open the data but no other versions can. Do we have a backup of that version to access the data if needed? Better yet, this reveals a key flaw in our design. Our data should not be heavily dependent upon any software version.
  • Would our data be understandable if a new developer 10 years from now is assigned to work with it? For instance, if a column for a user’s API Key is called usrscr_ak12, we may understand it, but it’s not future-proof (a better term may be “future-resilient”, since nothing is truly future-proof). Such obfuscation attempts provide little security, since if someone can get that far (to the data), we’ve lost the security “battle” anyhow. Data should be clearly understood by those who can access it.
  • Can our data be exported easily when the software that we’re lovingly developing now someday gets decommissioned? All software will eventually get replaced by something better. How easily can our data be decoupled from our application?
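As a sketch of the kind of automated freshness check a person could then review (the method name and thresholds are illustrative, not from the article):

```ruby
require 'tmpdir'

# Hypothetical sketch: a backup is "healthy" only if the newest file
# exists, is non-empty, and is fresher than the allowed age. A cron job
# could run this and alert on failure; a human still reviews the result,
# since backups that silently stopped working are the real danger.
def backup_healthy?(paths, max_age: 24 * 60 * 60, now: Time.now)
  latest = paths.select { |p| File.exist?(p) }.max_by { |p| File.mtime(p) }
  return false unless latest
  (now - File.mtime(latest)) <= max_age && File.size(latest) > 0
end

Dir.mktmpdir do |dir|
  backup = File.join(dir, 'db-2011-07-25.sql.gz')
  File.write(backup, 'dump bytes')  # a fresh, non-empty backup
  backup_healthy?([backup])         # => true
  backup_healthy?([])               # => false: nothing to restore from
end
```

A check like this catches the silent failure modes (job disabled, disk full, zero-byte dumps) that only surface when a restore is needed.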

Pin-pointing Possible “Dominoes” in our project and code-base (e.g., if A happens, it affects B, which then affects C, and so on, like falling dominoes). Amazon’s recent AWS issues in 2011 revealed the criticality of this step. The more time that we spend anticipating what can go wrong, the more we can establish quick steps to both prevent such issues and mitigate possible damage. At the bare minimum, the possible “dominoes” and recommended quick steps need to be written down somewhere. This can greatly help to expedite future troubleshooting.

  • Our Software: We must try to anticipate, as much as possible, what the interdependencies are in our project and its surrounding infrastructure. These dependencies should be in written form and re-reviewed as further functionality is added to the software in the future (e.g. ITIL Change Management).
  • Dependent Software: What software or systems will depend upon our software? When our system goes down, will other software be slamming our system asking for a response?
  • Dependent Systems: if we saturate our network, is our software designed to “back-off” and retry after an appropriate, randomized delay?
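A randomized back-off like the one just described can be sketched in a few lines of Ruby (the method name and parameters are illustrative, not from any particular library):

```ruby
# A minimal retry helper with jittered exponential back-off. On each
# failure the base delay doubles, and random jitter is added so that
# many clients recovering at the same moment do not retry in lockstep.
def with_backoff(max_attempts: 5, base_delay: 0.5, sleeper: method(:sleep))
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StandardError
    raise if attempts >= max_attempts
    delay = base_delay * (2**(attempts - 1))
    sleeper.call(delay + rand * delay) # base delay plus up to 100% jitter
    retry
  end
end

# Example: a call that fails twice, then succeeds on the third attempt.
calls = 0
result = with_backoff(sleeper: ->(_seconds) {}) do # no real sleeping here
  calls += 1
  raise 'network saturated' if calls < 3
  :ok
end
```

Injecting the `sleeper` keeps the helper testable; in production it defaults to the real `sleep`.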

Obviously, none of the above can be done overnight. If even some of the above is done, however, the chance of our software having a longer-lasting, positive impact will be greater. I recommend that the start of each project have at least 3-5 days dedicated to going through these steps. Gathering input from the teams of people who are responsible for various components (e.g. clients/end users, network, sysadmins, developers of other dependent software, etc…) will be invaluable. The payoff will be great.

I hope you found this article valuable. Feel free to ask questions and give feedback in the comments section of this post. Also, do check out James’ other article: “Do You Enjoy Your Code Quality?” on RubyLearning. Thanks!


#276 Testing Time & Web Requests

It can be difficult to test code that deals with the current time or an external web request. Here I show you how to do both using the Timecop and FakeWeb gems.

Why don’t you use and review these useful Ruby Gems?

Showcasing some Ruby Gems from developers like you and me

Why don’t you try out some of the Ruby Gems mentioned below, built by developers like you and me, and review them? Maybe there are some real ‘hidden’ gems out there, wanting to be exposed!

ascii-data-tools developed by Jake Benilov. In his own words – “It provides a suite of tools for identifying, reading, enriching and editing ASCII data records. Such records are commonly used for data transfer within the banking (e.g. transfer statements between banks) and telecommunications sectors (e.g. call detail records).”

The subject matter may be quite dry, but for me this gem has been a training dojo of sorts and hence may be of interest to you. I’ve tried to do several things with the development of the code:

  • I’ve polished the code (some parts several times) in order to learn how to follow Uncle Bob’s clean code concepts and good OO principles (“tell, don’t ask”, etc)
  • I’ve developed it using acceptance-test driven (with cucumber) and test-driven (with rspec). I’ve tried to apply Specification by Example ideas to push the tests to the status of living documentation.
  • I’ve tried to utilize Rubyisms (lots of composition with mixins, some meta-programming, internal DSLs) to maximise code clarity, simplicity, extensibility and testability.

constructable developed by Manuel Korfmann. In his own words – “constructable is a macro, kinda similar in spirit to attr_reader and attr_accessor, that makes the new method accept a hash, kinda like the way ‘create’ does on ActiveRecord models.”

copyrb developed by Milan Dobrota. In his own words – “I have written a quick gem that allows you to copy and paste Ruby objects across terminals. This gem was created primarily to simplify the process of copying objects between different Rails environments for people who spend a lot of time in the Rails console. For more details.”

document_mapper developed by Ralph von der Heyden. In his own words – “A simple model layer that lets you query text documents as if they were a database.”

green_shoes developed by Satoshi Asakawa. In his own words – “green_shoes is a Ruby domain specific language for beautiful Desktop Applications. The green_shoes dsl is so simple, even your pointy haired boss can understand it. The green_shoes project is based on _why-the-lucky-stiff’s Shoes, except for the following:

  • green_shoes source code is all Ruby, so even you can contribute.
  • green_shoes takes the Ruby DSL block-style approach, so all you have to do is write what you know: Ruby.
  • green_shoes is a gem, so you can simply install with this command: gem install green_shoes

guard-rails-assets developed by Dmytrii Nagirniak. In his own words – “Automatically compile Rails 3.1 assets when files are modified. You can use it to automatically run the JavaScript tests and always have the files ready for it. It uses guard gem and Rails 3.1 built-in assets pipeline. Works great in combination with guard-jasmine-headless-webkit.”

ip-world-map developed by Rene Scheibe. In his own words – “ip-world-map can be used to visualize web access logfiles (Apache format) on a world map. It performs geo-location resolution on the IPs and can generate fixed images, animated images or even videos.”

jsdebug-rails developed by Jeremy Peterson. In his own words – “On June 16, 2011, Ruby Rogues described using puts as the best way to debug Ruby code, in same way, I wanted a way to log JavaScript debug statements in Firebug’s console. The main purpose of jsdebug-rails is to provide console wrapper for debug statements in JavaScript code during development, thus including the file name, line number, and comment/object. Once in production, all debug statements are removed from the JavaScript source code, so they are never processed and much smaller.”

methodfinder developed by Michael Kohl. In his own words – “This isn’t the first Ruby port of Smalltalk’s Method Finder, but I couldn’t find the old one and so decided to write one myself for the benefit of the RubyLearning.org students. After being mentioned on the Ruby5 podcast, the project really caught on and I got some very good feedback as well as feature suggestions and patches. The main purpose of the library is helping new Rubyists find methods they didn’t know about. There are various usage examples in the README.”

omelettes developed by Mark Simoneau. In his own words – “omelettes is a low-to-no configuration database obfuscation gem for ActiveRecord that allows you to remove sensitive data from the database and replace it with meaningless words that are the same length. It also integrates with Faker automatically and allows for full configuration in a single file.”

rails-web-console developed by Rodrigo Rosenfeld Rosas. In his own words – “Some time ago I was planning to write an article comparing Rails and Grails and while trying to figure out what I did find useful in Grails that was missing in Rails, the only thing I could think of was the console plugin for Grails. So, I decided to write one for Rails, which took only about an hour… It is just an interface for running Ruby commands in the context of a controller of the application. Think of it as the Rails console on the web, although with no auto-complete (yet). I was planning to add auto-complete, syntax highlight and other features, but couldn’t find any free time for doing that. There’s already some auto-complete examples in Github using websockets as well as Javascript code highlighters for Ruby available on the Internet. It is just a matter of getting some free time… :) Since I prefer the new Hash declaring syntax, this gem is not compatible with Ruby 1.8, but there’s a fork of it just for allowing its usage in Ruby 1.8 (replacing the new Hash declaration style with the old one).”

secretsharing developed by Alexander Klink. In his own words – “It is not on GitHub, but I’m hosting the git repository myself. Cryptographic secret sharing is one of the lesser known cryptographic techniques. I’ve implemented the most prominent version, which was invented by Adi Shamir (the »S« in RSA) and can be used to share a secret (such as a password) between a number of people (let’s call that n) and only recover it if a certain number (k <= n) come together and combine their secret shares. Interestingly enough, if less than k people combine their shares, they learn nothing at all (in an information-theoretic understanding of »nothing«).”
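The idea is compact enough to sketch in plain Ruby: choose a random polynomial of degree k-1 whose constant term is the secret, hand out points on it as shares, and recover the constant term via Lagrange interpolation at zero. (This is a toy over a small prime field for illustration only; it is not Alexander’s gem and not production cryptography.)

```ruby
# Toy Shamir secret sharing over the prime field GF(P). Any k of the n
# shares reconstruct the secret; fewer than k reveal nothing.
P = 2_147_483_647 # a Mersenne prime, large enough for a demo

def make_shares(secret, k, n)
  coeffs = [secret % P] + Array.new(k - 1) { rand(P) } # degree k-1 polynomial
  (1..n).map { |x| [x, coeffs.each_with_index.sum { |c, i| c * x.pow(i, P) } % P] }
end

# Lagrange interpolation at x = 0 recovers the constant term, the secret.
def recover(shares)
  shares.sum do |xi, yi|
    others = shares.map(&:first) - [xi]
    num = others.reduce(1) { |acc, xj| acc * -xj % P }
    den = others.reduce(1) { |acc, xj| acc * (xi - xj) % P }
    yi * num % P * den.pow(P - 2, P) % P # pow(P - 2, P) is the modular inverse
  end % P
end

shares = make_shares(424_242, 3, 5) # k = 3 of n = 5
recover(shares.first(3))            # any three shares give back 424242
```

Two shares evaluate to points on infinitely many degree-2 polynomials, which is exactly the “learn nothing at all” property Alexander describes.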

sequel-jdbc-hxtt-adapter developed by Colin Casey. In his own words – “I had to do a lot of work with MS Access databases at my previous job and I found that the options for interacting with these databases programatically on the Windows platform left a lot to be desired. JRuby was beginning to make Ruby development on Windows a lot less painful so, using that interpreter and a pure JDBC driver, I created a Sequel adapter for working with MS Access files.”

SGFParser developed by Aldric Giacomoni. In his own words – “SgfParser allows you to parse, create and save SGF files. SGF (Smart Game Format) is a plain text format used to record moves in various board games, most famously Go/Weiqi/Baduk but also Chess and Backgammon. When I started this project, there was only one available, and it was a relatively tricky-to-find Ruby file online; no gems. This one aims to be the fastest (currently parses a 1.2 Mb SGF in ~3 seconds) and eventually also keep track of all the potential parsing or format errors that may have occurred without dying.”

standalone-migrations developed by Todd Huss. In his own words – “standalone-migrations allows you to easily use Rails migrations in non-Rails and non-Ruby projects. My company, Two Bit Labs, develops iOS and Android apps for companies looking to go mobile and we usually use Rails on the server side. However, some of our clients use other server side platforms and we noticed that often our non-Rails clients struggled to effectively version and manage their database schema. So in 2008 we created standalone-migrations so non-Rails shops can enjoy all the database migration goodness that Rails offers. It’s been great to see standalone-migrations grow over the years with an active community!”

xapian_db developed by Gernot Kogler. In his own words – “xapian_db is a Ruby gem that combines features of nosql databases and fulltext indexing. It is based on Xapian, an efficient and powerful indexing library.”

I’ll be updating this page from time to time. If you have written a Ruby Gem and want to showcase it here, please email me at satish [dot] talim [at] gmail.com. If you know of some real useful, ‘hidden’ Ruby Gems, please let us know by commenting on this blog post. Thank you for your time and help.


US Default looms large as Politicians squabble

I turned on Meet the Press at 8am Pacific time this morning to see one of the Gang of Six declaring that he would not agree to severe cuts in spending so long as President Obama insisted on raising taxes for the wealthy. I had to shake my head and clear my thoughts as my brain started to focus on what I was hearing.

The Guardian leader that has appeared in the UK publication’s Monday morning edition (see graphic) makes clear the fears others in the world have for the consequences for the world economy if the US defaults on its debts.

The song that came to my mind was “The Lunatics have taken over the Asylum”. And by this I don’t mean only republicans but politicians in general and the asylum that is Washington DC right now.

As a recent US Citizen I view these events partly through the eyes of a Brit. I’m 54 years a Brit and only 2 an American.

America is in relative decline as the world’s strongest nation. China is growing in both absolute and relative terms. The failure to decide on an approach to paying off the national debt is merely a symptom of the American political class failing in the game of world leadership. Petty domestic squabbles dominate debate even as major global events unfold. The thought that best sums up the future of the global economy, and with it humanity, under the leadership of these politicians is “Oh Shit!”.

Here’s hoping somebody reads this and remembers the world-scale consequences of the failure of American leadership. It isn’t about the Congress or about the White House. It’s about the whole world and the rest of this century as the world copes with America’s decline as the unchallenged global leader.

Ruby 1.9.2-p290, Rack 1.3.1, Amazon Ruby SDK, Rails 3.1 hackfest

Topics for the week are Ruby 1.9.2 p290, Rack 1.3.1, Amazon’s official Ruby SDK, a book review, and the usual round up of interesting gems and projects.

In Which I Blather About Self-Publishing

So I tend to keep an eye on interesting things in the Ruby self-published technical book space.

This isn’t exactly recent, but I did want to mention and endorse Avdi Grimm’s Exceptional Ruby. This is exactly the kind of thing that should be happening in the self-publishing space. It’s a brief, thorough exploration of a very specific topic, in this case error and exception handling in Ruby. You may think you understand Ruby’s error mechanisms, but I’m pretty sure that unless you are actually Avdi, you will learn something both about the mechanics of Ruby’s exception handling and about how best to robustly integrate error management into your code.

The book is clear and authoritative, not least because Avdi is up front about certain code patterns that he is presenting but has not had wild amounts of experience with. I like when a technical author is comfortable enough to admit less than perfect omnipotence.

Anyway, it’s $15 at both exceptionalruby.com and at Pragmatic – Avdi worked out a deal where Pragmatic is co-distributing the book. Unlike what I did with Rails Test Prescriptions, Exceptional Ruby is just being distributed by Pragmatic – it didn’t go through Prag design or edits, and Avdi also sells it on his own. This strikes me as a very interesting experiment, and I hope it works out for both sides.

Which brings me to the latest from Thoughtbot, namely a new ebook on Backbone.js and Rails.

I need to start with a disclaimer – there’s a sense in which this book partially competes with my Not Completely Announced JavaScript book. Also, I haven’t read what they have yet, although quibbles aside, there’s a good chance that I’ll buy it. Also, I use all kinds of Thoughtbot tools and I have a lot of respect for them as a team.

Okay, we clear?

Thoughtbot is producing a new ebook – apparently as a group, because they aren’t listing specific authors. (As a matter of marketing, and consumer confidence, I really recommend that they list the authors…). They have an extensive outline on the site, although they don’t estimate the final page count. The distribution model is interesting – when you pay, you get access to a github repo containing up to the moment source code that can be converted to a variety of formats. (Though it is unclear if any format other than HTML is supported immediately). The outline appears to be solid, can’t quibble with any of that yet.

Now how much would you pay?

It’s $39. Until August 1, when it goes up to $49.

That is a bold pricing strategy.

Which doesn’t mean it won’t work out fine for Thoughtbot.

To put that $49 in context, that’s more than double most Pragmatic ebooks. It’s almost triple what O’Reilly sells their new jQuery Mobile book for, which would seem to be a reasonable comparison, topic size wise. And, of course, it’s five times what I self-published for a few years ago. More on that in a second.

Offhand, I can think of three reasons why they might try that pricing strategy:

  • Value pricing: if this book saves you an hour of research or coding, then that easily covers the price for most purchasers. Which is true, if not exactly how the market prices technical books. But if you look at it as a fraction of what a typical training course costs, then it makes more sense.

  • Profit pricing: Thoughtbot may have decided they need to price at this level to make the cost in time spent on the book worthwhile to them.

  • Boutique pricing: Thoughtbot may want to place a marker down as experts in this space, but limit their customers to people who are seriously interested in both the subject and the somewhat interactive model they have chosen for creating the book.

All these arguments seem valid, or at least defensible, and I have no idea what Thoughtbot’s strategy actually is. I’m really curious to see what happens, though. I’ll report back after I buy it and they get some time to put content in place.

It seems to me that part of the marketing side of self-publishing a book is giving potential readers as few reasons as possible not to buy the book. Right now, there are a lot of reasons here – the price, the lack of a sample, the lack of an author or publisher who might have a track record (although obviously Thoughtbot itself has a great reputation, and this will be more meaningful to some people than others). Some of this will correct itself over time – if things are good, they’ll get some word of mouth – testimonials would be a great thing on their website.

Which is a roundabout way to say a couple of things about self-publishing based on my own experience.

First is that it’s wildly clear that I chickened out on the pricing. (Rails Test Prescriptions was $9 when I self published it.) I could make up something about how I was pursuing a mass-marketing strategy and blah, blah, but the simple fact was I was petrified that nobody would see value in it if I went above $9. In retrospect, I probably could have gotten away with a higher price.

The other mistake I made – or what could potentially have been a mistake – was to not set a finite goal. Avdi, I think, has this exactly right. He took a specified chunk, wrote about it, and called it a day. When I first started with RTP, I said a lot of things about how I would keep the book updated. It didn’t occur to me until after I started selling that I was setting myself up for an infinitely long task. Again, though, I was afraid that there would be a lack of perceived value in a book that ended, and obviously one of the potential advantages of an ebook is ease of updating.

One thing the Thoughtbot team has right, though, is an easy way of updating – their book will be a git repo. I hope, though, that they’ll separate in-progress from edited and reviewed by putting them on separate branches. I messed this up a bit by assuming that my distribution channel could handle it and then having to come up with a stopgap fairly quickly.

I will say, though, that publishers and editors are somewhat like the government in that when they do their jobs well, you don’t realize how much value they can add. There’s a process for doing things that the author might not want to be involved in – design, indexing, distribution.

I don’t have a grand conclusion here – I’m very interested to see all kinds of experiments with self-publishing, I don’t think anybody really knows how things will play out.

Filed under: eBooks

A Review of The Book of Ruby – Pleasant Prose Meets Car-Crash Code

I don’t like being negative on Ruby Inside without good reason. Trivia like DHH’s test library preferences can provide a fun talking point, but pointing out specific flaws in someone’s work is rarely insightful.

I wasn’t going to publish a review of this book, but when I discussed the issues with people on IRC, Twitter and e-mail (trying to find someone who’d say something good about it), I was told several times to publish my thoughts, primarily to serve as a warning to newcomers who may pick it up. I agree.

What is The Book of Ruby?

The Book of Ruby is a new Ruby book published by No Starch (who, as a publisher, I love – The Linux Programming Interface is one of the best books I’ve ever read) and written by Huw Collingbourne. It came out this month (July 2011) and is available in print and e-book formats as well as on the Safari Books subscription site.

While other books focus on Ruby’s trendier features, The Book of Ruby reveals the secret inner workings of one of the world’s most popular programming languages, teaching you to write clear, maintainable code.

Sales page for TBOR

This book was spawned from a freely available PDF Huw wrote in 2009, also called The Book of Ruby. (Notably, it’s included with the Windows-based RubyInstaller.) But you can’t just follow along with this review using the free PDF since it has been rewritten, tech-reviewed (supposedly by Pat Eyler) and includes fresh stuff on Ruby 1.9 and Rails 3. The TOC remains similar, nonetheless.

Who’s Huw?

Huw’s main claim to fame is as author of Ruby In Steel, an awesome Ruby development environment for MS Visual Studio. I’ve written about it a number of times and continue to recommend it.

Despite what I say about the book, I have nothing against Huw (having only traded a few e-mails) and have admired his work from afar. I can’t believe how he gets so much done with his Visual Studio extensions which, I emphasize, are great (if you’re on Windows).

But let’s get to business.

So, The Actual Review

First impression: Huw writes well. He’s never overly familiar, nor dry. He has a clear interest in Ruby and proceeds at a good pace through a wide selection of topics that’ll be of interest to both Ruby newbies and more advanced developers.

Sometimes the coverage is a bit shallow. Despite the first chapter being called “Strings, Numbers, Classes, and Objects” — a pretty wide range of topics — it lasts a mere 13 pages. The “Numbers” section is 3 paragraphs long. This struck me as odd for a book whose sales blurb says: “The Book of Ruby reveals the secret inner workings of one of the world’s most popular programming languages”

The book is interesting as something to browse through or if you’re an experienced developer from another language who’s OK with learning one (frequently non-standard) way of doing something in Ruby. Huw moves quickly and frequently probes into some interesting elements of syntax and underlying language functionality on the way (mostly in the Dig Deeper sections at the end of each chapter – a nice touch).

Some chapters are quite strong and dig into some interesting crevices. The chapters on Symbols, YAML (which rarely gets much coverage in other books), Marshal (ditto), Threads, and Conditional Statements are pretty solid and you’ll pick up some interesting things to remember. Little more than you’d pick up from The Ruby Programming Language or the Pickaxe, though.

At other points, things get a bit confused:

This is perhaps the first time I’ve seen an author admit that they don’t know the answer to a verifiable and straightforward problem in a book that claims to reveal “secret inner workings.” While the author says that I can’t get Ruby to tell me, Ruby will certainly do so (this will work on MRI Ruby 1.9):

require 'ripper'; require 'pp'; pp Ripper::SexpBuilder.new("puts{}.class").parse

(A note for the intrigued reader: {} is being treated as a block being passed to puts. puts returns nil and then class is being called on that.)

Despite the odd confusing moment, though, the general problem with this book doesn’t ultimately lie in the author’s writing style or even his approach, which varies from fair to great. The problems center on something potentially more important than all of that.

Inconsistent Code and Style

The code in the book is inconsistent not only in regard to established Ruby style but from page to page of the book itself. It jumps between conventions even on basic issues (and this is only scratching the surface):

puts 'hello world'              # on page 1
puts( "Hello #{name}" )         # on page 2
puts(Class.to_s)                # on page 12
puts("Thing.initialize: #{self.inspect}\n\n")    # on page 25
abc(a, b, c ){ puts "four" }    # on page 166

The author makes a point on several occasions about how parentheses reduce ambiguity in the code, but if you like Lisp you’ll love this book because there are parentheses almost everywhere. Except.. when they seem to get forgotten. You’ll see one code sample full of them in a particularly un-Rubyish fashion and on the next page with none. Odd.

Similarly, the code has wandering indentation. Rather than the 2 spaces to which Rubyists are most accustomed, the book uses 4 spaces. Sometimes 2. And sometimes 6. Actually, just pick a number between 1 and 6:

Also be prepared to get almost no grounding on variable or method naming conventions. The author isn’t keen on them and instead switches between snake_case, allinasinglewordcase and CamelCase on a whim.

When challenged on some of these issues elsewhere (notably on Reddit), the author pointed to an article he wrote called Programming With Style in which he said:

So, when I switch from one programming language to another do I change my coding style to fit the language? The answer is: up to a point. Or, to put it another way: as little as possible.

Huw Collingbourne

It’s no surprise, then, that the style in this book is not only un-Rubyish but that it’s not even consistent to any other language, it’s a mishmash of Smalltalk, Java, C, and some odd language that has no consistency in indentation. Chad Perrin wrote an article about the issue in response to Huw’s comments.

Chad Perrin sagely notes:

I may disagree with Huw Collingbourne’s choice of coding style, but do not much care if he uses it for his own private purposes. I just care that he replaces idiomatic style in a book designed to impress good programming practice on new students of the Ruby language. Even that was not the reason I objected to the coding style in my review, though: ultimately, the reason I rated the coding style poorly in the context of that book is that it becomes less readable for me, thus dissuading me from buying it.

Chad Perrin

And Just Plain Weird Stuff

Some things in The Book of Ruby are just plain weird.

The author is not keen on do .. end and brings out the curly brace form (usually reserved for one-liners) in almost every case. Coupled with that, he often (though not always) avoids putting block arguments on the same line as the start of the block. This leads to really weird-looking programs. I have to show you an example. I’ve taken a screenshot of the book just in case you don’t believe me, but this is code straight from page 173:

Any competent Rubyist could read and understand this code but, and I know I’m not the only one here, seeing this sort of code rapidly leads to thoughts of “Who wrote this!?” and “What’s the deal here?” Style is not solely an anally-retentive attempt to get everyone writing the same way. It’s also a way to recognize who follows reasonable conventions and an indicator (like it or not) that can cause us to form a premature opinion about someone’s competence.

Again:

I’m not in the mood to trawl through the entire book picking it apart, but any experienced Rubyist will find a lot of nits to pick, including:

  • The term ‘interpolation’ is never mentioned. It’s introduced as ‘embedded evaluation’ and referred to as ‘embedded’ code throughout the book.
  • The %w array creation technique is explained as being “unquoted text separated by spaces between parentheses preceded by %w”, although parentheses, in particular, aren’t mandatory at all.
  • Extra thens all over the place. But not always. Just sometimes.
  • The “Iterating over Arrays” section explains one way to iterate over an array: the for .. in .. loop.
  • An attempt is made to see if an object has a singleton method called “congratulate” by using item.singleton_methods.include?("congratulate") – this won’t work in Ruby 1.9 since singleton_methods returns an array of symbols, not strings. This matter is only cleaned up 2 pages after it’s used.

Many code examples are a little odd or inconsistent even given the context of the writing surrounding them. Just a handful:

Chad Perrin talks about this more in his review of what he calls The Book of Weird Ruby.

Who Should Buy It?

If you’re an intermediate or expert Rubyist, you’re probably going to pick up or be reminded of something useful, so it might help fill a few holes in your knowledge. It might also act as a grim reminder of why teaching and maintaining a consistent style is important.

If you’re a total Ruby newbie, I think you’ll learn too many bad habits from this book for me to recommend it with any sincerity. If you’re still interested, though, you’ll already need to know what things like variables, methods and strings are (all are mentioned on the first page without any explanation), so it’s not for total programming newbies. If you’re willing to ignore the book on matters of style, give it a try; there’s enough interesting stuff to see… but you could just as easily read Eloquent Ruby instead and pick up things from both angles.

The Chapters

Here’s a brief outline of the main contents (you can get a deeper look in this PDF):

  1. Strings, Numbers, Classes, and Objects
  2. Class Hierarchies, Attributes, and Class Variables
  3. Strings and Ranges
  4. Arrays and Hashes
  5. Loops and Iterators
  6. Conditional Statements
  7. Methods
  8. Passing Arguments and Returning Values
  9. Exception Handling
  10. Blocks, Procs, and Lambdas
  11. Symbols
  12. Modules and Mixins
  13. Files and IO
  14. YAML
  15. Marshal
  16. Regular Expressions
  17. Threads
  18. Debugging and Testing
  19. Ruby on Rails
  20. Dynamic Programming

In Conclusion

It’s pleasant to read Huw’s writing. He panders to the reader just the right amount and strikes a good balance between being over-familiar and dull. You’ll enjoy his explanations and find his pacing pleasant for the most part, even if the depth isn’t always there. As Steve Klabnik said on a ruby-talk discussion about the book:

I’ve actually read the book (admittedly skimmed in parts), and it’s a fine book, with one exception: The author uses a very non-standard coding style. You can see it in the example chapter. So, good for learning, except no Ruby code you ever read will look like that.

Steve Klabnik

What kills the book, however, is its disregard for code consistency and long-standing Ruby style conventions. More worryingly, the author knows this and seems not to care. Tab sizes jump from 4 to 6 and back again, there’s bad spacing all over the place, there’s no consistency with variable name formatting or the use of parentheses, and, in general, the style used is little like any Ruby code I’ve come across before (and I review and code-walk Ruby code from hundreds of developers on a regular basis). If you learnt Ruby only from this book, you’d pick up a lot of bad habits to correct later. I can’t believe for a minute that the named tech reviewer, Pat Eyler (a stand-up guy in the Ruby world), signed off on all of this without a fight.

Another comment from Reddit:

When learning a new language we apply the idioms of what we know best from our past into this new language until we’ve learned the new language well enough to be fluent and use it’s own idioms. However, I don’t think any of us(except you Huw) would be so bold as to write a book half way through this process, having given up on learning idiomatic ruby and attempting to convince the rest of the world that your way is the best way.

crassnlewd

This is the first No Starch book that has given me any doubts about their editorial process, and I hope it’s a mere aberration. The last two books I read from them, Eloquent JavaScript and The Linux Programming Interface, seriously impressed me and continued the positive impression I’d formed over the years.

Usually this is where I’d link to ways to get the book. I’m not going to elaborate on this, however, as I don’t want to profit from it. Google “The Book of Ruby” or find the link to No Starch’s product page at the start of this review if you want to grab a copy.

Lastly, if you think this review is missing the mark, feel free to leave a comment or head over to Chad Perrin’s review, which provides a slightly different take.

Fifteen Protips for Conference Speakers

Do you dream of someday speaking at a technical conference? Have you spoken at a conference but felt like your journey to the podium wasn’t as smooth as it might have been? Well, here are fifteen tips to make things go smoothly and endear you to your conference organizers.

I’m writing this from the perspective of a conference organizer where my main focus is the technical program. I’ve run into a lot of these issues when putting together Golden Gate Ruby Conference, and also seen things from the other side when speaking at other conferences.

A lot of this list is about not being a problem for the conference organizers. I hope that doesn’t come off as too negative, but I figure most speakers don’t realize the potential impact of seemingly little things. Making things easier for the organizers makes for a better conference for everyone, and your presentation will be even more awesome.

  1. Respect your conference organizer’s time. Organizing a conference is far more work than you realize, and for small, regional conferences it’s usually volunteer work. Managing the program is extra fun because dealing with a bunch of speakers makes herding cats look as easy as napping on the beach. There are speakers I’ll never have speak at my conference again because they are too hard to manage, even though they are awesome on stage. A good organizer will respect your time, and you should do the same in return.

  2. Respond to all emails promptly. Read the whole email, and answer every question asked of you. This may seem like kid’s stuff, but you’d be amazed at how many times I email a speaker and they never reply, reply without answering important questions, or miss the point of the email entirely. Then I have to send another email or two. Multiply that by a dozen or two speakers and you can see how that can create a lot of extra work. (also see #1)

  3. Get a good headshot. Any conference will probably want a photo of you for the website. Some confs want “professional” (i.e. boring) photos, while others like shots that show more personality, so maybe you want to have more than one handy. Either way, you want a photo that shows your face well.

  4. Have someone else write your bio. Most confs want a short bio of you for the website/program. Most people hate writing those things about themselves, so get someone who knows you to write one for you. Remember, this isn’t a resume to get a job. The point is to tell people why they should care about what you have to say.

  5. Don’t announce that you are speaking until after the conference does. Alright, some conferences won’t care about this at all, but most will want to manage their own publicity and control the timing of announcements. And sometimes speakers aren’t all informed at the same time whether their talk was accepted, so making your own announcement can confuse things.

  6. Do announce you are speaking! Once you know it’s cool to announce, do it! Conferences love the publicity, and you will too. Tweet it, blog it, Facebook it…

  7. Proactively communicate any special requirements you may have for your talk, scheduling, etc. It’s usually simple to deal with requests if they come early enough, but can be impossible if they come the day of the conference. Things that might require special attention:

    • you can only speak on one of the days of a two-day conference
    • you need an uncommon connector for your laptop
    • you need the house lighting dimmed during your talk
    • you will have extra people on stage who also need microphones
    • you need a table on stage with power strips for your science experiment
    • you need a wireless microphone so you can stroll around the audience
    • you need a bar stool because you can’t stand for 45 minutes
    • you need wheelchair access to the stage
    • your talk requires network access
    • you need a lot of network bandwidth for your talk
  8. Prepare your talk in advance. You don’t want to be that guy who gets up on stage and says, “Sorry I didn’t have time to prepare my talk, so I’m just winging it.” Hundreds of people are giving you their valuable time to see your talk. The least you can do is respect them enough to prepare in advance. You’re also better off preparing your talk before the conference starts. Take it from someone who spent most of a RailsConf working on his talk instead of seeing other talks and enjoying the conference.

  9. Have awesome, readable slides. You can read up on how to make readable, attention-grabbing slides that effectively support your presentation. Please do. You can start with Shane Becker’s Better Presentation Slides lightning talk from GoGaRuCo 2010 (at the 45:05 timecode).

  10. Send your slides to the organizer. PDF is usually a good common denominator format, but including the original helps if your presentation has builds, video, etc. Sending multiple formats is great to cover all your bases. Some confs ask for your slides in advance, but that seems far less common these days when you don’t have an A/V team running your slides for you, so don’t forget to email your slides when you’re done with your talk. Even if you are posting your own slides online, send the PDF/originals to the conf as well so they don’t have to find them online to get them.

  11. Practice your talk. Run through it several times. Do it facing yourself in a mirror. If you can video yourself and watch it, that’s really helpful too. If you don’t have a lot of experience speaking, try out your talk with coworkers or friends who can give good feedback. And don’t be afraid to modify your talk based on feedback – that’s why they do previews for theatrical productions.

  12. Get some sleep. Don’t stay up to all hours partying the night before your talk. Nobody wants to be that guy who drunkenly fell off a fire escape and has to wear giant sunglasses to hide the black eye. You want to show up on time, rested, and raring to go.

  13. Don’t flake out. There’s nothing worse than not showing up. Canceling at the last minute is nearly as bad. If you think it’s likely you’ll have to cancel, don’t commit to doing it. If something comes up and you can’t make it, let the conference know as soon as you can.

  14. Look your best. For many conferences (GoGaRuCo included) wearing jeans and a geek t-shirt is great, while others want something a bit more formal. But even if you just do jeans and a t-shirt, you want something that you feel great wearing. A good rule of thumb is to dress one level up from how you’d dress as an attendee. Also, take off your conference badge when on stage – it’s distracting and looks bad.

  15. Have fun! Odds are you aren’t getting paid to speak, so you might as well enjoy yourself! Seriously, you’ll do a better job and be a more effective speaker if you are enjoying what you’re doing.

Hosting San Francisco Rails 3.1 Hackfest

The Rails community is making the final push to get 3.1 out and is looking for your help! As part of a worldwide effort over the weekend, Heroku is hosting a local hackfest to help finalize Rails 3.1.

On Saturday, July 23rd from 12pm to 5pm, Heroku will be hosting a gathering for the Rails 3.1 Hackfest. We’re looking for people who want to improve things at all levels of the Rails stack, from debugging to documentation. Come with apps to upgrade to Rails 3.1. We’ll also be working on getting Rails 3.1 apps running on Heroku’s Celadon Cedar stack. If you haven’t done this yet, don’t miss the opportunity!

The Rails 3.1 Hackfest will be at our San Francisco office:

321 11th St in SOMA

Saturday, July 23rd, 12pm to 5pm

Beer and pizza will be provided! Make sure to let us know you’re coming so we have enough food, and we’ll see you on Saturday.

See the official announcement

#275 How I Test

Here I show how I would add tests to the password reset feature created in the previous episode. I use RSpec, Capybara, Factory Girl, and Guard to make request, model, and mailer specs.

Cryptography Or: How I Learned to Stop Worrying, and Love AES

This guest post is by Phillip Gawlowski, who is living in the German wilderness of Oberberg near Cologne. Phillip spends his time writing Ruby as a hobby just for fun. He tries to make life a little easier for himself and for others when he is crazy enough to release his code as open source. He’s neither famous nor rich, but likes it that way (most of the time). He blogs his musings at his blog.

A friend gave you the plans for Dr. Blofeld’s newest Doomsday Device. Over the engine noise of his Aston Martin, he tells you: “Send this to offers@universal-exports.co.uk, and make sure it arrives there intact!”

All you have is a laptop, wonky Internet access, and Ruby. What to do?

AES For Safety, SHA2 For Integrity

You now have two goals:

  1. Make the Doomsday Device plans unreadable, and
  2. Ensure that the data has arrived at its destination without error.

Fortunately, Ruby provides an API to OpenSSL, a well-tested, widely used library and set of tools used for encryption of all kinds, and includes its own implementations of several cryptographic hashes.

In this article we will use AES for de- and encryption, and SHA2 to hash data.

Using SHA2

Like many things, Ruby makes creating crypto-hashes easy:

require 'digest/sha2'
sha256 = Digest::SHA2.new(256)
sha256.digest("Bond, James Bond")

The SHA2#new call takes the bit length we want our hash to have. SHA2 exists in two variants: 256, also called SHA256, and 512, called SHA512. A longer digest takes longer to calculate, but is also much more difficult to attack with a rainbow table or other cryptanalysis.

Once we have our SHA object, we pass a String of data to #digest to have the hash of that data returned as a String.
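As a quick sanity check of the two variants (a minimal sketch; the digest lengths follow directly from the names):

```ruby
require 'digest/sha2'

# SHA256 produces a 32-byte (256 bit) digest; SHA512 a 64-byte one.
sha256 = Digest::SHA2.new(256)
sha512 = Digest::SHA2.new(512)

puts sha256.digest("Bond, James Bond").bytesize   # => 32
puts sha512.digest("Bond, James Bond").bytesize   # => 64

# #hexdigest returns the same hash as a hex String, two characters per byte.
puts sha256.hexdigest("Bond, James Bond").length  # => 64
```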

You can call the #digest method directly when you are working with MD5 or SHA1:

require 'digest/md5'
Digest::MD5.digest "Bond, James Bond"

The Advanced Encryption Standard

Theory

As AES is a so-called symmetric-key block cipher, it operates on chunks of data, called blocks, and applies the provided key to each block to create de- and encrypted output. The use of the same key for encryption and decryption is what makes the cipher symmetric. Conversely, asymmetric ciphers use different keys for decryption and encryption: usually a private key known only to the recipient to decrypt, and a public key known to anyone to encrypt. SSH, SSL/TLS, and PGP are examples of this kind of cipher.

The AES family has three key sizes: 128 bit, 192 bit, and 256 bit (the block size is always 128 bits). Just as with SHA2, you’ll find AES-128 or AES-256 being used to describe the particular key size in use.

The downside to this approach is that the same key is used for each block of data, which weakens the encryption (the same data is encrypted in the same way!). The solution is to use a so-called “mode of operation”, which chains the cipher’s output from block to block so that it becomes indistinguishable from noise.

A full discussion of methods of operation and their strengths and weaknesses would go well beyond the scope of this article, however.
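Still, the weakness itself is easy to demonstrate. Here is a small sketch (mine, not the article’s; the cipher names are standard OpenSSL ones): encrypt two identical plaintext blocks with ECB, which applies the key to each block with no chaining, and then with CBC, a chained mode seeded by a random IV.

```ruby
require 'openssl'

key   = OpenSSL::Random.random_bytes(32)   # 256 bit key
block = "A" * 16                           # one 16-byte AES block
data  = block * 2                          # two identical blocks

# ECB applies the key to each block independently, so identical
# plaintext blocks produce identical ciphertext blocks.
ecb = OpenSSL::Cipher.new("AES-256-ECB")
ecb.encrypt
ecb.key = key
ecb_out = ecb.update(data) + ecb.final
puts ecb_out[0, 16] == ecb_out[16, 16]     # => true: the pattern leaks

# CBC XORs each block with the previous ciphertext block (seeded by
# the IV), so the same two plaintext blocks encrypt differently.
cbc = OpenSSL::Cipher.new("AES-256-CBC")
cbc.encrypt
cbc.key = key
cbc.iv  = OpenSSL::Random.random_bytes(16)
cbc_out = cbc.update(data) + cbc.final
puts cbc_out[0, 16] == cbc_out[16, 16]     # => false: blocks differ
```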

…And Practice

Now let’s take a look at Ruby’s encryption API:

require 'openssl'
require 'digest/sha2'

payload = "Plans for Blofeld's newest Doomsday Device. This is top secret!"
sha256 = Digest::SHA2.new(256)
aes = OpenSSL::Cipher.new("AES-256-CFB")
iv = OpenSSL::Random.random_bytes(16) # one AES block; rand is not crypto-safe
key = sha256.digest("Bond, James Bond")

aes.encrypt
aes.key = key
aes.iv = iv
encrypted_data = aes.update(payload) + aes.final

puts encrypted_data

Since Ruby’s OpenSSL API is pretty straightforward (and so is the OpenSSL API itself, if you would like to use OpenSSL from C code), we will only discuss what’s really important.

OpenSSL::Cipher.new("AES-256-CFB") sets up an AES object with a 256 bit key and the CFB mode of operation. To find out which ciphers are supported, call OpenSSL::Cipher.ciphers, which returns the names of the ciphers the class understands.

The iv variable stores our random Initialization Vector: random data that seeds the mode of operation to ensure that each block is encrypted uniquely, and thus (hopefully) indistinguishable from noise.

We also take advantage of SHA2’s 256 bit variant to generate a 256 bit key from a simpler password. AES-256 expects a 256 bit encryption key, and since creating a 256 bit key by hand is pretty difficult, we let the computer do the job. When used in production, you most likely want to add a salt to the hash, or use a user’s already-hashed password.
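One way to do the salting properly, sketched here as a suggestion rather than the article’s method, is PBKDF2, which Ruby’s OpenSSL bindings expose; the iteration count below is arbitrary:

```ruby
require 'openssl'

password = "Bond, James Bond"
salt     = OpenSSL::Random.random_bytes(16)  # store this alongside the ciphertext

# Derive a 32-byte (256 bit) key; the iteration count deliberately
# slows down brute-force guessing of the password.
key = OpenSSL::PKCS5.pbkdf2_hmac_sha1(password, salt, 20_000, 32)
puts key.bytesize  # => 32
```

Given the same password, salt, and iteration count, the derived key is always the same, so the recipient can reproduce it.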

With the #decrypt and #encrypt methods, we put our AES object into the proper state. Behind the scenes, this initializes OpenSSL’s encryption engine. These two method calls are required before any other method call!

Last but definitely not least, the #update and #final methods are where the encryption actually happens. The more data you have, the longer the chunks, and the more complex the cipher, the longer this will take. The #final method does the same as #update, but adds padding to a chunk to bring it up to the required block size.

In case you make a mistake, or want to do another round of encryption or decryption, the #reset method can reset a Cipher object.

Decryption works pretty much the same as encryption, except that we pass the encrypted data to the #update-method:

aes.decrypt
aes.key = key
aes.iv = iv
puts aes.update(encrypted_data) + aes.final

Note, however, that both the key and the IV must be the same, and thus have to be stored or transmitted to the recipient of the encrypted data!

Verifying Integrity

As we’ve already seen, a hashing algorithm can turn data of arbitrary length into a fixed-length, unique stream of bytes. This can be used for password storage, to generate more secure keys for encryption, or, since the output of a hash algorithm is deterministic (it’s always the same for the same input), as an integrity check.

If you’ve downloaded a Linux distribution or other software, you have already seen this, in the form of MD5 digests, with which you can verify that a download is complete and error free, like on Ruby’s homepage.

We will do the same with our encrypted data, as a poor man’s message authentication code–a technique in cryptography to ensure that a message has not been tampered with:

poor_mans_mac = sha256.digest(encrypted_data)
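If you want more than a poor man’s MAC, Ruby’s OpenSSL bindings also provide a real HMAC, which keys the hash with a shared secret so that only a key holder can produce or verify the tag (the secret and message below are placeholders):

```ruby
require 'openssl'

secret  = "a pre-shared secret"        # assumed known to sender and recipient
message = "the encrypted payload bytes"

# HMAC mixes the secret into the hash; an attacker who alters the
# message cannot recompute a valid tag without the secret.
mac = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, message)

# The recipient recomputes the tag over what arrived and compares.
puts mac == OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, message)  # => true
```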

Now all that’s left is to send an email to James’ employer with the Doomsday Device plans, and to give them a call to give them the IV and key.

Closing Remarks

Think of the Future

Security is not a state, it is a process. You should write your security-aware code in such a way that you don’t depend on a particular cryptographic algorithm. Ruby’s API (and OpenSSL’s own API) wrap encryption abstractly, so that you can swap out the algorithm you use at any time. This is also necessary for hashing algorithms: While there are no feasible attacks against SHA2 yet, the cryptanalysis only gets better over time, as the histories of MD5 and DES show.

Schneier’s Law

Schneier’s Law states that “any person can invent a security system so clever that she or he can’t think of how to break it.” This is why Ruby’s developers use OpenSSL to do encryption, a widely tested and certified (in some variants!) cryptographic library, instead of writing their own library.

A mistake in your implementation can compromise your data and your customers’ data, since so-called “side channel attacks” are used as a matter of course to attack cryptography.

Encryption Does Not Mean You Are Safe

It is important, and I cannot stress this enough, that you do not store encrypted data and the keys to access it on the same machine (ideally, you don’t store these things on the same network!), or do your encryption and decryption on the same machine that you store your encrypted data on. Whole libraries have been filled with books on how to design a secure system, from hardware to software. Above all, security is a mindset, and you have to be properly paranoid to secure your data and access to that data. If you deploy, or are about to deploy, security-relevant code, sooner or later you should have it tested by outsiders. Penetration testing is worth your while.

Asymmetric encryption was invented to solve one problem with symmetric encryption: with an asymmetric cipher, the secret key never needs to be transmitted. However, asymmetric ciphers have their own set of trade-offs (key trust and computational efficiency, among others).

The Safest Data is No Data

Just as the fastest code is no code at all: if you don’t store data you don’t absolutely, positively have to store, don’t even bother with it. What you don’t have can’t be compromised.

Conclusion

This article is nothing but a superficial introduction to encryption in Ruby. There are dozens of standards and regulations that govern this vast topic. However, I have tried my best to give you, fellow Rubyists, enough knowledge about this topic to know which questions you should ask, which is, in the end, much more important than the code itself. Now go forth and hash, encrypt, and decrypt, and, above all, have fun doing it!

I hope you found this article valuable. Feel free to ask questions and give feedback in the comments section of this post. Thanks!


Ruby 1.9.2-p290 Released: The Lowdown on Ruby’s Latest Production Release

Over at the always-riveting official Ruby blog, Shota Fukumori has announced the release of Ruby 1.9.2-p290, the latest ‘patchlevel’ release of the current production release of Ruby MRI.

Patchlevel 290 is the first production-level patchlevel release of MRI since patchlevel 180 back in February so it’s worth upgrading if you’re on 1.9.2. The release post duly notes:

This release doesn’t include any security fixes, but many bugs are fixed in this release.

Shota Fukumori

So what changed? And how can you upgrade? Let me spill the beans.

What’s Changed From p180 to p290?

Quite a lot, in terms of the raw numbers: 132 files were tweaked, with a total of 3505 lines added and 788 taken away.

A selection of the fixes:

  • require 'date'; Date.new === nil throws an undefined method error for coerce on p180 – this has now been fixed
  • The Thread.kill segfaults when the object to be killed isn’t a thread bug has been resolved.
  • Tweaks to reduce segmentation faults when using zlib on x86-64 Darwin (OS X) – always good
  • Modification to prevent random number sequence repetition on forked child processes in SecureRandom
  • Fix to io system to resolve a Windows-only bug where characters are being read incorrectly due to ASCII not being treated as 7 bit
  • A tweak to Psych (the YAML parser) to plug a memory leak
  • Load paths are now always expanded by rb_get_expanded_load_path (I think this might yield a performance gain?)
  • Fixes to Psych’s treatment and testing of string taint
  • Prevention of temporary objects being garbage collected in some cases
  • Fixes to resolve compilation problems with Visual C++ 2010
  • A fix so that Tk’s extconf.rb would run successfully
  • Lots of Tk related fixes generally – I’m guessing Tk is very popular amongst the core team, particularly in Japan, because it seems to be a common release blocker.
  • A fix to string parsing to resolve an obscure symbol-containing-newlines parsing bug

How To Upgrade to Ruby 1.9.2-p290

If you’re on Windows, RubyInstaller 1.9.2-p290 has been released.

If you compile your own version of Ruby, just grab one of the archive files listed in the official post and do your usual compilation shuffle. Nothing new there.

If you’re an RVM user, you’ll be glad to know the RVM team were on the ball and released an update within hours to support p290. Your upgrade steps are:

rvm get head
rvm reload

At this stage, you can either run rvm install ruby-1.9.2-p290 to install p290 from scratch, or if you’re already running p180 and wish to upgrade your existing environment, run rvm upgrade ruby-1.9.2-p180 ruby-1.9.2-p290 and you’re cooking with gas.

Some users have noticed that running rvm upgrade as above produced an error where the wrong RVM executable was being run, but it seems to resolve itself if you open a fresh shell (despite running rvm reload), so try that if you hit the “Unable to install” error.

13 New Ruby and Rails Jobs for July 2011

The Ruby and Rails job scene continues to grow through 2011 and we’ve got *drumroll* 13 (lucky for some) jobs to share from the Ruby Jobs board from companies like Simon & Schuster, AlphaSights and CustomInk. They’re all across the US with a couple in the UK for good measure.

To promote a job, see the Post A Job page. A bonus is your ad gets into the 6463 subscriber Ruby Weekly for free and our 5837-follower-strong @rubyinside Twitter account.

Braintree Seeks Internal Rails Developer – Chicago, IL

Braintree, a popular payment gateway provider and long-term user of Rails, is looking for an exceptional Rails developer. You’ll need at least one year of experience with Rails, TDD/BDD and working on Unix-like platforms (e.g. Linux). Git experience would be a plus too — click here to learn more.

Test Driven Ruby and Javascript Developer – SF and LA, California

Carbon Five builds web and mobile products for startups, institutional companies and non-profit organizations using a finely tuned agile process with cutting edge tools and technology. Join a team of seasoned pros in a highly-collaborative environment and work on a new project every few months — click here to learn more.

Ruby on Rails Developer – London, United Kingdom

AlphaSights is looking for a London-based Ruby on Rails Developer with a passion for Ruby and Rails to work in a small team at their Covent Garden offices. No commercial Rails experience is necessary but you need to be highly motivated — click here to learn more.

Ruby Programmer – Nutley, New Jersey

Xquizit is looking for a full-time Ruby and Rails developer with a MySQL background. The firm is upgrading technology from .Net to Rails. A salary of $125k p.a. is on offer — click here to learn more.

Ruby on Rails Developer for Major Publisher – New York, New York

Simon & Schuster is looking for a Ruby on Rails Developer to work in their Scrum/XP team in midtown Manhattan. Experience with JavaScript is necessary and experience with Redis, Objective C and C# are plusses — click here to learn more.

Rails Software Engineer – San Mateo, California

Coupa Software is the leading provider of cloud spend management solutions and they’re looking for a Rails Software Engineer to work with an agile, collaborative, and collegial team of accomplished software engineers, designers and business owners to develop and evolve the Coupa platform and applications — click here to learn more.

Ruby on Rails / Postgres Database Developer – Los Angeles, California

Oblong Industries is a tech company working on cutting edge ‘spatial’ interface systems (think Minority Report). They’re looking for a Ruby on Rails / Postgres Database Developer to work on their main product’s Web-facing SDK — click here to learn more.

Ruby on Rails Developer – Canton, Connecticut

Sports Technologies is a successful startup working on fantasy games and products for some of the biggest names in sports. They want a Ruby on Rails Developer to join their team and are offering $60-100k — click here to learn more.

Web Developer – Falls Church, Virginia

Demosphere International, Inc. is looking to fill an immediate full-time position for a Web Developer. Demosphere specializes in online solutions for youth, amateur, and professional sports organizations — click here to learn more.

Rails Developer for Consumer Web Apps – Silver Spring, Maryland

Webs is the leader in building websites for micro and small businesses, providing an innovative platform, tools, and applications that members can use to easily build professional websites. They’re looking for a Rails Developer — click here to learn more.

Rails Developer – Henley-on-Thames, United Kingdom

Changework Now is an award-winning online recruitment solution provider looking for an enthusiastic Rails developer with over 1 year of experience of Rails, version control, and relational database design. They’re based in beautiful Henley, world famous for its rowing races and regattas — click here to learn more.

Sr. Ruby on Rails Developer – Fairfax, Virginia

CustomInk is an innovative custom product company offering hundreds of t-shirt styles and colors, golf balls, towels, and more for users to customize. They’re looking for a senior Ruby on Rails Developer to work on their critical customer-facing webapp — click here to learn more.

Ruby on Rails Developer – Boulder, Colorado

Foraker Labs is a Colorado-based Web and mobile app development consultancy with a serious eye for design and exciting apps. They’re looking for someone with a passion for programming and beautiful code. You’ll need significant Ruby, Rails, database design and TDD experience and it’s a full-time, on-site position in their Boulder, Colorado office — click here to learn more.

And that’s it! Want to keep up with Ruby jobs long-term? Check out the Rails Jobs Board – there are new jobs each week.

Old Testing Interviews

Back in January 2009, I did a bunch of interviews with prominent Rubyists about their test practices. The interviews vanished when I moved the site to WordPress, but I still get hits from a link to the interviews, and I thought it would be useful to get them all in one place.

Remember, this was 2009, and I’m sure everybody’s habits have changed since then. Other than putting them all together in one post, I haven’t edited these at all.

Noel Rappin

How did you get into writing tests regularly? Did you have a specific moment when you realized automated testing was valuable?

I started with automated testing very shortly after reading Kent Beck’s original XP book. The XP book came out at a time when I was very receptive to the ideas—I had just come off a project that had suffered from a lot of regressions and all the other kinds of pain that XP promised.

It didn’t take long for me to see how much better my code was when I did a lot of testing… what was more surprising was how much more fun writing code test-first turned out to be. The quick feedback and the ability to clean up code with confidence turned out to be really satisfying.

What is your Rails testing process? What kinds of tests do you write, and what tools do you use to write them?

When I’m adding a new feature, I tend to start with a skeleton controller test that validates that the controller sets the right variables. I don’t put much code in the controller, so then I move to testing the model for the new functionality. If the view logic seems to require it, then I’ll add view tests after I get the view in place. I go back and forth between test and code pretty quickly, but sometimes the code will get ahead of the tests, especially when doing view layer stuff. I don’t use integration tests very much at the moment.

I seem to have moved back into the core Rails test features recently, although I still use Shoulda for contexts and for the additional assertions. My most recent project used core Rails plus Matchy. I’ve also been using the various factory replacements for fixtures, which I like quite a bit.

What’s the most interesting thing you’ve discovered about testing recently?

The biggest change I’ve made recently is using factories for generating test data, which makes the tests much more readable and stable by keeping the setup closer to the actual test.
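The factory idea can be sketched in a few lines of plain Ruby. This is a hand-rolled illustration of the pattern, not the API of factory_girl or any real library: defaults live in one place, and each test overrides only the attributes it actually cares about, which is what keeps the setup close to the test.

```ruby
# A hand-rolled sketch of the factory pattern (not the factory_girl API).
# Defaults live in one place; tests override only what they care about.
User = Struct.new(:name, :email, :admin, keyword_init: true)

FACTORY_DEFAULTS = {
  user: { name: "Test User", email: "test@example.com", admin: false }
}

def build(factory_name, overrides = {})
  klass = Object.const_get(factory_name.to_s.capitalize)
  klass.new(**FACTORY_DEFAULTS.fetch(factory_name).merge(overrides))
end

# The test reads as "given an admin user..." -- the relevant attribute is
# right here in the test, and the irrelevant ones come from the defaults.
admin = build(:user, admin: true)
admin.admin  #=> true
admin.email  #=> "test@example.com" (filled in by the default)
```

Compared to fixtures, the setup that matters to a given test sits in the test itself rather than in a distant YAML file.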

Is there a tool you wish you had for testing that you don’t think currently exists?

I wish I had a really good way of validating view logic, none of the ones I’ve tried have been completely satisfying. It’d also be nice to have more sophisticated coverage reports. Of course, these things might actually be impossible…

What advice would you give somebody looking to write more effective tests?

Automated testing is much easier and more valuable if you keep a tight feedback loop between your tests and your code.

Geoffrey Grosenbach

First up on the Testing Practices Interview series is Geoffrey Grosenbach, Senior Visionary of Topfunky, and also the person behind PeepCode. Geoffrey blogs at http://nubyonrails.com/, and is responsible for the gruff graph generator and, most recently, a task tracker based on David Seah’s Online CEO. He was also kind enough to write his responses in Textile, which I heartily endorse. Take it away…

How did you get into writing tests regularly? Did you have a specific moment when you realized automated testing was valuable?

I remember watching tests run during installation back when I was using Perl (wouldn’t it be nice if RubyGems optionally ran their test suites during installation?). But it seemed like an advanced topic that only some programmers did.

When I started using Ruby and met the testing fanatics at Seattle.rb, I started to understand what it was about and why I might want to write a test.

Three things helped me get started with test-driven development:

  • Watching other people do it.
  • Writing a graphics library (gruff). I needed to generate a bunch of samples with various data inputs and Test::Unit was a great way to do it.
  • Finally, I started out by writing a single test for an existing Rails app whenever I encountered a bug. It gave me peace of mind.

Since then, I’ve appreciated the process of thinking that I get into when I code test-first.

What is your Rails testing process? What kinds of tests do you write, and what tools do you use to write them?

I’ve tried several libraries and tools.

I started with Test::Unit and still use it on some existing projects.

For a while I used Jay Fields’ Unit-Record style of separating out unit and functional tests for both models and controllers (see also). It also provided a nice test method that took a string and a block of assertions (similar to what’s in Rails now).

I’ve used RSpec’s ability to add should syntax into Test::Unit.

I’m currently happy with straightforward RSpec. I also have some Test::Unit integration tests in my Rails apps but have also used RSpec User Stories and their replacement, Cucumber. Given the fact that I’m both the designer and coder of most of my apps, I don’t get much benefit from Cucumber. But I can see how it would be useful to people working for a semi-technical client.

Honestly, it’s a bit overwhelming with all the options out there!

When I’m actually coding, I’ll use Eric Hodel’s ubiquitous autotest or rstakeout to run my test suite.

What’s the most interesting thing you’ve discovered about testing recently?

I’ve been working on a small command-line Objective-C app and am experimenting with using Ruby to run a suite against the command-line app to check the inputs and outputs. Ruby is useful that way and works better for me than trying to use Objective-C for the same purpose.

Is there a tool you wish you had for testing, but which you don’t think currently exists?

I can’t think of one. Usually I end up writing a tool if I need it and I can’t find it anywhere, such as test_benchmark to show individual test runtimes for Test::Unit. Someone recently imported it to GitHub as well.

What advice would you give somebody looking to write more effective tests?

Find a mentor. Work with someone else who is doing it. Concentrate on testing the effects of your code, not the way they are implemented. It’s easy to write tests that really don’t do anything and won’t reveal meaningful changes in the code if it stops behaving properly.
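Geoffrey’s point about testing effects rather than implementation can be made concrete with a small sketch (a hypothetical ShoppingCart class, not from any real codebase): assert on what callers observe, not on internal state.

```ruby
# Hypothetical class illustrating "test the effect, not the implementation".
class ShoppingCart
  def initialize
    @items = []   # internal detail -- tests shouldn't reach in here
  end

  def add(name, price)
    @items << [name, price]
  end

  def total
    @items.sum { |_, price| price }
  end
end

cart = ShoppingCart.new
cart.add("book", 12)
cart.add("pen", 3)

# Effect-based check: the total is what a caller actually observes.
raise "total is wrong" unless cart.total == 15
```

The assertion on `total` survives a refactoring that swaps the internal Array for a Hash; a test that peeked at `@items` would break even though the behavior stayed correct, which is exactly the kind of test that "won’t reveal meaningful changes."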

Gregg Pollack

Next up in the Testing Practices interview series is Gregg Pollack. Gregg is one of the proprietors of Rails Envy, and is one of the co-hosts of their Rails Envy Podcast, which mentioned Rails Prescriptions today—thanks Gregg!

Gregg is also one of the founding members of the Rails Activists. He’s also done a lot of video production, including some excellent Ruby and Rails screencasts, and his series condensing various Ruby and Rails conferences. Put that all together, and he seems kind of busy, actually…

Take it away, Gregg…

How did you get into writing tests regularly? Did you have a specific moment when you realized automated testing was valuable?

I have to admit, my first 3 Rails projects didn’t contain any tests. I got into testing when RSpec started on the scene. RSpec just made more sense and the documentation on the RSpec website was really useful. Once I figured out how to use RSpec with autotest and growl, testing became much more fun.

I also ended up doing a talk at my local users group about how I came to love testing. It’s a little dated, but still quite relevant:

http://www.railsenvy.com/2007/10/4/how-i-learned-to-love-testing-presentation

What is your Rails testing process? What kinds of tests do you write, and what tools do you use to write them?

I used to write isolated tests at the model and controller level using RSpec, properly mocking and stubbing at the controller level. Oftentimes, integration tests would get ignored.

Currently I’m working on a project using Shoulda for model tests and integration tests with Webrat. Yes, we’re not doing controller or view tests. So far I’m really liking the combination, and it seems to cover things pretty well without having to deal with much mocking and stubbing of controller/view tests, which have very little logic anyway.

What’s the most interesting thing you’ve discovered about testing recently?

Webrat rocks for integration tests.

Is there a tool you wish you had for testing, but which you don’t think currently exists?

Hmmm… James Golick recently blogged about one-line tests using a library called Zebra. He also argues that test names are essentially comments, and, well… too many comments are a code smell.

So what’s the alternative?

Creating a DSL that allows us to write tests in a way that doesn’t need names/comments. Shoulda gets us pretty close with all of its one-line helpers like:

should_belong_to :user
should_have_many :tags, :through => :taggings
should_require_unique_attributes :title
should_require_attributes
should_only_allow_numeric_values_for :user_id

None of these tests need comments to figure out what they’re testing! I’d love to figure out a way to do more of this, and the Zebra library James put out is certainly a step in the right direction.

What advice would you give somebody looking to write more effective tests?

Integration tests are probably more important than anything else, and using a library like Webrat makes them very easy to do.

If you’re working with a team of people and you want to ensure you’re building a solid test library, then a Continuous Integration server is imperative. Set it to run all your tests every time something is checked in, and to email everyone if a build fails. Make a rule that whoever checks in code that causes tests to fail has to buy everyone a round of beer, or has to perform some humiliating task. Checked-in code that fails should be fixed immediately.

Ryan Bates

Next up on the testing interviews is Ryan Bates. Ryan runs Railscasts, a weekly screencast on a new Rails topic that is simply one of the best ongoing sources for tutorials about Rails. Seriously, if you aren’t familiar with it, drop everything and prepare to spend some time watching his videos. Ryan has also done two screencast series for Pragmatic, Everyday Active Record and Mastering Rails Forms, both of which are available at pragmatic.tv.

On a related topic, I didn’t introduce myself to Ryan at last year’s RailsConf, even though I walked past him a few times. This is because every time I walked by him, people were coming up to him and thanking him for all the great Railscasts. So, thanks Ryan.

How did you get into writing tests regularly? Did you have a specific moment when you realized automated testing was valuable?

About 6 years ago I read the book Extreme Programming by Kent Beck. This sparked my interest in testing (specifically TDD), but I could not find many practical examples of the practice. I spent some time researching the topic but honestly did not “get” it until Rails came along. Rails provided practical testing patterns which were fairly easy to follow.

I find the most difficult part of testing is coming up with a pattern which works well for a specific situation. Once that is done, adding similar tests becomes very easy, and that is where it starts to really pay off.

What is your Rails testing process? What kinds of tests do you write, and what tools do you use to write them?

I primarily use RSpec for testing, however my tests are quite different than what they recommend. For one thing, I use controller tests like functional and integration tests. That is, they execute the entire stack (including models and views). I don’t test views exclusively beyond this because I find that requires too many mocks and leads to brittle tests.

I test models and helpers exclusively (like unit tests). That is, I test each method on its own to ensure it functions properly. My theory is, the deeper a piece of code is, the more it is used by various parts of the application, and therefore it should have better test coverage.

Fixtures are kind of interesting. I don’t use them at all in unit tests, instead I prefer factories (factory_girl) for this. However I do use fixtures in controller tests as filler data to help catch errors. Each fixture usually does not have more than two records, and I often leave them with their default generated content.

What’s the most interesting thing you’ve discovered about testing recently?

I just recently discovered the RR mocking framework and I like its syntax more than Mocha. However I haven’t moved many projects over to it yet, so I can’t say how well it works in real world use.

Is there a tool you wish you had for testing, but which you don’t think currently exists?

Perhaps a tool for testing private methods and accessing instance variables in a clean way. I know in theory tests shouldn’t need to do this, but I would still find it useful. There may already be one out there, I haven’t looked much.

Beyond this I would love to see more documentation, examples and experiments done with regard to testing. I still feel it is a fairly unfamiliar territory in the programming world, and everyone seems to have their own way of doing things.

What advice would you give somebody looking to write more effective tests?

Be careful with mocks and stubs. They are often an easy, immediate solution but can lead to brittle or deceiving tests. Only use them if you can find no other cleaner way to test something.

Overall, don’t give up on testing if you don’t grasp it right away. Try a technique for a while, if it doesn’t work with your flow, try something else. Don’t feel bad if you find testing is hard – it really is. But it is so worth it.

James Golick

On to the interview. James Golick is the founder of GiraffeSoft, a boutique Rails consulting firm out of Montreal. He also maintains the James on Software blog. Within the Rails community, he’s the author of resource_controller, a common parent for RESTful controllers, and active_presenter, which allows you to create presenter objects out of aggregations of ActiveRecord models.

Most recently, James has created zebra, a test library for the quick creation of single line tests.

How did you get into writing tests regularly? Did you have a specific moment when you realized automated testing was valuable?

I worked at a few really cheaply run companies, where I was the one man technology department. I didn’t sleep much in those days.

In those days, I was always desperately looking for ways to get better at what I was doing, if only to get the occasional extra hour of sleep. At one point, I came across an article about automated testing and TDD, and it all just made so much sense to me. When you’re that strapped for time, automation is a necessity. Once I was introduced to the idea that tests could be automated, too, I jumped all over it.

What is your Rails testing process? What kinds of tests do you write, and what tools do you use to write them?

At giraffesoft, we’re very serious about TDD. So, naturally, everything we do is test-first.

Currently, we write extensive unit tests and functional tests. We’re slowly adding cucumber to the mix. I’m definitely sold on the benefits of acceptance test driven development. So, that’s where we’re going.

We use Shoulda, largely because of its macros. Duplication in tests becomes incredibly tedious. So, expressing certain kinds of repetitive tests as one-liners is a huge win.

However, as I mentioned in the release announcement for zebra, we’ve started to feel the burn with Shoulda’s macros in certain situations. Often, test failures result in completely useless backtraces and the string programming catches up to you after a while.

So, we’re currently moving towards a context & zebra stack to replace Shoulda.

What’s the most interesting thing you’ve discovered about testing recently?

Using Jay Fields’ expectations testing framework completely changed the way that I approach unit tests. Not having to describe each test in English is incredibly liberating.

If your code is readable, you shouldn’t need to document it at a micro level, except in special cases. Your code probably isn’t littered with comments describing every couple of lines. So, why should your tests be?
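The expectations style James describes can be suggested with a toy reimplementation in plain Ruby. This is a sketch of the idea only, not the real gem’s API: the expected value plus the block stand in for an English test name.

```ruby
# Toy sketch of the "expectations" idea (not Jay Fields' actual gem):
# the expected value and the block together describe the test, so no
# English name is needed.
def expect(expected, &block)
  block.call == expected
end

# Each line reads as a bare, self-describing fact:
expect(4) { 2 + 2 }            #=> true
expect("FOO") { "foo".upcase } #=> true
```

A real framework would collect these, report failures with a diff of expected vs. actual, and so on; the point here is just how little prose the form requires.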

Is there a tool you wish you had for testing, but which you don’t think currently exists?

If you’d asked me 2 weeks ago, I’d have replied, something that would allow me to write more expectations-like tests in my every day hacking. That’s what zebra is. So, now I’m feeling pretty happy about my toolset and where things are headed.

What advice would you give somebody looking to write more effective tests?

Be pragmatic. I get a lot of questions about the “proper” way to mock or stub something, for example. Stop worrying about getting things “right” and try to make judgements that get you the best possible test, while striking a balance between productivity and the longevity of the test.

Additionally, I’d really encourage people to use expectations for a small project – a rails plugin or something. For various reasons, you probably won’t want to use it for every day stuff, but you might learn a lot from giving it a serious look.

Chad Fowler

I’m very excited to have Chad Fowler as the latest participant in the Testing Practice interview series.

Chad Fowler is an internationally known software developer, trainer, manager, speaker, and musician. Over the past decade he has worked with some of the world’s largest companies and most admired software developers. He loves to program computers and, as part of his role as CTO of InfoEther, Inc., spends much of his time solving hard problems for customers in the Ruby language. He is co-organizer of RubyConf, and RailsConf and author or co-author of a number of popular software books.

And so, let’s hear from Chad…

How did you get into writing tests regularly? Did you have a specific moment when you realized automated testing was valuable?

For me it was when I was exposed to TDD for the first time. I had been practicing the typical “guru reads the output” style of testing, primarily in Java at the time. That means every class typically had a main() method at the bottom which would exercise my code and print a bunch of hardly decipherable junk to the screen. As I was developing, I knew what that junk meant. Nobody else ever did. Two days later, neither did I. But the main() methods remained. Because, hey, those were the tests.

I think the way most people worked in an environment like that is that if they needed to change anything in the production code, they would hack the existing “tests” in the main() method such that they could understand what they meant (since this stuff was embedded in production code, you didn’t typically venture out of main() for fear of polluting the namespace). So it was more of a scratchpad than a test.

At the same time, I was doing a lot of mentoring of junior developers on how to do object oriented programming. Specifically, I was trying to help a large group of developers stop generating unmaintainable spaghetti Java. I started using a technique that I call Error Driven Development. It was very much like TDD, which I had not yet heard of. You start in the area of the application you’re trying to implement. Maybe it’s a controller in an MVC setup, for example. And you type in the code you wish already existed as an API. You express your intent as clearly and succinctly as possible within the scope of what would be technically possible in this magical but not yet existing API (paying attention to which data needs to be passed around as parameters, which imaginary objects would best play the role of owning which function, etc.). Then you try to compile and/or run your code and follow the error stream until everything works.

With this kind of development, you get to always work at the level of abstraction you’re interested in while you’re developing. When I’m in a controller, I don’t care about SQL. When I’m in a view, I don’t care how business logic is implemented. To make everything run, of course, I have to walk down through the layers of abstraction and repeat this process: imagining the perfect API and just using it. Eventually I’m done. It’s motivating and makes better code.

So, the question was about testing. This Error Driven Development technique was just a lame version of how good TDD works. TDD gives you the extra advantage of a nice framework with assertions and reports. So the first time I saw TDD, I was hooked. It was a better version of what I’d been working toward in my quest to write and help others write better code. It just happened to also create tests as a side effect.

What is your Rails testing process? What kinds of tests do you write, and what tools do you use to write them?

I use whatever testing framework and tools the team I’m working with uses. I’ve done everything from out-of-the-box test/unit to RSpec. Given a choice these days, I’d probably depend on Shoulda and Mocha. RR is really interesting as well but I haven’t fully switched to it. I suspect I might replace Mocha with it at some point in the near future.

I typically start by writing model tests. They’re called unit tests in Rails, but they’re rarely isolated unit tests. Ideally, as much logic as possible lives in the models, so I spend a lot of time here. By far more than anywhere else. I do everything test-driven unless I’m hacking something together quickly. Even then, I usually reach a point where I wish I was doing things test-driven and switch.

Ideally, your controllers are going to be tiny. They’re also likely to be composed of calls to objects and methods from your model layer. So, though I do functional (controller) tests, I try to minimize the need for them. If you’re already testing a method in a model, you don’t need to duplicate that test in a controller if the controller is simply calling the model. You do need to make sure the controller is doing its job well. That’s where things like mocks come in. I use mocks not so much to avoid calling code in the dependencies an object has (such as the classic credit card processor example) but more to allow me to specify a process in terms of what the code I’m testing is supposed to do vs. how it does it.
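Chad’s "what vs. how" distinction can be sketched with a hand-rolled recording mock (illustrative only; Mocha’s real syntax is along the lines of `mailer.expects(:deliver_welcome).with(user)`). The test records *what* the code under test asks of its collaborator, without caring *how* the collaborator does it.

```ruby
# Hand-rolled recording mock (illustrative; not Mocha's API). It accepts
# any message and records it, so the test can assert on the interaction.
class RecordingMock
  attr_reader :calls

  def initialize
    @calls = []
  end

  def method_missing(name, *args)
    @calls << [name, args]
    nil
  end

  def respond_to_missing?(*)
    true
  end
end

# Hypothetical code under test: "signing up sends a welcome" is the
# behavior we care about, not how the mailer delivers it.
def notify_signup(user, mailer)
  mailer.deliver_welcome(user)
end

mailer = RecordingMock.new
notify_signup("alice", mailer)
mailer.calls  #=> [[:deliver_welcome, ["alice"]]]
```

The assertion on `calls` specifies the process in terms of what the code is supposed to do; the real mailer’s implementation never runs.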

Ultimately, it’s a good idea for someone to actually write real automated tests. These aren’t the ones I do unless I’m playing the role of tester. For example, Selenium is an excellent tool for really testing a Rails app. It’s great to have automated tests at that level that run with a continuous build and so on. I don’t usually actually do that so it’s not part of my process per se, but I’d always advocate that it be part of a team’s process. I don’t get much value out of trying to do selenium-first development, though, for example.

What’s the most interesting thing you’ve discovered about testing recently?

I wouldn’t call it a discovery, but I’m starting to change a long held opinion about the value of tests. I used to think of them as executable requirements specifications. I know a lot of people do. I’m from the school of TDD that blossomed into what’s now called BDD. BDD people even tend to use the word “specify” when they talk about tests they write up front.

That always sounded like a great idea. Requirements docs you can execute for validity. In my experience it rarely works out that way, though. Tests don’t end up being readable like English. RSpec and especially Cucumber are a step in that direction, but I’m starting to believe that ultimately developer tests should read like good code. Maybe a little different from normal good code, but they are code after all. And they’re not for validating the production code you write. They’re for motivating the production code you write.

So maybe the goals of executable requirements documentation and motivational specs are at odds with each other. And if the tests do the job of driving you to make well designed code but don’t really read like requirements documentation afterward, that’s nothing to feel bad about.

Maybe it’s even OK to throw away the tests eventually in the same way we used to throw away the code in our main()-method scratch pads.

I’m pretty sure I’m exaggerating, but that’s the general idea.

Is there a tool you wish you had for testing, but which you don’t think currently exists?

Not really. Marcel Molina and I used to talk about how we longed for something that would sit in the middle of our objects and their method invocations, such that you could set up how you expect an object to behave and then verify, without overriding the method’s behaviors, that the object did the things you asked. Like a partial mock which doesn’t change the method implementations it’s setting expectations for.

RR now does this with its proxy implementation. Now that it exists, though I like the feature, it doesn’t feel like it filled a big hole. Go figure.

What advice would you give somebody looking to write more effective tests?

Assume that writing more effective tests means writing better code. Developer testing (which is what I do) isn’t primarily about verifying code. It’s about making great code. If you can’t test something, it might be your testing skills failing you, but it’s probably your code’s design. Testable code is almost always better code. Code written as a result of a series of failing tests is very likely (by definition) testable. (An example of a design choice in Ruby that might result from this sort of approach is to use more mixins and less inheritance.)

That being said, it’s easy to fall into a trap when you start coding this way. I often come across heavily TDD’d code with a huge focus on some trivial but easily tested feature. For example, you might start writing tests for the Ruby comparison method (<=>) or to_s on an object, simply because you know how to write tests for it. I’ve seen programmers spend a disproportionate amount of time testing and implementing ancillary features because they get the TDD (or BDD) bug and find a comfort zone to hang out in.

Instead, always focus on testing the core of your domain. Other tests are nice to have, but when you focus on the core of your domain (the features that define the product you’re implementing), you drive that domain model forward and avoid spinning your wheels on testing for the sake of testing. Kent Beck used to say “test everything that could possibly break”. That’s good advice, but I’d add “and actually matters”.

Filed under: Testing Interview

July 15, 2011: Stale Links

The problem with sitting on these daily link posts is that the links go out of date. Sigh. Here are some links.

Twitter

I found a couple of things odd about this InfoQ article on Twitter’s infrastructure. I was expecting it to be a bit more of a Rails hit-piece, frankly, so it was nice to see a quote like this one from Evan Weaver:

I wouldn’t say that Rails has served us poorly in any way, it’s just that we outgrew it very quickly.

Twitter has unique needs, so it’s not surprising that the Rails stack doesn’t serve them anymore, but they did get pretty far with the Rails stack.

This was interesting – first, from Charles Humble, writing the article:

You might assume that the move to the JVM was largely driven by performance and scalability concerns, but in fact the existing Twitter codebase performs well… Rather, the move to the JVM is driven as much by a need for better developer productivity as it is for better performance

And this from Weaver:

As we move into a light-weight Service Oriented Architecture model, static typing becomes a genuine productivity boon.

The author concludes the article with this:

[Rails] does however come with well known costs, both in terms of performance and scalability, and perhaps also the relative maturity of the libraries and tool chain. In addition, the experience at Twitter suggests that the Ruby on Rails stack can produce some significant architectural challenges as the code base grows.

Which strikes me as an overgeneralization of what Weaver said – I’m almost willing to believe that static typing is a benefit if you are doing SOA at Twitter’s scale, but I haven’t seen the benefit on smaller projects in my experience.

Amazon

As somebody who got their Amazon Affiliate account zapped when Amazon pulled the rug out from under Illinois residents, I was following with some interest the similar news out of California. (The situations weren’t quite identical; I had several months’ notice.)

There’s been a little bit of confusion on what actually happened – a lot of people seem to think California is trying to tax affiliate revenue (I can’t find the link, but I saw someone argue that their affiliate revenue was already being taxed so shouldn’t be taxed again, which is wrong in a couple of different ways.) Slate magazine has a decent overview, which I’ll note I basically agree with on the substance of the issue.

Current law is that online transactions are only subject to sales tax if the company involved has a physical presence in the state. The California law defines “presence” to include any affiliates who get a payout – the affiliate revenue isn’t taxed as such, but the existence of an affiliate means that other Amazon transactions in California would be subject to California sales tax. Amazon responded to the law by canceling all their affiliates in California, as they did in Illinois, to avoid having to charge sales tax, and also to avoid having to calculate and manage sales tax, and also to avoid a court case that they might well lose.

Anyway, you may agree or disagree with the California law – though it doesn’t seem inherently any less silly than the various state laws that impose income taxes on visiting professional athletes. For my part, I don’t understand why the fact that Amazon has put themselves in a position where paying sales taxes kills their business model should be my problem – I understand why there was an initial push not to charge sales tax on the internet, but I think the social benefit of privileging online sales has probably passed. Even if you don’t agree with that argument, though, it’s hard for me to see how Amazon using their affiliates as pawns is the best or most responsible way for them to be advocating their case.

New additions to the workflow

I’ve got a couple of new writing workflow things to mention. There’s a new app in the Mac App Store called Marked, which is a classic “One Thing Well” deal. For $2.99, it’s basically a Markdown preview window, but has the very useful feature that it will live-update a file you are editing every time you save. So it’s basically adding MarsEdit’s preview window to any editor. It also makes it easy to copy the resulting HTML into the clipboard if you, say, want to post it to WordPress. It also lets you change the Markdown processor if you’d like. It’s nicely handy for $2.99.

On the iPad side, WriteRoom has finally been updated to a universal app. It’s effectively PlainText Pro – the same basic (pretty) layout with a couple of extra features. It’s got an easy-to-configure extra keyboard row and a couple of other handy features. My main negative is that, when the app is in landscape mode, it doesn’t use all the horizontal space for text; an option to do so would be useful. One thing I like about it, relative to other editors, is that it live-syncs with Dropbox, giving much more of a feel of directly editing the Dropbox file than the editors that make you download the file locally and manually sync. Overall, though, I like it.

I also tried out an iPad app called Daedalus, which has a very interesting UI metaphor but doesn’t really fit with the way I manage files. If you are willing to do all your notes and writing in it, though, the organization looks like it might be handy.

RubyMine 3.2

Quick mention that RubyMine 3.2 is out, with support for Rails 3.1 features like the asset pipeline and CoffeeScript. Mostly, I’m having some stability problems with it (it tends to freeze up for me), but the editor and its interaction with Rails continues to get better.

Avdi on Law of Demeter

Finally, speaking of things I thought I was going to disagree with, but wound up agreeing with almost completely (and also speaking of week-old links…), here’s Avdi Grimm on the Law of Demeter. Avdi comes down on the side of actually useful guidelines for managing chains of method calls.

Filed under: Amazon, iPad, Ruby, RubyMine, Twitter, Uncategorized