Release of new database plans on August 1st

We are happy to announce that our new line-up of database plans is being released on August 1st. The dev, basic, crane, and kappa plans make many of the most exciting features of our fully managed database service available to a wider audience, and they are now ready for all users.

We will also begin billing for these plans as of August 1st. If you have been beta testing one of these databases and do not wish to incur charges for it going forward, please remove it immediately via the web interface or the command line:

heroku addons:remove HEROKU_POSTGRESQL_COLOR --app app_name

If you have been waiting to use these plans because they have been in beta, then your wait is (almost) over. They can be provisioned now by all users via the web interface or the command line:

heroku addons:add heroku-postgresql:[dev | basic | crane | kappa]

Starter Tier

The dev plan (free) brings many of the best features of our production database plans to development users. This includes Postgres 9.1, data clips, hstore schemaless SQL, direct psql access, support for most pg commands from the Heroku client, a web interface, support for multiple databases connected to a single application, and Continuous Protection. The dev databases are limited to 10,000 total rows. For users who need to store more data, the basic plan ($9 / month) raises the row limit to 10 million rows but does not increase availability or add any additional features.

The dev and basic plans both belong to our Starter tier. These plans are designed to provide 99.5% availability and are ideal for trial, development, testing, and other basic usage. For serious production applications, we recommend using one of our Production plans, designed for 99.95% availability. Please note that these are design parameters, not SLAs, and availability can be further increased by taking advantage of followers.

Production Tier

With this release, the Production tier is expanded to include crane ($50 / month) and kappa ($100 / month). These will also be released on August 1st. They offer all of the same features as our other production databases at an incredible price point. These benefits include production-grade monitoring and operations, as well as support for fork, follow, auto backups, and fast changeovers / upgrades.

Migration From Legacy Shared Databases

For those users still on the legacy shared-database plan, we encourage you to upgrade as soon as possible. We will be announcing a deprecation and migration schedule for these plans shortly, and we will work to migrate all users onto the new plans, but moving sooner lets you enjoy these improvements right away. You can also opt in to creating a dev database by default for all new applications by enabling the Heroku Labs flag.

If for any reason the scheduled release of these plans causes hardship for your business, please open a support ticket so that we can individually address your needs.

#368 MiniProfiler

MiniProfiler allows you to see the speed of a request conveniently on the page. It also shows the SQL queries performed and allows you to profile a specific block of code.

Minimal I18n with Rails 3.2

This guest post is by Fabio Akita, also known as akitaonrails. He is a well-known Brazilian Ruby activist and has been the program chairman of Rubyconf Brazil for the last 5 years, through Rubyconf Brazil 2012. He also co-founded Codeminer 42, a software boutique specializing in outsourced work from fledgling startups that need great Rails developers. Fabio has been publicly evangelizing Ruby, Rails, and agile techniques since 2006 and has spoken around 100 times at conferences around the globe.

Fabio Akita If you don't know me, I'm a native of Brazil, where we speak Brazilian Portuguese. If you're from outside the USA, you likely bump into the same issues I do when writing apps meant to reach a worldwide audience: internationalization and localization. The problem is that most developers are careless about this and start writing code with English and Portuguese all mixed up. When the time comes to explicitly support both languages, we have to intervene deep in the code to extract all the language-specific bits into manageable structures.

Even though both Ruby and Ruby on Rails have seen lots of improvements in this regard, many developers are still uncertain about how to properly use those features. In particular, when talking about multi-cultural apps, there is more to it than just translating strings. Bear in mind that there is both Localization (L10n) and Internationalization (I18n). I won't go too deep into L10n, but if you're building the next multi-cultural app, keep it in mind.

I've posted all the code I'll use in this article to my GitHub account; you can check it out here, and you can also see a live version on my free Heroku account here.

Let’s start with the basics:

Database and string encoding

I don't intend to repeat all that has been discussed in the past about encodings, Unicode, UTF-8, and everything that is now properly and fully supported in Ruby 1.9. If you didn't follow that discussion, I highly recommend starting with Yehuda Katz's great articles:

If you're from a country that has English as its natural language, keep in mind one thing about Unicode and Latin1 encodings – from Wikipedia:

To allow backward compatibility, the 128 ASCII and 256 ISO-8859-1 (Latin 1) characters are assigned Unicode/UCS code points that are the same as their codes in the earlier standards. Therefore, ASCII can be considered a 7-bit encoding scheme for a very small subset of Unicode/UCS, and, conversely, the UTF-8 encoding forms are binary-compatible with ASCII for code points below 128, meaning all ASCII is valid UTF-8. The other encoding forms resemble ASCII in how they represent the first 128 characters of Unicode, but use 16 or 32 bits per character, so they require conversion for compatibility (similarly UCS-2 is upwards compatible with UTF-16).

The bottom line is that if you forget to deal with UTF-8 and fall back to Latin1, you won't notice for a long time. Most modern systems (databases, text editors, etc.) already default to UTF-8, but some don't. First things first: make sure you're saving your source code files as UTF-8. Second: make sure your database was created with UTF-8 support. For example, if you create your Rails app databases using the standard rake db:create, it's safe to assume they use UTF-8, but if you create them manually with your database's command line tool, enforce UTF-8. On MySQL you must do:

    CREATE DATABASE dbname
      CHARACTER SET utf8
      COLLATE utf8_general_ci;

On PostgreSQL you must do:

    CREATE DATABASE dbname
      WITH OWNER "postgres"
      ENCODING 'UTF8'
      LC_COLLATE = 'en_US.UTF-8'
      LC_CTYPE = 'en_US.UTF-8';
  

Obviously, change dbname and postgres accordingly. Don't mix encodings! If you're dealing with text, make sure your code and the Ruby gems you depend on all use UTF-8. This was a much harder experience two years ago, but now that the community has committed to Ruby 1.9, you won't notice it most of the time.

Regarding the source code, even if you save your file as UTF-8 you have to take one extra precaution. If you're writing text in languages that need special characters, you must start your file with one of the following lines:

    # encoding: UTF-8
    # coding: UTF-8
    # -*- coding: UTF-8 -*-
    # -*- coding: utf-8 -*-
  

Choose one and use just one; they all work the same and instruct the Ruby interpreter to properly handle the special characters. Ruby will complain if you try to run source code with non-ASCII characters in it and no such comment.
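To see what the magic comment buys you, here is a quick sanity check. On Ruby 2.0 and later UTF-8 is already the default source encoding, so the comment is redundant there, but the string behavior is the same (the example word is my own):

```ruby
# encoding: UTF-8
# With the magic comment above (the default in Ruby 2.0+), string
# literals containing accented characters are valid UTF-8 strings.
word = "ação" # Portuguese for "action"

puts word.encoding # UTF-8
puts word.length   # 4 characters
puts word.bytesize # 6 bytes ("ç" and "ã" each take 2 bytes in UTF-8)
```

Note the difference between length (characters) and bytesize (bytes): in Latin1 they would be equal, which is exactly the kind of silent assumption that bites you later.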

But I might add that most of the time, in a Rails app, having to add one of these lines can be considered a "code smell". That's because you should have extracted non-English text into external i18n files, leaving your Ruby code free of language-specific text. So use it if you must, but in everyday programming you should extract those strings.
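To make the "extract your strings" advice concrete, here is a toy, framework-free sketch of the lookup that Rails' I18n performs for you (the real API is I18n.t backed by YAML locale files; the names and keys here are made up):

```ruby
# A miniature stand-in for I18n.t: all user-facing strings live in a
# locale-keyed structure, and the code itself only refers to keys.
TRANSLATIONS = {
  :en      => { greeting: "Hello, world!" },
  :"pt-BR" => { greeting: "Olá, mundo!" }
}

def translate(key, locale)
  TRANSLATIONS.fetch(locale).fetch(key)
end

puts translate(:greeting, :en)      # Hello, world!
puts translate(:greeting, :"pt-BR") # Olá, mundo!
```

The source file stays pure ASCII; only the locale data carries the accented characters.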

And an extra recommendation: people sometimes discuss whether we should write the code itself in our native languages or default to English. I strongly recommend defaulting to English for class names, method names, variable names, and even documentation in comments within the code. We live in a globalized world and the market has already defaulted to English, so keep the pseudo-patriotic discussions for other venues. In code, write in English. You never know when a foreigner might join your team, or when you will have to join a foreign team. Do not limit your code or yourself.

Starting a new Rails app

I'll assume you already know at least the very basics of bootstrapping a new Rails app. Official Rails support for I18n started in Rails 2.2, and the great Rails Guides have a very good introduction to the Rails Internationalization API. I'll assume you have read and understood it all, so as not to repeat what's already nicely explained there. The idea here is to expand on some of the points people still have a hard time with.

L10n-wise, you should start customizing your app by modifying config/application.rb (around line 28) to look something like the following snippet:

    # Set Time.zone default to the specified zone and make Active Record auto-convert to this zone.
    # Run "rake -D time" for a list of tasks for finding time zone names. Default is UTC.
    config.time_zone = 'Brasilia'

    # The default locale is :en and all translations from config/locales/*.rb,yml are auto loaded.
    # config.i18n.load_path += Dir[Rails.root.join('my', 'locales', '*.{rb,yml}').to_s]
    config.i18n.available_locales = [:en, :"pt-BR"]
    config.i18n.default_locale = :"pt-BR"

    # Configure the default encoding used in templates for Ruby 1.9.
    config.encoding = "utf-8"
  

Throughout this article, I'll use Brazilian Portuguese and Brazil as the example non-English language and culture; change it according to your country. Time zones are one point that always confuses everybody, but the bottom line is that your database should always record date and time in UTC (Greenwich, GMT+0). I live in the "Brasilia" time zone, which is GMT-3: while it's noon in Greenwich, it's 9 AM in Brazil. And I have an extra problem: my country is big enough to have three different time zones plus Daylight Saving Time. Rails' ActiveSupport already does a decent job of overriding what it must so that you can operate on dates and times regardless of their time zones, because all basic operations go through UTC.

Take this code (running within Rails console to have ActiveSupport already activated):

    Time.zone = 'Brasilia'
    => "Brasilia"
    t1 = Time.zone.local(2012,7,13,12,0,0)
    => Fri, 13 Jul 2012 12:00:00 BRT -03:00

    Time.zone = 'Tokyo'
    => "Tokyo"
    t2 = Time.zone.local(2012,7,13,12,0,0)
    => Fri, 13 Jul 2012 12:00:00 JST +09:00

    t1 - t2
    => 43200.0
    (t1 - t2) / 1.day
    => 0.5
  

We are using the exact same input date and time, 7/13/2012 12:00:00 PM. But when we create two Time objects using different time zones, the subtraction of the two gives a 12-hour difference (43,200 seconds, which is the actual time difference between Brazil and Japan). Now people in both countries can write times locally and you can run operations that respect that difference.

But I digress. Coming back to Rails i18n support: you will read in the guides that the default location for translated strings is config/locales, and that you can have two different kinds of files: Ruby or YAML. I recommend YAML files, but this is more a matter of personal taste. You can even mix locale files in YAML and Ruby.
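The two formats are interchangeable because a YAML locale file simply parses into the same nested hash a Ruby locale file would return. A quick illustration (the keys are made up for the example):

```ruby
require "yaml"

# The YAML form of a (made-up) locale file...
yaml_locale = <<~YAML
  pt-BR:
    hello: "Olá"
    messages:
      saved: "Salvo com sucesso"
YAML

# ...parses into the same nested hash a .rb locale file would return:
ruby_locale = {
  "pt-BR" => {
    "hello"    => "Olá",
    "messages" => { "saved" => "Salvo com sucesso" }
  }
}

puts YAML.safe_load(yaml_locale) == ruby_locale # true
```

YAML just reads better for translators; a .rb file is handy when you need computed values.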

Now, Rails itself is internationalized, defaulting to English, so all of ActiveRecord's validation messages, for example, are already properly extracted. One Rubyist who has been championing i18n support for a long time is Sven Fuchs; he maintains a repository of i18n goodies for you to explore, called rails-i18n. There you will find the files needed to translate the Rails framework itself. And if your country/language is not there, please contribute back.

In my case, I'm interested in the Brazilian Portuguese translations, which you can download like this:

    curl https://raw.github.com/svenfuchs/rails-i18n/master/rails/locale/pt-BR.yml > config/locales/rails.pt-BR.yml

These locale files don't just add translated strings; they also start off the basics of L10n by properly adding date and number formats. Check out this example view template in my demonstration app.

    Rails command                                   | Output in English    | Output in Brazilian Portuguese
    number_to_currency(123.56)                      | $123.56              | R$ 123,56
    number_to_human(100_555_123.15)                 | 101 Million          | 100 milhões
    I18n.l(Time.current, format: :long)             | July 23, 2012 22:26  | Segunda, 23 de Julho de 2012, 22:25 h
    distance_of_time_in_words(1.hour + 20.minutes)  | about 1 hour         | aproximadamente 1 hora

You can see that Rails already does a lot of heavy lifting for you, so don’t put all that effort to waste.

Devise

Most web apps that have user authentication use Devise. If you want to learn more check out Ryan Bates’ awesome screencasts:

Like Rails, Devise has extracted its internal strings and is fully internationalizable. Check out its wiki page about i18n for more details. You can start by downloading translated files from Christopher Dell's project, like this:

    curl https://raw.github.com/tigrish/devise-i18n/master/locales/en-US.yml > config/locales/devise.en.yml
    curl https://raw.github.com/tigrish/devise-i18n/master/locales/pt-BR.yml > config/locales/devise.pt-BR.yml
  

But if you want everything translated, you have to go the extra mile and use Devise's generator to copy its view templates into your Rails app by running rails g devise:views. This copies the templates into app/views/devise. Keep the templates you want and translate all of them. As an example, take the resend-confirmation template:

    <h2>Resend confirmation instructions</h2>

    <%= form_for(resource, :as => resource_name, :url => confirmation_path(resource_name), :html => { :method => :post }) do |f| %>
      <%= devise_error_messages! %>

      <div><%= f.label :email %><br />
      <%= f.email_field :email %></div>

      <div><%= f.submit "Resend confirmation instructions" %></div>
    <% end %>

    <%= render "devise/shared/links" %>

You have to extract the strings manually. For Brazilian Portuguese I have already done the heavy lifting; you can download the translated templates from my demonstration project and replace the originals. Don't forget to also download the YAML files:

    wget https://raw.github.com/akitaonrails/Rails-3-I18n-Demonstration/master/config/locales/devise.views.en.yml -O config/locales/devise.views.en.yml
    wget https://raw.github.com/akitaonrails/Rails-3-I18n-Demonstration/master/config/locales/devise.views.pt-BR.yml -O config/locales/devise.views.pt-BR.yml
  

This takes care of the view templates, but you also have to make Rails' form helpers properly translate your model attributes. The Rails Guides explain this briefly; in summary, you need something like the following snippet in your config/locales files:

    activemodel:
      errors:
        <<: *errors
    activerecord:
      errors:
        <<: *errors
      models:
        user: "Usuário"
        article: "Artigo"
      attributes:
        user:
          email: "E-mail"
          password: "Senha"
          password_confirmation: "Confirmar Senha"
          current_password: "Senha Atual"
          remember_me: "Lembre-se de mim"
        article:
          title: "Título"
          body: "Conteúdo"
          body_html: "Conteúdo em HTML"
  

The User model is what Devise creates for you by default; as an added example, there is an Article model. The code should speak for itself: you translate the model class name under activerecord.models and the attributes under activerecord.attributes.[model].
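Under the hood, helpers such as User.model_name.human and User.human_attribute_name(:email) are just I18n lookups against those keys. A simplified, framework-free sketch of that lookup (not Rails' actual implementation):

```ruby
# The locale data from the snippet above, as a plain nested hash:
LOCALE = {
  "activerecord" => {
    "models"     => { "user" => "Usuário" },
    "attributes" => { "user" => { "email" => "E-mail" } }
  }
}

# Simplified stand-ins for ActiveModel's lookup helpers:
def human_model_name(model)
  LOCALE.dig("activerecord", "models", model)
end

def human_attribute_name(model, attribute)
  LOCALE.dig("activerecord", "attributes", model, attribute)
end

puts human_model_name("user")              # Usuário
puts human_attribute_name("user", "email") # E-mail
```

This is why form labels and validation messages come out translated "for free" once the keys are in place.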

The extra mile on database tables with Globalize 3

We have taken care of most of the structural translations already, but you still have your user-generated content. If users from around the world will use your application, you may want content that reflects each user's language. The concept is quite simple: each piece of content has_many translations.

The concept is simple: we want support that lets us use the same attribute names but return different values depending on the currently selected locale. If we were to add an RSpec spec to cover this behavior, it would look like this:

    describe Article do
      before(:each) do
        I18n.locale = :en
        @article = Article.create title: "Hello World", body: "Test"
        I18n.locale = :"pt-BR"
        @article.update_attributes(title: "Ola Mundo", body: "Teste")
      end

      context "translations" do
        it "should read the correct translation" do
          @article = Article.last

          I18n.locale = :en
          @article.title.should == "Hello World"
          @article.body.should == "Test"

          I18n.locale = :"pt-BR"
          @article.title.should == "Ola Mundo"
          @article.body.should == "Teste"
        end
      end
    end
  

I chose Sven Fuchs' Globalize 3 gem. Add it to your Gemfile as gem 'globalize3', run the bundle command, and you're good to go.

If you already have an Article model in your app, you should add a new migration like this:

    class CreateArticles < ActiveRecord::Migration
      def up
        create_table :articles do |t|
          t.string :slug, null: false
          t.timestamps
        end
        add_index :articles, :slug, unique: true

        Article.create_translation_table! :title => :string, :body => :text
      end

      def down
        drop_table :articles
        Article.drop_translation_table!
      end
    end
  

Do not use Rails 3's new change migration method. After that, just migrate your database and let's go back to the Article model:

    class Article < ActiveRecord::Base
      attr_accessible :slug, :title, :body, :locale, :translations_attributes

      translates :title, :body
      accepts_nested_attributes_for :translations

      class Translation
        attr_accessible :locale, :title, :body
      end
    end
  

Don't mind the extra table created in the migration; you will use the Article model as usual. It detects the current I18n.locale and saves content in the proper fields, and changing the current locale makes it query a different translation.
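Conceptually, what Globalize does can be sketched in a few lines of plain Ruby. This is only an illustration of the pattern (one attribute hash per locale, reads and writes routed through the selected locale), not the gem's actual code:

```ruby
# Each record keeps one attribute hash per locale; reads and writes go
# through whichever locale is currently selected.
class TranslatedRecord
  attr_accessor :locale

  def initialize
    @locale = :en
    @translations = Hash.new { |hash, key| hash[key] = {} }
  end

  def write(attributes)
    @translations[locale].merge!(attributes)
  end

  def title
    @translations[locale][:title]
  end
end

article = TranslatedRecord.new
article.write(title: "Hello World")

article.locale = :"pt-BR"
article.write(title: "Olá Mundo")
puts article.title # Olá Mundo

article.locale = :en
puts article.title # Hello World
```

Globalize does the same thing, except the per-locale hashes live in the article_translations table.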

Managing your Globalized content with ActiveAdmin

Whenever I need an administration section, my first choice is the formidable Active Admin: it has a clean, neutral design that my clients enjoy, it's easy to use, and it's easily customizable. If you have a model associated with a CarrierWave uploader, for example, it will automatically show a file input for that attribute, because it uses Formtastic underneath to assemble forms automatically. Read Active Admin's documentation to understand how to get started.

Now, to support a Globalize 3-extended model we need some more tweaking. First of all, let's add some helper gems to the Gemfile:

    ...
    group :assets do
      gem 'jquery-ui-rails'
      ...
    end
    ...
    gem 'jquery-rails'
    gem 'activeadmin'
    gem 'ActiveAdmin-Globalize3-inputs'
    ...
  

Now we need to tell Active Admin to handle the Article model. We do that by creating an app/admin/article.rb file like this:

    ActiveAdmin.register Article do
      index do
        column :id
        column :slug
        column :title

        default_actions
      end

      show do |article|
        attributes_table do
          row :slug
          I18n.available_locales.each do |locale|
            h3 I18n.t(locale, scope: ["translation"])
            div do
              h4 article.translations.where(locale: locale).first.title
            end
          end
        end
        active_admin_comments
      end
      ...
    end

  

The index block is quite standard. The show block is more interesting: we access the translations association on the Article model directly and iterate through each supported locale, as defined in config/application.rb.

I'm using ActiveAdmin-Globalize3-inputs, which in turn depends on jQuery UI, to adapt the administration form to use one tab per locale.

Then we take advantage of ActiveRecord's ability to handle mass-assigned nested attributes through accepts_nested_attributes_for. To use this feature, we edit our Article model like this:

    class Article < ActiveRecord::Base
      attr_accessible :body, :slug, :title, :locale, :translations_attributes
      ...
      translates :title, :body
      accepts_nested_attributes_for :translations
      ...
      class Translation
        attr_accessible :locale, :title, :body
      end

      def translations_attributes=(attributes)
        new_translations = attributes.values.reduce({}) do |new_values, translation|
          new_values.merge! translation.delete("locale") => translation
        end
        set_translations new_translations
      end
      ...
    end
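The translations_attributes= reducer above just re-keys the nested form parameters by locale. Exercising the same transformation in isolation (the sample params mimic what Formtastic submits; here I dup each hash so the inputs are not mutated, which the original setter does not bother with):

```ruby
# Form params arrive as an index-keyed hash of per-locale attribute hashes:
attributes = {
  "0" => { "locale" => "en",    "title" => "Hello" },
  "1" => { "locale" => "pt-BR", "title" => "Olá" }
}

# Re-key the values by locale, as translations_attributes= does:
new_translations = attributes.values.reduce({}) do |new_values, translation|
  translation = translation.dup
  new_values.merge!(translation.delete("locale") => translation)
end

p new_translations # a hash keyed by locale: "en" and "pt-BR"
```

The result is exactly the shape Globalize's set_translations expects: locale => attributes.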

  

Now we need to make sure jQuery UI is available by modifying app/assets/stylesheets/active_admin.css like this:

    // Active Admin CSS Styles
    @import "active_admin/mixins";
    @import "active_admin/base";
    @import "jquery.ui.tabs";
  

And also modify the app/assets/javascripts/active_admin.js like this:

    //= require active_admin/base
    //= require jquery.ui.tabs
  

Finally, there is one last bit we need to add to the end of the app/admin/article.rb file:

    ActiveAdmin.register Article do
      ...
      form do |f|
        f.input :slug
        f.globalize_inputs :translations do |lf|
          lf.inputs do
            lf.input :title
            lf.input :body

            lf.input :locale, :as => :hidden
          end
        end

        f.buttons
      end
    end
  

That taps into Active Admin's internal Formtastic dependency and, with the gem we added, produces a screen like this:

By the way, people sometimes forget that for the asset pipeline to properly compile Active Admin's assets in production, you have to declare them in the config/application.rb file like this:

    config.assets.precompile += %w(active_admin.js active_admin.css)
  

As a last tip, the Active Admin interface itself is fully internationalizable. Read its documentation and you will find the YAML files you can use to translate it to your native language.

I18n Routes

Last but not least, for SEO purposes it is a good idea to have all, or at least most, of your URLs fully translated into your language. For instance, we would want the following routes all pointing to the same actions:

    /users/sign_in
    /en/users/sign_in
    /pt-BR/usuarios/login
  

There are several gems that try to achieve this, but the best I've found so far is rails-translate-routes. As usual, add it to your Gemfile (gem 'rails-translate-routes'), run the bundle command, then edit your config/routes.rb file to look like this:

    I18nDemo::Application.routes.draw do
      # Active Admin routes
      ActiveAdmin.routes(self)
      devise_for :admin_users, ActiveAdmin::Devise.config

      # Devise authentication routes
      devise_for :users

      # article routes
      resources :articles

      # home page
      get "welcome/index", as: "welcome"
      root to: 'welcome#index'
    end
  

We can translate just what we need. As an example, let's say we want our Article routes and Devise's routes translated, but we don't care about Active Admin's routes. We can organize the routes file like this:

    I18nDemo::Application.routes.draw do
      devise_for :users
      resources :articles
      get "welcome/index", as: "welcome"
      root to: 'welcome#index'
    end

    ActionDispatch::Routing::Translator.translate_from_file(
      'config/locales/routes.yml', {
        prefix_on_default_locale: true,
        keep_untranslated_routes: true })

    I18nDemo::Application.routes.draw do
      ActiveAdmin.routes(self)
      devise_for :admin_users, ActiveAdmin::Devise.config
    end
  

Where we put the translate_from_file call defines the separation between what's translated and what is not. Now it's just a matter of creating a file named config/locales/routes.yml with the following translations:

    en:
      routes:
    pt-BR:
      routes:
        welcome: bemvindo
        new: novo
        edit: editar
        destroy: destruir
        password: senha
        sign_in: login
        users: usuarios
        cancel: cancelar
        article: artigo
        articles: artigos
  

The en.routes block is empty because, as I recommended at the beginning of the article, all our code is in English, so Rails just picks up the class names and the entire app is in English by default. In the [your language].routes block, translate just the words you want. After all that, running Rails' rake routes task gives output that looks like this:

    ...
    article_pt_br GET    /pt-BR/artigos/:id(.:format)    articles#show {:locale=>"pt-BR"}
       article_en GET    /en/articles/:id(.:format)      articles#show {:locale=>"en"}
                  GET    /articles/:id(.:format)         articles#show
                  PUT    /pt-BR/artigos/:id(.:format)    articles#update {:locale=>"pt-BR"}
                  PUT    /en/articles/:id(.:format)      articles#update {:locale=>"en"}
                  PUT    /articles/:id(.:format)         articles#update
                  DELETE /pt-BR/artigos/:id(.:format)    articles#destroy {:locale=>"pt-BR"}
                  DELETE /en/articles/:id(.:format)      articles#destroy {:locale=>"en"}
                  DELETE /articles/:id(.:format)         articles#destroy
    welcome_pt_br GET    /pt-BR/bemvindo/index(.:format) welcome#index {:locale=>"pt-BR"}
       welcome_en GET    /en/welcome/index(.:format)     welcome#index {:locale=>"en"}
                  GET    /welcome/index(.:format)        welcome#index
       root_pt_br        /pt-BR                          welcome#index {:locale=>"pt-BR"}
          root_en        /en                             welcome#index {:locale=>"en"}
    ...
  

Have you ever questioned the point of named routes such as new_article_path in your view templates when you could just as easily write "/articles/new"? Now you know why: the same named route will obey the current I18n.locale and output the correct translated route. Pro tip: always try to adhere to the conventions instead of trying to be too smart; in this case, being "smart" would cost you a lot of time reconverting every hard-coded route into a named route.

We now need the application to detect the locale option in the params hash, so let's edit app/controllers/application_controller.rb:

    class ApplicationController < ActionController::Base
      protect_from_forgery

      before_filter :set_locale
      before_filter :set_locale_from_url

      private

      def set_locale
        if lang = request.env['HTTP_ACCEPT_LANGUAGE']
          lang = lang[/^[a-z]{2}/]
          lang = :"pt-BR" if lang == "pt"
        end
        I18n.locale = params[:locale] || lang || I18n.default_locale
      end
    end
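The Accept-Language sniffing in set_locale can be exercised on its own. The regex is the same as in the controller above; promoting a bare "pt" to "pt-BR" is this app's own convention, not something the browser sends:

```ruby
# Extract a locale from an HTTP Accept-Language header value.
def locale_from_header(header)
  return nil unless header
  lang = header[/^[a-z]{2}/] # first two lowercase letters, e.g. "pt"
  lang = "pt-BR" if lang == "pt"
  lang
end

puts locale_from_header("pt-BR,pt;q=0.9") # pt-BR
puts locale_from_header("en-US,en;q=0.8") # en
```

In the controller, an explicit params[:locale] from the URL still wins over the sniffed header, with I18n.default_locale as the final fallback.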
  

Now both http://localhost:3000/en/articles and http://localhost:3000/pt-BR/artigos will respond correctly. To create links in our pages for changing the language, we can add a little helper to use in the view layout:

    module ApplicationHelper
      def language_links
        links = []
        I18n.available_locales.each do |locale|
          locale_key = "translation.#{locale}"
          if locale == I18n.locale
            links << link_to(I18n.t(locale_key), "#", class: "btn disabled")
          else
            links << link_to(I18n.t(locale_key), url_for(locale: locale.to_s), class: "btn")
          end
        end
        links.join("\n").html_safe
      end
      ...
    end
  

The url_for helper creates links that return to the current page in the browser, but with the translated route and the proper locale parameter. Just add the helper somewhere in your layout view template:

    ...
    <div class="form-actions">
      <%= language_links %>
    </div>

    </body>
    </html>

  

This is the result you will see:

There are several different techniques for detecting the language: you can make Rails understand subdomains, users' authenticated sessions, the browser's default language, or cookies, but I prefer simple URI segments as in the examples above.

Conclusion

As you can see, there are several things we can add to our applications to make them fully international. But even if you're not planning to support multiple languages, it doesn't hurt to follow a few simple rules:

  • Make sure your database and source files all use UTF-8. It's very common to find applications running under Latin1 and going through lots of pain to reconvert everything to UTF-8.
  • Having language-specific text in your Ruby source code or view templates should be considered a "code smell". Rails already does all the heavy lifting, so just create a simple config/locales/en.yml to start.
  • Adding something like Globalize 3, on the other hand, may not be necessary unless you're sure you will need it. It's not difficult to add it later.
  • Do not use strftime or other methods that hard-code the format of data conversions. Use I18n.localize for formatting.
  • And study more about time zones and Rails' support for them; you never know when you're going to be bitten by time-related issues.

There is a lot more you can tweak in your Rails application, but this covers what you will most commonly face in your next multi-cultural, worldwide application.

I hope you found this article useful. Feel free to ask questions and give feedback in the comments section of this post. Thanks!




The Ongoing Vigil of Software Security

This guest post is by James Schorr, who has developed software since 1999. He is the owner of an IT consulting company, Enspiren IT Consulting, LLC.  He lives with his lovely wife, Tara, and their children in Kansas City, Missouri. James spends a lot of time writing code in many languages, doing IT security audits, and has a passion for Ruby on Rails in particular. He also loves playing chess and spending time with his family. His professional profile is on LinkedIn.

James M. Schorr The news is often filled with stories about exploits affecting large corporations and widely used software (LinkedIn, Yahoo, Windows, Linux, OS X, *BSD, Oracle, MySQL, Java, Flash, etc.). However, a tremendous number of successful hacks and exploits take place daily on smaller-profile systems that we never hear about.

Some of the reasons that we keep seeing these types of exploits are that the “bad guys” are much smarter and more determined than we give them credit for, we’re much lazier and more ignorant than we take responsibility for, and security is difficult to manage properly. As we become more and more reliant upon software, it is imperative that security be taken more seriously.

What’s the big deal?

Consider this somewhat over-the-top thought exercise:

Think of your systems, databases, and code as a ship floating in the middle of the Atlantic. The ship was fairly hastily constructed as the management team pushed the various craftsmen to get done in time for the journey.

It's the middle of the hurricane season. The waves are getting higher, sharks are circling your boat, and aboard are quite a few passengers. Most of the passengers are of a fairly decent ilk, but some are not. This latter group, partly due to the insufferable boredom of the long journey, has taken delight in drilling holes in the side of the boat (with the tools that were discarded during construction). Other troublemakers spend their time throwing chum overboard to the encircling sharks and even, when no one is watching, throwing each other overboard. A few of the cleverer sort spend their time impersonating the crew and using their new privileges to look for ways to take over the ship. Sadly, even some of the crew members have been persuaded to join their mutinous ranks.

As time goes by, the remaining crew loses its ability to prevent damage to the craft and protect those on board, as a result of sheer exhaustion, the tenacity of the passengers, and the natural wear and tear of the elements.

What’s the point of this mental exercise? We need to realize that unrelenting attacks abound, both from within and without the system. If not properly addressed, they only escalate over time.

Security is a word that has a long, storied past. According to most dictionaries, one of the definitions of security includes, “free from danger”. Of course, stating that a system, code base, network, etc. is secure is quite naïve at best, dangerous at worst. Recognizing the threats is the first step toward positively addressing them.

Ask any IT team member that is charged with “securing” anything and you’ll quickly find out that it is an extremely difficult, often thankless task. Even in a tightly controlled environment, it can be pretty tough, especially during times of extreme change, turnover, growth, etc.

Why should we care?

We need to care because our applications, databases, and systems:

  • are regularly being threatened from the inside and the outside, often without us even being aware of it.
  • are depended upon by users who have invested some degree of their money, trust, time, or work into using them.
  • haven’t “arrived”. There is always a way to circumvent the “system”.
  • typically depend upon the “happy path” scenarios (e.g. when all goes well).

What can we do?

Thankfully, there are quite a few things that can be proactively done to help mitigate the risks and stave off the threats. For brevity’s sake, I’m going to give a high level overview of what can be done to help prevent exploits:

Team Security Measures

  1. Who should be in charge of our project’s security? Involve the right people, taking the time to get to know their character and mindset. Not everyone is cut out to think with the type of mindset needed to properly manage security. Unless someone is really into security, trustworthy, assertive, and unafraid of conflict, they simply aren’t the right person for this task.
  2. Who has need-to-know? Need-to-know is an essential principle in projects. Data leakage often inadvertently occurs by team members that probably didn’t need the information to begin with. Those that realize the “big-picture” usage of the data and need access to it for their tasks typically realize the need to keep the data private.
  3. Separation of duties with each area managed by a small core team. While not always possible, it is helpful to have one main realm of responsibility per team. Also, the core team of each area/realm needs to remain just that – the core team. In other words, the more people added, the tougher it is to keep things secure.
  4. How, when and to whom do we communicate? The procedures for securely communicating need-to-know information are critical to establish. Various methods need to be implemented to allow team members to exchange information in as secure a fashion as possible. An example might be the usage of an encrypted volume in a shared drive (retaining the control of the encryption details).
  5. Knowledge Transfer: when someone leaves the team, great care should be taken to transfer the knowledge to the new member in a secure fashion. Additionally, all relevant credentials should be changed immediately, no matter how trusted that individual or group was. A simple exit checklist – that is followed – can greatly help with this.

Technological Security Measures

  1. Testing is critical: we are testing, right? In dev-speak, tested_code != secure_code but tested_code.class == SecurityMindset. In other words, it is possible to write insecure, tested code, but proper testing does seem to inherit qualities from a security mindset and to encourage more thoughtful programming. In my opinion, testing generally falls into two main types:
    1. Code-based Testing: I’ll let others bore you with a long list of what’s available out there but do want to point out that real-world progress can be made towards better securing code with the usage of tools/methods such as: Rspec and friends, TDD, BDD, etc.
    2. Human Testing: sometimes nothing beats enlisting the help of others to pound away on our beloved projects. You’d be surprised at how many issues are found by this approach, often leading to cries of, “But users aren’t supposed to do that!”
      1. Non-technical users: enlist someone who has a hard time finding the / key. This type of person will usually do all sorts of unexpected things. The unexpected behavior can quickly reveal the hidden weaknesses in the UI, workflow, and security.
      2. Enlist the upcoming geeks: you know those kids who are always jail-breaking phones? After issuing a few half-hearted reprimands, ask them to “conquer” your app. Offering a prize can’t hurt.
      3. Enlist an expert to audit your code, procedures, and projects.
  2. Logging:
    1. What to log: in general, the more information about transactional details (transactional referring to any actions that involve change), the better. Note that anything related to attempted security breaches needs to be logged. Admin alerts should also be automatically sent out; these alerts need to be designed with great care to not transmit anything that would harm the system if intercepted in transit.
    2. What to never, ever log:
      1. Credentials: passwords, API keys (abstract before logging: e.g. if Bob does X with an API key, put a different identifier in the log file, not the key).
      2. Credit card numbers, PINs, debit card numbers, anything banking related unless we are doing so in compliance with PCI standards.
      3. Medical information (see HIPAA – the Health Insurance Portability and Accountability Act – or your country’s corresponding laws).
      4. Anything that can be used to compromise the system or its users.
    3. How to log: I personally prefer a two-pronged approach: 1) writing to log files that are automatically transferred offsite; 2) an audit trail via a NoSQL database, using a fire-and-forget approach: send the insert but keep on moving. A failure to log to the audit trail should alert admins but never slow down or impact the user’s use of the application.
    4. When to log: as close to the event as possible, to minimize the chance of data loss.
      1. Log Alterability: Think, “if I was a hacker and compromised this system, I’d want to clean up after my activities”. How do I make my logs non-alterable, even by support staff?
  3. Access Levels: these typically fall into the following:
    1. Users
      1. What can they access and why
      2. Who can change their level (e.g. can the user manage their own level via subscriptions)?
    2. Support Staff
      1. Level 1 CSR
      2. Level 2 CSR
      3. Level 3 Admins
      4. Dictators (can do anything with no recourse)… careful with these types.
  4. Crucial Elements:
    1. Account Lockouts
      1. Users are locked out for some period of time when they fail to log in after X attempts, try via different IPs, etc.
      2. Users are locked out and admins alerted when they try to get around the system (these types of lockouts do not expire with time but rather require a Support Staff person to unlock them based on their discretion).
      3. Ability for Support Staff to lock and unlock users very quickly after following a procedure to record why they’re doing so. A permanent record needs to be kept as to who unlocked whom and why.
    2. Account Password Policies: password strength, requirements to change the password every X days, password history (can’t reuse old passwords), etc.
    3. Other: click-limits, IP address binding, geographic binding, usage of OAuth 2, etc.
  5. Frameworks and Software Libraries: it’s fairly common to have security vulnerabilities “appear” due to the integration of code from other sources. Of course, no one has time to re-invent the wheel, so to speak; nor should they. It is a good practice to always read through the source code and reported issues of 3rd party software prior to implementation.
    1. Take the time to search for some of its common exploits and best-practice methods of usage. Have we taken the time to test what X library (framework, gem, plugin, etc.) would mean for our application’s speed, stability, and security?
    2. Refrain from handling some things ourselves. A good example is credit-card processing. Why handle it yourself when a 3rd party, tested service will likely do so in a more secure manner? Look for a project that has been around for a while and has a good track-record of quickly closing vulnerabilities.
  6. Servers and Hosting: it may save some money to host on a shared host or cousin Bill’s server, but will the data be secure? It’s best to meet all three of the CIA principles (Confidentiality, Integrity, and Availability) when choosing a host, striving for at least a medium level for each principle.
    1. Keep the servers up-to-date.
    2. Use intrusion detection applications (e.g. psad, fwsnort) to alert admins of attempts to break into the system.
    3. Use a properly configured firewall that is easy to adjust quickly.
    4. Send the logs offsite (e.g. not on the same “box”) to a secured server on a frequent basis.
    5. Backups: ideally, nightly backups of the entire codebase, logs, and database dumps should be taken; these backups should be kept offsite in the same manner as logs.
    6. Imaging: frequent images of servers can be helpful for forensics in the event of an exploit and for data recovery.
    7. Server-side miscellaneous applications (Apache, Nginx, SSH, OpenSSL, etc.): disable unused modules, limit connections, use non-default ports, etc. (see Resources for more ideas).
    8. Schedule checks for rootkits and malware on a daily basis; be sure to alert admins if any is found.
  7. Database(s): Familiarity with the database(s) is key to keeping them secure. For instance, if a development team is very familiar with MySQL and decides to add in a secondary technology alongside (maybe some MongoDB databases), it would be wisest to evaluate the architecture and security implications prior to implementation.
  8. Credentials:
    1. Where and how should we store the credentials that our app needs (e.g. API keys, database credentials, etc.)? A good question to ask ourselves is, “if someone did get into our server as non-root (if they got in as root, it’s game over anyhow), what could they get and who would it hurt?”
    2. Are we deploying our credentials to GitHub or other VCS? If so, we’re blindly trusting that 3rd party to be and stay secure.
    3. Changes should be planned for and completed whenever there is a change in personnel and on a periodic basis. This can become a real hassle unless thought is given along the lines of, “How do we quickly change these credentials?”
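Several of the measures above, the fire-and-forget audit trail in particular, can be sketched in a few lines of Ruby. This is an illustrative sketch (the class name and event fields are hypothetical), not a production implementation:

```ruby
require 'json'
require 'time'

# Sketch of a fire-and-forget audit trail: events are pushed onto an
# in-process queue and written by a background thread, so a slow or
# failing audit store never blocks or fails the user's request.
class AuditLogger
  def initialize(io = $stdout)
    @queue  = Queue.new
    @writer = Thread.new do
      while (event = @queue.pop)                 # a nil sentinel ends the loop
        begin
          io.puts(JSON.generate(event))
        rescue => e
          warn "audit write failed: #{e.message}" # alert admins, don't raise
        end
      end
    end
  end

  # Callers never wait on the write; enqueue and keep moving.
  def record(action, details = {})
    @queue << { action: action, at: Time.now.utc.iso8601 }.merge(details)
    nil
  end

  def shutdown
    @queue << nil
    @writer.join
  end
end
```

A real implementation would point `io` at a NoSQL store or an offsite log shipper; the queue boundary is what keeps a logging failure from ever impacting the user.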

I hope that this article has given you at least a few ideas of how to better improve your software project’s security. If so, I’ll consider it a success. Feel free to ask questions and give feedback in the comments section of this post. Thanks!

Resources

Below are some resources that may be helpful (those that I have found extremely helpful over the years are denoted with a * next to them):


Deploying Rails: Automate, Deploy, Scale, Maintain, and Sleep at Night

Deploying Rails: Automate, Deploy, Scale, Maintain, and Sleep at Night is now in print.

#366 Sidekiq

Sidekiq allows you to move jobs into the background for asynchronous processing. It uses threads instead of forks, so it is much more memory-efficient than Resque.

Buildpacks: Heroku for Everything

Last summer, Heroku became a polyglot platform, with official support for Ruby, Node.js, Clojure, Java, Python, and Scala. Building a platform that works equally well for such a wide variety of programming languages was a unique technical design challenge.

[Figure: siloed products would be a non-scalable design]

We knew from the outset that maintaining siloed, language-specific products – a Heroku for Ruby, a Heroku for Node.js, a Heroku for Clojure, and so on – wouldn’t be scalable over the long-term.

Instead, we created Cedar: a single, general-purpose stack with no native support for any language. Adding support for any language is a matter of layering on a build-time adapter that can compile an app written in a particular language or framework into an executable that can run on the universal runtime provided by Cedar. We call this adapter a buildpack.

[Figure: build-time language adapters support a single runtime stack]

An Example: Ruby on Rails

If you’ve deployed any app to the Cedar stack, then you’ve already used at least one buildpack, since the buildpack is what executes during git push heroku master. Let’s explore the Ruby buildpack by looking at the terminal output that results when deploying a Rails 3.2 app:

$ git push heroku master
Counting objects: 67, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (53/53), done.
Writing objects: 100% (67/67), 26.33 KiB, done.
Total 67 (delta 5), reused 0 (delta 0)

-----> Heroku receiving push
-----> Ruby/Rails app detected
-----> Installing dependencies using Bundler version 1.2.0.pre
       Running: bundle install --without development:test --path vendor/bundle --binstubs bin/ --deployment
       Fetching gem metadata from https://rubygems.org/.......
       Installing rake (0.9.2.2)
       ...
       Your bundle is complete! It was installed into ./vendor/bundle
-----> Writing config/database.yml to read from DATABASE_URL
-----> Preparing app for Rails asset pipeline
       Running: rake assets:precompile
       Asset precompilation completed (16.16s)
-----> Rails plugin injection
       Injecting rails_log_stdout
       Injecting rails3_serve_static_assets
-----> Discovering process types
       Procfile declares types      -> (none)
       Default types for Ruby/Rails -> console, rake, web, worker
-----> Compiled slug size is 9.6MB
-----> Launching... done, v4
       http://chutoriaru.herokuapp.com deployed to Heroku

To git@heroku.com:chutoriaru.git
 * [new branch]      master -> master

Everything that happens between Heroku receiving push and Compiled slug size is 9.6MB is part of the buildpack. In order: detecting the app type, installing dependencies with Bundler, writing config/database.yml to read from DATABASE_URL, precompiling assets, injecting plugins, and discovering process types.

The slug that results from this Rails-specific build process can now be booted on our language-agnostic dyno manifold alongside Python, Java, and many other types of applications.
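The mechanics above can be made concrete: a buildpack is, at bottom, three executables named bin/detect, bin/compile, and bin/release. The Ruby method bodies below are simplified, illustrative stand-ins for those hooks, not Heroku's actual Ruby buildpack:

```ruby
require 'fileutils'

# bin/detect BUILD_DIR: name the app type if this buildpack applies, else decline.
def detect(build_dir)
  File.exist?(File.join(build_dir, 'Gemfile')) ? 'Ruby' : nil
end

# bin/compile BUILD_DIR CACHE_DIR: transform the checkout into a runnable slug,
# using CACHE_DIR to persist artifacts (e.g. installed gems) between builds.
def compile(build_dir, cache_dir)
  FileUtils.mkdir_p(cache_dir)
  "-----> Installing dependencies (cached in #{cache_dir})"
end

# bin/release BUILD_DIR: emit YAML metadata, such as default process types.
def release(build_dir)
  "---\ndefault_process_types:\n  web: bundle exec rails server -p $PORT\n"
end
```

Real buildpacks usually ship these hooks as shell scripts; the division of labor is the same regardless of implementation language.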

Using a Custom Buildpack

In the example above, the appropriate buildpack was automatically detected from our list of Heroku-maintained defaults.

However, you can also specify your desired buildpack using arguments to the heroku create command or by setting the BUILDPACK_URL config variable. This enables the use of custom buildpacks. If you want to run your Rails app on JRuby, for example, specify the buildpack created by the JRuby team at app creation time:

$ heroku create --buildpack https://github.com/jruby/heroku-buildpack-jruby

Arbitrary Language Support

Since language support can be completely contained inside a buildpack, it is possible to deploy an app written in nearly any language to Heroku. Indeed, there are a variety of third-party buildpacks already available:

See the full list of third party buildpacks in the Dev Center.

Customizing the Build Process

In addition to enabling new language support, the ability to select a buildpack allows you to modify the previously closed Heroku build process for popular languages.

For example, consider a Ruby app that needs to generate static files using Jekyll. Before buildpacks, the only solutions would have been to 1) generate the files before deployment and commit them to the repository or 2) generate the files on-the-fly at runtime. Neither of these solutions is ideal, as they violate the strict separation that should be maintained between the codebase, the build stage, and the run stage.

By forking the official Ruby buildpack, you could add a site generation step to your build process, putting file generation in the build stage where it belongs.

All of the default buildpacks are open source, available for you to inspect, and fork to modify for your own purposes. And if you make a change that you think would be useful to others, please submit an upstream pull request!

Adding Binary Support

Your app might depend on binaries such as language VMs or extensions that are not present in the default runtime. If this is the case, these dependencies can be packaged into the buildpack. A good example is this fork of the default Ruby buildpack which adds library support for the couchbase gem. Vulcan is a tool to help you build binaries compatible with the 64-bit Linux architecture which dynos run on.

Buildpacks Beyond Heroku

Buildpacks are potentially useful in any environment, and we’d love to see their usage spread beyond the Heroku platform. Minimizing lock-in and maximizing transparency is an ongoing goal for Heroku.

Using buildpacks can be a convenient way to leverage existing, open-source code to add new language and framework support to your own platform. Stackato, a platform-as-a-service by ActiveState, recently announced support for Heroku buildpacks.

You can also run buildpacks on your local workstation or in a traditional server-based environment with Mason.

Conclusion

Get started hacking buildpacks today by forking the Hello Buildpack! Read up on the implementation specifics laid out in the Buildpack API documentation, and join the public Buildpacks Google Group. If you make a buildpack that you think would be useful and that you intend to maintain, send us an email at buildpacks@heroku.com for potential inclusion on the third-party buildpacks page.

Heroku Postgres Basic Plan and Row Limits

Today, the Heroku Postgres team released into beta the new basic plan, a $9/month version of the free dev plan.

Accompanying this announcement is the implementation of a 10,000-row limit on the dev plan. This row limit was designed to correspond to the 5 MB limit on the existing free shared plan.

Please note that these plans are still in beta, and Heroku Postgres has not yet announced a migration schedule from the shared plan. However, you can start using these plans today.

Read more about the new plan, and the mechanics of the row limits on the Heroku Postgres Blog.

Rails, Objects, Tests, and Other Useful Things

For the first time in quite a while, I’ve been able to spend time working on a brand-new Rails application that’s actually a business thing and not a side project. It’s small. Okay, it’s really small. But at least for the moment it’s mine, mine, mine. (What was that about collective code ownership? I can’t hear you…)

This seemed like a good time to reflect on the various Object-Oriented Rails discussions, including Avdi’s book, DCI in Rails, fast Rails tests, Objectify, DHH’s skepticism about the whole enterprise, and even my little contribution to the debate. And while we’re at it, we can throw in things like Steve Klabnik’s post on test design and Factory Girl.

I’m not sure I have any wildly bold conclusion to make here, but a few things struck me as I went through my newest coding enterprise with all this stuff rattling around in my head.

A little background — I’ve actually done quite a bit of more formal Object-Oriented stuff, though it’s more academic than corporate enterprise. My grad research involved teaching object-oriented design, so I was pretty heavily immersed in the OO literature circa the mid-to-late 90s. So, it’s not like I woke up last May and suddenly realized that objects existed. That said, I’m as guilty as any Rails programmer of taking advantage of the framework’s ability to write big balls of mud.

Much of this discussion is effectively about how to manage complexity in an application. The thing about complexity is that while you can always create complexity in your system, you can’t always remove it. At some point, your code has to do what it has to do, and that puts a floor on how complex your system is. You can move the complexity around, and you can arguably make it easier to deal with. But… to some extent “easier to deal with” is subjective, and all these techniques have trade-offs. Smaller classes mean more classes; adding structure to make dependencies flexible often increases immediate cost. Adding abstraction simplifies individual parts of the system at the cost of making it harder to reason about the system as a whole. There are some sweet spots, I think, but a lot of this is a question of picking the Kool-Aid flavor you like best.

Personally, I like to start with simple and evolve to complex. That means small methods, small classes, and limited interaction between classes. In other words, I’m willing to accept a little bit of structural overhead in order to keep each individual piece of the code simple. Then the idea is to refactor aggressively, making techniques like DCI more something I use as a tool when I see complexity than a place I start from. Premature abstraction is in the same realm as premature optimization. (In particular, I find a lot of forms of Dependency Injection really don’t fit in my head; it takes a lot for me to feel like that flavor of flexible dependencies is the solution to my problem.)

I can never remember where I saw this, but it was an early XP maxim that you should try to keep simple the 90% of your system that can be simple, so that you have maximum resources to bear on the 10% that is really hard.

To make this style work, you need good tests and you need fast tests — TDD is a critical part of building code this way. You need to be confident that you can refactor, and you need to be able to refactor in small steps and rerun tests. That’s why, while I think I get what Gregory Moeck is saying here, I can’t agree with his conclusion. I think “more testable” is just as valid an engineering goal as “fast” or “uses minimal memory”. I think if your abstraction doesn’t allow you to test, then you have the wrong abstraction. (Though I still think the example he uses is over built…).

Fast tests are most valuable as a means to an end, with the end being understandable and easily changeable code. Fast tests help you get to that end because you can run them more often; ideally you can run them fast enough that you don’t break focus going back and forth between tests and code, so the transition is seamless. Also, an inability to write fast tests easily often means that there’s a flaw in your design. Specifically, it means that there’s too much interaction between multiple parts of your program, such that it’s impossible to test a single part in isolation.

One of the reasons that TDD works is that the tests become kind of a universal client of your code, forcing your code to have a lot of surface area, so to speak, and not a lot of hidden depth or interactions. Again, this is valuable because code without hidden depth is easier to understand and easier to change. If writing tests becomes hard or slow, the tests are trying to tell you that your code is building up interior space where logic is hiding — you need to break the code apart to expose the logic to a unit test.

The metric that matters here is how easily you can change your code. A quick guide to this is what kinds of bugs you get. A well-built system won’t necessarily have fewer bugs, but it will have shallower bugs that take less time to fix.

Isolation helps, and the Single Responsibility Principle helps. Both are good rules of thumb for keeping the simple parts of your code simple. But it also helps to understand that “single responsibility” is a matter of perspective. (I like the guideline in GOOS that you should be able to describe what a class does without using “and” or “or”.)

Another good rule of thumb is that objects that are always used together should be split out into their own abstraction. Or, from the other direction, data that changes on different time scales should be in different abstractions.

In Rails, remember that “models” is not the same as “ActiveRecord models”. Business logic that does not depend on persistence is best kept in classes that aren’t also managing persistence. Fast tests are one side effect here, but keeping classes focused has other benefits in terms of making the code easier to understand and easier to change.

Actual minor Rails example — pulling logic related to start and end dates into a DateRange class. (Actually, in building this, I started with the code in the actual model, then refactored to a HasDateRange service module that was mixed into the ActiveRecord model, then refactored to a DateRange class when it became clear that a single model might need multiple date ranges. The DateRange class can be reused, and that’s great, but the reuse is a side-effect of the isolation. The main effect is that it’s easier to understand where the date range logic is.)
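As a sketch of what such an extraction might look like (the method names here are hypothetical, not the actual code from that app):

```ruby
require 'date'

# Hypothetical DateRange extraction: persistence-free business logic that a
# model can hold one or more of, and that can be unit-tested in isolation.
class DateRange
  attr_reader :start_date, :end_date

  def initialize(start_date, end_date)
    @start_date = start_date
    @end_date   = end_date
  end

  def include?(date)
    (start_date..end_date).cover?(date)
  end

  def days
    (end_date - start_date).to_i + 1
  end

  def overlaps?(other)
    start_date <= other.end_date && other.start_date <= end_date
  end
end
```

Because nothing here touches ActiveRecord, the tests for it need no database and run in milliseconds.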

I’ve been finding myself doing similar things with Rails associations, pulling methods related to the list of associated objects into a HasThings style module, then refactoring to a ThingCollection class.

You need to be vigilant for abstractions showing up in your code. Passing arguments, especially if you are passing the same argument sets to multiple methods, often means there’s a class waiting to be born. Using a lot of if logic or case logic often means there’s a set of objects that have polymorphic behavior, especially if you are using the same logical test multiple times. Passing around nil often means you are doing something sub-optimally.
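For instance, the case-logic smell might be dissolved like this (a generic illustration, not code from the app in question):

```ruby
# Instead of branching on a type tag everywhere:
#
#   case shape.kind
#   when :circle then Math::PI * shape.radius ** 2
#   when :square then shape.side ** 2
#   end
#
# give each variant its own class and let method dispatch do the work.
class Circle
  def initialize(radius); @radius = radius; end
  def area; Math::PI * @radius ** 2; end
end

class Square
  def initialize(side); @side = side; end
  def area; @side ** 2; end
end

# Callers no longer branch on type:
shapes = [Circle.new(1), Square.new(2)]
total_area = shapes.map(&:area).reduce(:+)
```

The repeated logical test disappears; adding a Triangle later touches one new class instead of every case statement.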

Another semi-practical Rails example: I have no problem with an ActiveRecord model having class methods that create new objects of that model as long as the methods are simple. As soon as the methods get complex, I’ve been pulling them into a factory class, where they become instance methods. (I always have the factory be a class that is instantiated rather than having it be a set of class methods or a singleton — I find the code breaks much more cleanly as regular instance methods.) At that point, you can usually break the complicated factory method into a bunch of smaller methods with semantically meaningful names. These classes wind up being very similar to a DCI context class.
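A sketch of that factory shape (the domain and names are made up for illustration):

```ruby
# Hypothetical factory object: complex creation logic moves off the model's
# class methods and into an instantiated object with small, named steps.
class SubscriptionFactory
  def initialize(plan)
    @plan = plan
  end

  def build(overrides = {})
    defaults.merge(overrides)  # a real app would return a model instance here
  end

  private

  # Each named step can be extracted, read, and tested on its own.
  def defaults
    { plan: @plan, trial_days: trial_days, active: true }
  end

  def trial_days
    @plan == :basic ? 7 : 30
  end
end
```

Because the factory is a plain instantiated object, each private method breaks out cleanly, which is what makes the "bunch of smaller methods with semantically meaningful names" step cheap.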

Which reminds me — if you are wondering whether the Extract Method refactoring is needed in a particular case, the answer is yes. Move the code to a method with a semantically meaningful name. Somebody will be thankful for it, probably you in a month.

Some of this is genuinely subjective — I never in a million years would have generated this solution — I’d be more likely to have a Null Object for Post if this started to bother me, because event systems don’t seem like simplifications to me.

I do worry how this kind of aggressive refactoring style, or any kind of structured style, plays out in a large team or even just a team with varying degrees of skill, or even just a team where people have different styles. It’s hard to aggressively refactor when three-dozen coders are dependent on something (though, granted, if you’ve isolated well you have a better shot). And it’s hard to overstate the damage that one team member who isn’t down with the program can do to your exquisite object model. I don’t have an answer to this, and I think it’s a really complicated problem.

You don’t know the future. Speculation about reuse gains and maintenance costs is just speculation. Reuse and maintainability are the side effects of good coding practices, but trying to build them in explicitly by starting with complexity has the same problems as any up-front design, namely that you are making the most important decisions about your system at the point when you know the least about it. The TDD process can help you here.


I’ve been finding myself doing similar things with Rails associations, pulling methods related to the list of associated objects into a HasThings style module, then refactoring to a ThingCollection class.

You need to be vigilant to abstractions showing up in your code. Passing arguments, especially if you are passing the same argument sets to multiple methods, often means there’s a class waiting to be born. Using a lot of If logic or case logic often means there’s a set of objects that have polymorphic behavior — especially if you are using the same logical test multiple times. Passing around nil often means you are doing something sub-optimally.

Another semi-practical Rails example: I have no problem with an ActiveRecord model having class methods that create new objects of that model as long as the methods are simple. As soon as the methods get complex, I’ve been pulling them into a factory class, where they become instance methods. (I always have the factory be a class that is instantiated rather than having it be a set of class methods or a singleton — I find the code breaks much more cleanly as regular instance methods.) At that point, you can usually break the complicated factory method into a bunch of smaller methods with semantically meaningful names. These classes wind up being very similar to a DCI context class.

Which reminds me — if you are wondering whether the Extract Method refactoring is needed in a particular case, the answer is yes. Move the code to a method with a semantically meaningful name. Somebody will be thankful for it, probably you in a month.

Some of this is genuinely subjective — I never in a million years would have generated this solution — I’d be more likely to have a Null Object for Post if this started to bother me, because event systems don’t seem like simplifications to me.

I do worry how this kind of aggressive refactoring style, or any kind of structured style, plays out in a large team or even just a team with varying degrees of skill, or even just a team where people have different styles. It’s hard to aggressively refactor when three-dozen coders are dependent on something (though, granted, if you’ve isolated well you have a better shot). And it’s hard to overstate the damage that one team member who isn’t down with the program can do to your exquisite object model. I don’t have an answer to this, and I think it’s a really complicated problem.

You don’t know the future. Speculation about reuse gains and maintenance costs are just speculation. Reuse and maintenance are the side effect of good coding practices, but trying to build them in explicitly by starting with complexity is has the same problems as any up-front design, namely that you are making the most important decisions about your system at the point when you know the least about the system. The TDD process can help you here.

Rails, Objects, Tests, and Other Useful Things

For the first time in quite a while, I’ve been able to spend time working on a brand-new Rails application that’s actually a business thing and not a side project. It’s small. Okay, it’s really small. But at least for the moment it’s mine, mine, mine. (What was that about collective code ownership? I can’t hear you…)

This seemed like a good time to reflect on the various Object-Oriented Rails discussions, including Avdi’s book, DCI in Rails, fast Rails tests, Objectify, DHH’s skepticism about the whole enterprise, and even my little contribution to the debate. And while we’re at it, we can throw in things like Steve Klabnik’s post on test design and Factory Girl.

I’m not sure I have any wildly bold conclusion to make here, but a few things struck me as I went through my newest coding enterprise with all this stuff rattling around in my head.

A little background — I’ve actually done quite a bit of more formal Object-Oriented stuff, though it’s more academic than corporate enterprise. My grad research involved teaching object-oriented design, so I was pretty heavily immersed in the OO documents circa the mid-to-late 90s. So it’s not like I woke up last May and suddenly realized that objects existed. That said, I’m as guilty as any Rails programmer of taking advantage of the framework’s ability to write big balls of mud.

Much of this discussion is effectively about how to manage complexity in an application. The thing about complexity is that while you can always add complexity to your system, you can’t always remove it. At some point, your code has to do what it has to do, and that puts a floor on how complex your system is. You can move the complexity around, and you can arguably make it easier to deal with. But… to some extent “easier to deal with” is subjective, and all these techniques have trade-offs. Smaller classes mean more classes; adding structure to make dependencies flexible often increases immediate cost. Adding abstraction simplifies individual parts of the system at the cost of making it harder to reason about the system as a whole. There are some sweet spots, I think, but a lot of this is a question of picking the Kool-Aid flavor you like best.

Personally, I like to start with simple and evolve to complex. That means small methods, small classes, and limited interaction between classes. In other words, I’m willing to accept a little bit of structural overhead in order to keep each individual piece of the code simple. Then the idea is to refactor aggressively, making techniques like DCI more something I use as a tool when I see complexity than a place I start from. Premature abstraction is in the same realm as premature optimization. (In particular, I find a lot of forms of Dependency Injection really don’t fit in my head; it takes a lot for me to feel like that flavor of flexible dependency is the solution to my problem.)

I can never remember where I saw this, but there was an early XP maxim that you should keep simple the 90% of your system that is simple, so that you can bring maximum resources to bear on the 10% that is really hard.

To make this style work, you need good tests and you need fast tests — TDD is a critical part of building code this way. You need to be confident that you can refactor, and you need to be able to refactor in small steps and rerun tests. That’s why, while I think I get what Gregory Moeck is saying here, I can’t agree with his conclusion. I think “more testable” is just as valid an engineering goal as “fast” or “uses minimal memory”. I think if your abstraction doesn’t allow you to test, then you have the wrong abstraction. (Though I still think the example he uses is overbuilt…)

Fast tests are most valuable as a means to an end, the end being understandable and easily changeable code. Fast tests help you get there because you can run them more often; ideally you can run them fast enough that you don’t break focus going back and forth between tests and code, so the transition is seamless. Also, an inability to write fast tests easily often means there’s a flaw in your design: specifically, too much interaction between the parts of your program, such that it’s impossible to test a single part in isolation.

One of the reasons that TDD works is that the tests become kind of a universal client of your code, forcing your code to have a lot of surface area, so to speak, and not a lot of hidden depth or interactions. Again, this is valuable because code without hidden depth is easier to understand and easier to change. If writing tests becomes hard or slow, the tests are trying to tell you that your code is building up interior space where logic is hiding — you need to break the code apart to expose the logic to a unit test.
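To make that concrete, here is the kind of fast, isolated test I mean — the `Discount` class is a made-up example, not from any real app, but the shape is the point: no framework, no database, nothing hidden.

```ruby
require "minitest/autorun"

# A small, isolated piece of business logic -- plain Ruby, no dependencies.
class Discount
  def initialize(percent)
    @percent = percent
  end

  def apply(price)
    price - (price * @percent / 100.0)
  end
end

# Because Discount has no hidden interactions, this test runs in milliseconds
# and can be rerun after every tiny refactoring step.
class DiscountTest < Minitest::Test
  def test_applies_percentage
    assert_equal 90.0, Discount.new(10).apply(100)
  end
end
```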

The metric that matters here is how easily you can change your code. A quick guide to this is what kinds of bugs you get. A well-built system won’t necessarily have fewer bugs, but it will have shallower bugs that take less time to fix.

Isolation helps, and the Single Responsibility Principle helps. Both are good rules of thumb for keeping the simple parts of your code simple. But it also helps to understand that “single responsibility” is a matter of perspective. (I like the guideline in GOOS that you should be able to describe what a class does without using “and” or “or”.)

Another good rule of thumb is that objects that are always used together should be split out into their own abstraction. Or, from the other direction, data that changes on different time scales should be in different abstractions.

In Rails, remember that “models” is not the same as “ActiveRecord models”. Business logic that does not depend on persistence is best kept in classes that aren’t also managing persistence. Fast tests are one side effect here, but keeping classes focused has other benefits in terms of making the code easier to understand and easier to change.

Actual minor Rails example — pulling logic related to start and end dates into a DateRange class. (Actually, in building this, I started with the code in the actual model, then refactored to a HasDateRange service module that was mixed in to the ActiveRecord model, then refactored to a DateRange class when it became clear that a single model might need multiple date ranges. The DateRange class can be reused, and that’s great, but the reuse is a side effect of the isolation. The main effect is that it’s easier to understand where the date range logic is.)
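A sketch of what that extraction might look like — the method names here are my reconstruction, not the actual code from the app:

```ruby
require "date"

# Plain Ruby object holding the date-range logic that used to live
# inside the ActiveRecord model. No persistence concerns in sight.
class DateRange
  attr_reader :start_date, :end_date

  def initialize(start_date, end_date)
    @start_date = start_date
    @end_date = end_date
  end

  def include?(date)
    (start_date..end_date).cover?(date)
  end

  def days
    (end_date - start_date).to_i + 1
  end
end

# An ActiveRecord model can now hold one -- or several -- of these:
#
#   def booking_period
#     DateRange.new(starts_on, ends_on)
#   end
```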

I’ve been finding myself doing similar things with Rails associations, pulling methods related to the list of associated objects into a HasThings style module, then refactoring to a ThingCollection class.
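The same move, sketched for an association — all names here are hypothetical, and the tags are plain hashes to keep the example framework-free:

```ruby
# Wraps the list of associated objects so that collection logic has a
# home of its own instead of accumulating on the parent model.
class TagCollection
  def initialize(tags)
    @tags = tags
  end

  # Sorted list of tag names.
  def names
    @tags.map { |t| t[:name] }.sort
  end

  def any_named?(name)
    @tags.any? { |t| t[:name] == name }
  end
end
```

In a real model this would be built from the association, something like `TagCollection.new(post.tags)`, keeping the model itself focused on persistence.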

You need to be vigilant for abstractions trying to show up in your code. Passing arguments, especially the same argument sets to multiple methods, often means there’s a class waiting to be born. A lot of if logic or case logic often means there’s a set of objects with polymorphic behavior, especially if you are using the same logical test multiple times. Passing around nil often means you are doing something sub-optimally.
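For instance, repeated case logic like this is usually a set of polymorphic objects waiting to happen — a made-up shipping example, not from the post:

```ruby
# Before: the same case statement tends to get repeated everywhere
# shipping behavior is needed.
#
#   case order_type
#   when :standard then 5.0
#   when :express  then 15.0
#   end
#
# After: each variant carries its own behavior.
class StandardShipping
  def cost; 5.0; end
  def days; 7;   end
end

class ExpressShipping
  def cost; 15.0; end
  def days; 2;    end
end

# Callers just ask the object -- the case statement disappears.
def shipping_summary(shipping)
  "$#{shipping.cost} in #{shipping.days} days"
end
```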

Another semi-practical Rails example: I have no problem with an ActiveRecord model having class methods that create new objects of that model as long as the methods are simple. As soon as the methods get complex, I’ve been pulling them into a factory class, where they become instance methods. (I always have the factory be a class that is instantiated rather than having it be a set of class methods or a singleton — I find the code breaks much more cleanly as regular instance methods.) At that point, you can usually break the complicated factory method into a bunch of smaller methods with semantically meaningful names. These classes wind up being very similar to a DCI context class.
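A sketch of that factory shape — instantiated, with the big creation method decomposed into small named steps. All the names here are hypothetical, and a Struct stands in for the ActiveRecord model:

```ruby
require "date"

# Stands in for an ActiveRecord model; here it just collects attributes.
Subscription = Struct.new(:plan, :price, :trial_ends_on)

# Instantiated factory: the complex class method moves off the model and
# breaks apart into small, semantically named instance methods.
class SubscriptionFactory
  def initialize(plan)
    @plan = plan
  end

  def build
    Subscription.new(@plan, price_for_plan, trial_end)
  end

  private

  def price_for_plan
    { "basic" => 9, "crane" => 50 }.fetch(@plan, 0)
  end

  def trial_end
    Date.today + 30
  end
end
```

Usage stays a one-liner at the call site: `SubscriptionFactory.new("basic").build`.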

Which reminds me — if you are wondering whether the Extract Method refactoring is needed in a particular case, the answer is yes. Move the code to a method with a semantically meaningful name. Somebody will be thankful for it, probably you in a month.
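In that spirit, even a two-line condition earns a name — a made-up example:

```ruby
# Before: an anonymous conditional the reader has to decode in place.
#
#   if order[:total] > 100 && order[:coupon].nil?
#
# After: extracted to a method whose name says what it means.
def qualifies_for_free_shipping?(order)
  order[:total] > 100 && order[:coupon].nil?
end
```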

Some of this is genuinely subjective — I never in a million years would have generated this solution — I’d be more likely to have a Null Object for Post if this started to bother me, because event systems don’t seem like simplifications to me.

I do worry how this kind of aggressive refactoring style, or any kind of structured style, plays out in a large team or even just a team with varying degrees of skill, or even just a team where people have different styles. It’s hard to aggressively refactor when three-dozen coders are dependent on something (though, granted, if you’ve isolated well you have a better shot). And it’s hard to overstate the damage that one team member who isn’t down with the program can do to your exquisite object model. I don’t have an answer to this, and I think it’s a really complicated problem.

You don’t know the future. Speculation about reuse gains and maintenance costs is just speculation. Reuse and maintenance are side effects of good coding practices, but trying to build them in explicitly by starting with complexity has the same problems as any up-front design, namely that you are making the most important decisions about your system at the point when you know the least about it. The TDD process can help you here.

Ruby Programming 35th Batch: Registrations now open

Registrations are now open for RubyLearning’s popular Ruby programming course. This is an intensive, online course for beginners that helps you get started with Ruby programming.

Here is what Demetris Demetriou, a participant who just graduated, has to say – “When I joined this course I was sceptical about how useful this course would be for me instead of reading material and watching videos on YouTube and thus saving money. After the course started I realised how valuable this course was. In the past I had read many Ruby books over and over, but never got into really getting practical with it and never had confidence in it. Lots of theory but couldn’t use it. I feel that the exercises in this course and the support, monitoring from our mentor Victor, made the huge difference that all books in the past didn’t. It wasn’t about reading lots of books, but simply few things and get practical and understand them well. I feel I learnt a lot and I’m coming back for more to rubylearning.org Thanks a lot Victor and Satish and all the other Rubyists who gave us today’s Ruby.”

What’s Ruby?

Ruby

According to http://www.ruby-lang.org/en/ – “Ruby is a dynamic, open source programming language with a focus on simplicity and productivity. Ruby’s elegant syntax is natural to read and easy to write.”

Yukihiro Matsumoto, the creator of Ruby, in an interview says –

I believe people want to express themselves when they program. They don’t want to fight with the language. Programming languages must feel natural to programmers. I tried to make people enjoy programming and concentrate on the fun and creative part of programming when they use Ruby.

What Will I Learn?

In the Ruby programming course, you will learn the essential features of Ruby that you will end up using every day. You will also be introduced to Git, GitHub, HTTP concepts, RubyGems, Rack and Heroku.

Depending on participation levels, we throw a Ruby coding challenge in the mix, right for the level we are at. We have been known to give out a prize or two for the ‘best’ solution.

Who’s It For?

A beginner with some knowledge of programming.

You can read what past participants have to say about the course.

Mentors

Satish Talim, Michael Kohl, Satoshi Asakawa, Victor Goff III and others from the RubyLearning team.

Dates

The course starts on Saturday, 18th Aug. 2012 and runs for seven weeks.

RubyLearning’s IRC Channel

Most of the mentors and students hang out in RubyLearning’s IRC channel (#rubylearning.org on irc.freenode.net) for both technical and non-technical discussions. Everyone benefits from the active discussions on Ruby with the mentors.

How do I register and pay the course fees?

  • The course is based on The Ultimate Guide to Ruby Programming eBook. This book is normally priced at US$ 19.95, and we are discounting it by US$ 10.00 when it is combined in the Course+eBook option below.
  • You can pay by PayPal, send cash via Western Union Money Transfer, or use a bank transfer (if you are in India). The fees collected help RubyLearning maintain the site, this Ruby course, and the Ruby eBook, and provide quality content to you.
  • Once you pay the fees below, create an account at RubyLearning.org and send your name and the email id you registered with to satish [at] rubylearning [dot] com. We will enroll you into the course within 48 hours.
  • If you have purchased the eBook at the time of registration, we will personally email you the eBook within 48 hours.

You can pay the Course Fees by selecting one of the three options from the drop-down menu below. Please select your option and then click on the “Add to Cart” button.

Register

At the end of this course you should have all the knowledge to explore the wonderful world of Ruby on your own.

Here are some details on how the course works:

Important:

Once the course starts, you can login and start with the lessons any day and time and post your queries in the forum under the relevant lessons. Someone shall always be there to answer them. Just to set the expectations correctly, there is no real-time ‘webcasting’.

Methodology:

  • The Mentors shall give you URLs of pages and sometimes some extra notes that you need to read through. Read the pre-class reading material at a convenient time of your choice – the dates mentioned are just for your guidance. While reading, please make a note of all your doubts, queries, questions, clarifications, and comments about the lesson, and after you have completed all the pages, post these on the forum under the relevant lesson. There could be some questions that relate to something that has not been mentioned or discussed by the mentors thus far; you could post those too. With every post, please mention the operating system of your computer.
  • The mentor shall highlight the important points that you need to remember for that day’s session.
  • There could be exercises every day. Please do them.
  • Participate in the forum for asking and answering questions or starting discussions. Share knowledge, and exchange ideas among yourselves during the course period. Participants are strongly encouraged to post technical questions, interesting articles, tools, sample programs or anything that is relevant to the class / lesson. Please do not post a simple "Thank you" note or "Hello" message to the forum. Please be aware that these messages are considered noise by people subscribed to the forum.

Outline of Work Expectations:

  1. Most of the days, you will have exercises to solve. These are there to help you assimilate whatever you have learned till then.
  2. Some days may have some extra assignments / food-for-thought articles / programs.
  3. Above all, do take part in the relevant forums. Past participants will confirm that they learned the best by active participation.

Some Commonly Asked Questions

  • Qs. Is there any specific time when I need to be online?
    Ans. No. You need not be online at a specific time of the day.
  • Qs. Is it important for me to take part in the course forums?
    Ans. YES. You must participate in the forum(s), asking and answering questions or starting discussions. Share knowledge, and exchange ideas among yourselves (participants) during the course period. Participants are strongly encouraged to post technical questions, interesting articles, tools, sample programs or anything that is relevant to the class / lesson. Past participants will confirm that they learned best by active participation.
  • Qs. How much time do I need to spend online for a course, in a day?
    Ans. This will vary from person to person. All depends upon your comfort level and the amount of time you want to spend on a particular lesson or task.
  • Qs. Is there any specific set time for feedback (e.g., any mentor responds to me within 24 hours?)
    Ans. Normally somebody should answer your query / question within 24 hours.
  • Qs. What happens if nobody answers my questions / queries?
    Ans. Normally, that will not happen. In case you feel that your question / query is not answered, then please post the same in the thread – “Any UnAnswered Questions / Queries”.
  • Qs. What happens to the class (or forums) after a course is over? Can you keep it open for a few more days so that students can complete and discuss too?
    Ans. The course and its forum is open for a month after the last day of the course.

Remember, the idea is to have fun learning Ruby.


Ruby 1.8.7-p370

Peter and Jason talk about new Ruby 1.8.7-p370, the new RSpec expectation syntax, and more.

Deploying with JRuby: Deliver Scalable Web Apps using the JVM

Deploying with JRuby: Deliver Scalable Web Apps using the JVM now in print

GoogleAnalyticsProxy – now minified

It’s been several years since I released GoogleAnalyticsProxy, which allows our team to test their GA event/click/view tracking during the development phases of our project. Today, I pushed a quick update to it with a minified version of the JavaScript so that there is a smaller footprint.

For more information on how we use it, read my older post, Tracking Google Analytics events in development environment with GoogleAnalyticsProxy: http://www.robbyonrails.com/articles/2009/11/01/googleanalyticsproxy-for-development-environment-tracking-events-in-google-analytics