Git Push Heroku Master: Now 40% Faster

Flow is an important part of software development. The ability to achieve flow during daily work makes software development a uniquely enjoyable profession. Interruptions in your code/test loop make this state harder to achieve. Whether you are running unit tests locally, launching a local webserver, or deploying to Heroku, there's always some waiting and some interruption. Every second saved helps you stay in your flow.

We’ve been working on reducing the time it takes to build your code on Heroku. Read through this post for details on the process we used to make builds fast, or check out the end result in the graph below:

[Graph: Heroku build speed improvements]

Let's take a closer look at the process we used to deliver these improvements.

It all starts with instrumentation

Every speed improvement effort starts with visibility. We collected detailed metrics about every part of our build infrastructure and in each of our buildpacks. These metrics allowed us to see where time was being spent. We were also able to measure how each update we made impacted the build phase it was meant to improve.

Improvements to our build infrastructure

With detailed timing data in hand, we were able to make changes to how our build fleet and git service work. These included additional caching, improved mechanisms for file transfer and storage, and better user feedback to let you know what is happening during your builds.

One crucial step in the build process that had room for improvement was the creation of the slug archive. We found this was taking an appreciable amount of time and were able to improve it by using pigz, a parallelized implementation of gzip.
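For illustration, swapping pigz in for gzip is just a matter of changing the compression stage of the pipeline; the paths and file names here are made up for the example:

# gzip compresses the archive on a single core
tar cf - ./app | gzip > slug.tgz

# pigz produces the same gzip format, but spreads the work across all cores
tar cf - ./app | pigz > slug.tgz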

Improvements to the efficiency of individual buildpacks

In addition to the changes to the build process, we had an opportunity for each of our language specialists to dig in and optimize their specific buildpacks. This included a range of improvements, from being more efficient about how dependencies are downloaded, to making better use of the build cache, to pre-fetching common dependencies.

This process involved many changes across all buildpacks. Some of the most significant are highlighted in the Heroku changelog.

The results

We've achieved improvements in build time across all languages on Heroku. Although we looked at much more detailed metrics while changes were being made, our target was improving build times at all percentiles on a per-language basis. We’ve shown you the 50th percentile, or median, because it proved to be a good proxy for the other metrics in this case.

Being able to deploy code and iterate on ideas quickly is a big part of developer happiness. We will continue to use the metrics we’ve created to look for more ways to improve build and deployment speed. As always, we welcome your feedback on how we can improve your experience and help you maintain your flow.

The Balance of Convention and Innovation

In which I post an enjoyable quote and some thoughts on standards versus innovation.

Adam Wiggins just posted a few paragraphs by Tim Lind from his article titled Innovation in Database Technology. Rarely do I just post direct quotes here, but I really thought the paragraphs were insightful, so I’ll share them for those that haven’t had the pleasure yet.

That, is why we have really moved away from sql, it is not for any specific approach to scalability or data storage, but rather just the ability to free ourselves from the standardized ideas encapsulated in the standard query language.

I’m going to say it again, moving away from sql allows us to innovate. I’m sure no one will have ill feeling towards the notion of innovation, and standardization is almost the exact opposite, it is the crystallization of previous innovation, so of course it would be what we stand against.

We do not necessarily stand against any specific idea encapsulated by the sql standardization, rather we just choose to open ourselves up to investigating the elements of the system for the sake of making design decisions which provide innovative solutions.

You can’t innovate in a box of standards, thus the “think outside the box” saying. For whatever reason, I had not thought about the step back from SQL from this perspective.

Several people have asked me what issues I ran into switching to MongoDB. The biggest issue I ran into was freeing my mind from the standards and conventions that swaddled me in bed at night. Conventions and standards are great, but they do seem to be at odds with innovation.

We definitely do not want to be in a constant state of innovation, as I am sure that would lead to chaos. On the other hand, if we always stick to conventions and standards, we will never push forward. There has to be balance. After years of the same with regards to databases, it is great to see the new ideas over the past year or so.

Thanks to Tim for the observations and Adam for putting them on my radar.

Getting A Grip on GridFS

In which I announce some tweaks I made to a fun plugin which makes storing files in GridFS as easy as 1, 2, 3.

I know it is almost Christmas and your minds are beginning to turn to gifts and family, but hang with me for a few more minutes. My buddy Jon pointed me to Grip yesterday. I liked the idea but not all of the implementation. I’ve been doing some GridFS stuff lately so I decided to take some time and tweak it to be more what I need.

You can get my fork of Grip on Github. Below is a simple example of how it works (you could also check out the tests).

class Foo
  include MongoMapper::Document
  include Grip

  has_grid_attachment :image
  has_grid_attachment :pdf
end

image = File.open('foo.png', 'r')
pdf = File.open('foo.pdf', 'r')
foo = Foo.create(:image => image, :pdf => pdf)

foo.image # contents read from gridfs for serving from rack/metal/controller
foo.image_name # foo.png
foo.image_content_type # image/png
foo.image_size # File.size of the file
foo.image_path # path in grid fs foo/image/:id where :id is foo.id

The main changes I made were to store name, size and content_type along with the path to the file. I also made it so those are assigned when the file is set so that they can be used in validations. Some other ideas I have for the plugin are adding image resizing with MojoMagick and common validations (much like most of the file upload plugins out there).
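To give a rough idea of how that works, the attachment setter grabs the metadata at assignment time, something along these lines (a simplified sketch of the approach, not the plugin's exact code):

def image=(file)
  self.image_name         = File.basename(file.path)
  self.image_size         = File.size(file.path)
  self.image_content_type = MIME::Types.type_for(file.path).to_s
  @image = file # the contents get written to GridFS in an after_save callback
end

Because the metadata keys are plain document keys, they are available for validations before the file ever touches GridFS.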

What I really like about Grip is its simplicity. A single, small file, an include, a call to has_grid_attachment, and you can start storing files. The great thing about working on this is it gave me some wonderful ideas for how to standardize the process of creating and declaring MongoMapper plugins.

I’ll be adding those ideas to MongoMapper over the next few weeks as I get time and I think people are going to be stoked about what I’ve come up with (thanks to Brandon for brainstorming the plugin API with me).

Behind the Scenes

As I said, behind the scenes, Grip uses GridFS (more on GridFS), Mongo’s specification for storing files in Mongo.

The good news is that the API for storing files in GridFS using Ruby is nearly identical to using Ruby’s File class. Unfortunately, that is also the bad news, in my opinion, as I find Ruby’s File open, read and close a bit awkward. Here are a few examples pulled almost directly from Grip:

# write foo.jpg to grid fs
file = File.read('foo.jpg')
GridFS::GridStore.open(database, 'foo.jpg', 'w', :content_type => 'image/jpg') do |f|
  f.write(file)
end

# read foo.jpg from grid fs
GridFS::GridStore.read(database, 'foo.jpg')

# delete foo.jpg from grid fs
GridFS::GridStore.unlink(database, 'foo.jpg')

Not horrible but not beautiful. There is a bunch of GridFS-related code on Github worth exploring.

That is all for now. Enjoy the GridFS goodness and the future hope of an awesome MongoMapper plugin interface.

Twizzle Your Deplizzles

In which I show how to easily update Twitter with deploy notices that include the environment and revision.

Steve and I have a Twitter account that we send all our commits to. It is all handled by Github and both of us find it really handy. Rain or shine, we get commit updates on our phone which is great for staying in the loop.

For a little while now, I’ve been wanting to add deploy notices to this twitter account and tonight I finally got around to it. It was pretty easy, but I’ll post the code here to save others time. I went the no dependencies route (even though I created the Twitter gem).

set :twitter_username, 'username'
set :twitter_password, 'password'

namespace :twitter do
  task :update do
    require 'uri'
    require 'net/http'

    url = URI.parse('http://twitter.com/statuses/update.xml')
    request = Net::HTTP::Post.new(url.path)
    request.basic_auth twitter_username, twitter_password
    request.set_form_data('status' => "Deployed #{current_revision[0..6]} to #{rails_env}")

    begin
      response = Net::HTTP.new(url.host, url.port).start do |http|
        http.open_timeout = 10
        http.request(request)
      end
      puts 'Deploy notice sent to twitter'
    rescue Timeout::Error => e
      puts "Timeout after 10s: Seems like Twitter is down." 
      puts "Use \"cap twitter:update\" to update Twitter status later w/o deploying" 
    end
  end
end

after "deploy", "twitter:update"

Just drop this in your deploy file and update the username and password. Notice that I even included the git revision and rails environment so you can easily see what was last deployed and where to. You end up with a nice little message like this:

Deployed 09ea23f to staging

Very handy!

Why I think Mongo is to Databases what Rails was to Frameworks

In which I divulge the awesomeness of MongoDB that we have used to build Harmony.

Strong statement, eh? The more I work with Mongo the more I am coming around to this way of thinking. I tell no lie when I say that I now approach Mongo with the same kind of excitement I first felt using Rails. For some, that may be enough, but for others, you probably require more than a feeling to check out a new technology.

Below are 7 Mongo and MongoMapper related features that I have found to be really awesome while working on switching Harmony, a new website management system by my company, Ordered List, to Mongo from MySQL.

Harmony

1. Migrations are Dead

Remember the first time you created and ran a migration in Rails? Can you? Think back to the exuberance of the moment when you realized tempting fate on a production server was a thing of the past. Well, I have news for you, Walter Cronkite: migrations are so last year.

Yep, you don’t migrate when you want to add or remove columns with Mongo. Heck, you don’t even add or remove columns. Need a new piece of data? Throw a new key into any model and you can start adding data to it. No need to bring your app to a screeching halt, migrate and then head back to normal land. Just add a key and start collecting data.
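For example, say you decide a hypothetical Post model needs a subtitle. The entire change is one line:

class Post
  include MongoMapper::Document

  key :title, String
  key :subtitle, String # the new key; no migration, just deploy and go
end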

2. Single Collection Inheritance Gone Wild

There are times when inheritance is sweet. Let’s take Harmony for example. Harmony is all about managing websites. Websites have content. Content does not equal pages. Most website management tools are called content management systems and all that means is that you get a title field and a content field. There, you can now manage content. Wrong!

Pages are made up of content. Each piece of content could be as tiny as a number or as large as a massive PDF. Also, different types of pages behave differently. Technically a blog and a page are both pages, but a page has children that are most likely ordered intentionally, whereas a blog has children that are ordered by publish date.

So how did Mongo help us with this? Well, we created a base Item model. Sites have many items. Items have many custom pieces of data. So, we have an Item model that acts as the base for our Page, Blog, Link, BlogPost and such models. Then each of those defines specific keys and behaviors that they do not have in common with the other items.

By using inheritance, they all share the same base keys, validations, callbacks and collection. Then for behaviors and keys that are shared by some, but not all, we are creating modules and including them. One such module is SortableItem. This gets included in Page, Blog and Link as those can all be sorted and have previous and next items. The SortableItem module defines a position key and keeps the position order in check when creating and destroying items that include it. Think of it as acts_as_list.

This has been so handy. Steve was building the doc site and said he wished he had a link type, something that shows up in the navigation, but cross links to another section or another site. I was like, so make it! Here it is in all its glory.

class Link < Item
  include SortableItem

  key :url, String, :required => true, :allow_blank => false

  def permalink
    Harmony.escape_url(title)
  end
end

Yep, barely any code. We inherit from item, include the sortable attributes, define a new key named url (where the link should go to) and make sure the permalink is always set to the title. Nothing to it. This kind of flexibility is huge when you get new feature ideas.

All these completely different documents are stored in the same collection and follow a lot of the same rules but none of them has any more data stored with it than is absolutely needed. No creating a column for any key that could be in any row. Just define the keys that go with specific document types. Sweet!

3. Array Keys

Harmony has sites and users. Users are unique across all of Harmony. One username and password and you can access any specific site or all sites of a particular account. Normally this would require a join table, maybe even some polymorphism. What we decided to do is very simple. Mongo natively understands arrays. Our site model has an array key named authorizations and our Account model has one named memberships. These two array keys store arrays of user ids. We could de-normalize even more and just have a sites array key on user, but we decided not to.

class Site
  include MongoMapper::Document
  key :authorizations, Array, :index => true

  # def add_user, remove_user, authorized?
end

class Account
  include MongoMapper::Document
  key :memberships, Array, :index => true

  # def add_user, remove_user, member?
end

What is cool about this is that it is still simple to get all the users for a given site.

class Site
  def users
    # memberships live on the site's account (assuming a site belongs to an account)
    user_ids = [authorizations, account.memberships].flatten.compact.uniq
    User.all(:id => user_ids, :order => 'last_name, first_name')
  end
end

The sweet thing about this is that not only does Mongo know how to store arrays in documents, but you can even index the values and perform efficient queries on arrays.
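For example, finding every site a given user is authorized for is a single query against the array key (a quick sketch):

# matches any document whose authorizations array contains the user's id
Site.all(:authorizations => user.id)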

Eventually, I want to roll array key stuff like this into MongoMapper supported associations, but I just haven’t had a chance to abstract them yet. Look for that on the horizon.

4. Hash Keys

As if array keys were not enough, hash keys are just as awesome. Harmony has a really intelligent activity stream. Let's face it, most activity streams out there suck. Take Github’s for example. I will pick on them because I know the guys and they are awesome. They are so successful, they can take it. 🙂

It may be handy that I can see every single user who follows or forks MongoMapper, but personally I would find it way more helpful if their activity stream just put in one entry that was more like this.

“14 users started watching MongoMapper today and another 3 forked it. Oh, and you had 400 pageviews.”

Am I right? Maybe I have too many projects, but their feed is overwhelming for me at times. What we did to remedy this in Harmony is make the activity stream intelligent. When actions happen, it checks if the same action has happened recently and just increments a count. What you end up with are things in the activity stream like:

“Mongo is to Databases what Rails was to Frameworks was updated 24 times today by John Nunemaker.”

On top of that, we use a hash key named source to store all of the attributes from the original object right in the activity stream collection. This means we do 0, yes 0, extra queries to show each activity. Our activity model looks something like this (obviously this is really pared down):

class Activity
  include MongoMapper::Document
  key :source, Hash
  key :action, String
  key :count, Integer, :default => 1
end

Then, we define an API in that model to normalize the different attributes that could be there. For example, here is the title method:

class Activity
  def title
    source['title'] || source['name'] || source['filename']
  end
end

In order to determine if a new action is already in the stream and just needs to be incremented, we can then use Mongo’s ability to query inside hashes with something like this:

Activity.first({
  'source._id'   => id, 
  :action        => 'updated', 
  :created_at.gt => Time.zone.now.beginning_of_day.utc
})

How fricken sweet is that? Major. Epic.

5. Embedding Custom Objects

What is that you say? Arrays and hashes just aren’t enough for you. Well go fly a kite…or just use an embedded object. When Harmony was powered by MySQL (back a few months ago), we had an Item model and an ItemAttributes (key, value, item_id) model.

Item was the base for all the different types of content and item attributes were how we allowed for custom data on items. This meant every time we queried for an item, we also had to get its attributes. This isn’t a big deal when you know all the items up front as you can eager load some stuff, but we don’t always know all the items that are going to be shown on a page until it happens.

That pretty much killed the effectiveness of eager loading. It also meant that we just always got all of an item’s attributes each time we got an item (and performed the queries to get the info), just in case one of those attributes was being used (title and path are on all items).

With Mongo, however, we just embed custom data right with the item. Anytime we get an item, all the custom data comes with it. This is great as there is never a time where we would get an attribute without the item it is related to. For example, here is part of an item document with some custom data in it:

{
  "_id"   =>..., 
  "_type" =>"Page", 
  "title" =>"Our Writing", 
  "path"  =>"/our-writing/", 
  "data"  =>[
    {"_id" =>..., "file_upload"=>false, "value"=>"", "key"=>"content"}, 
    {"_id" =>..., "file_upload"=>true, "value"=>"", "key"=>"pic"}
  ], 
}

Now anytime we get an item, we already have the data. No need to query for it. This alone will help performance so much in the future that it had the weight to convince us to switch to Mongo, despite being almost 90% done in MySQL.

The great part is embedded objects are just arrays of hashes in Mongo, but MongoMapper automatically turns them into pure ruby objects.

class Item
  include MongoMapper::Document

  many :data do
    def [](key)
      detect { |d| d.key == key.to_s }
    end
  end
end

class Datum
  include MongoMapper::EmbeddedDocument

  key :key, String
  key :value
end

Just like that, each piece of custom data gets embedded in the item on save and converted to a Datum object when fetched from the database. The association extension on data even allows for getting data by its key quite easily like so:

Item.first.data['foo'] # return datum instance if foo key present

6. Incrementing and Decrementing

A decision we made the moment we switched to Mongo was to take advantage of its awesome parts as much as we could. One way we do that is storing published post counts on year, month and day archive items and label items. Anytime a post is published, unpublished, etc. we use Mongo’s increment modifier to bump the count up or down. This means that there is no query at all needed to get the number of posts published in a given year, month or day or of a certain label if we already have that document.

We have several callbacks related to a post’s publish status that call methods that perform stuff like this under the hood:

# ids is array of item ids
conditions = {:_id => {'$in' => ids}}

# amount is either 1 or -1 for increment and decrement
increments = {'$inc' => {:post_count => amount}}

collection.update(conditions, increments, :multi => true)

For now, we drop down to the ruby driver (collection.update), but I have tickets (inc, the rest) to abstract this out of Harmony and into MongoMapper. Modifiers like this are super handy for us and will be even more handy when we roll out statistics in Harmony as we’ll use increments to keep track of pageviews and such.

7. Files, aka GridFS

Man, with all the awesome I’ve mentioned above, some of you may be tired, but I need you to hang with me for one more topic. Mongo actually has a really cool GridFS specification that is implemented for all the drivers to allow storing files right in the database. I remember when storing files in the database was a horrible idea, but with Mongo this is really neat.

We currently store all theme files and assets right in Mongo. This was handy when in development for passing data around and was nice for building everything in stage before our move to production. When we were ready to move to production, we literally just dumped stage and restored it on production. Just like that all data and files were up and running.

No need for S3 or separate file backup processes. Just store the files in Mongo and serve them right out of there. We then heavily use etags and HTTP caching and intend on doing more in the future to make sure that serving these files stays performant, but that is for another day. 🙂 As of now, it is plenty fast and sooooo convenient.
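To give a feel for it, a bare-bones version of serving a file out of GridFS with an ETag might look like the following. This is a rough sketch with a hypothetical controller; real code would cache the digest somewhere instead of reading the whole file on every request.

require 'digest/md5'

class ThemeFilesController < ApplicationController
  def show
    contents = GridFS::GridStore.read(MongoMapper.database, params[:path])

    # stale? sends a 304 Not Modified when the client's ETag still matches
    if stale?(:etag => Digest::MD5.hexdigest(contents))
      send_data contents, :disposition => 'inline'
    end
  end
end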

Conclusion

We have been amazed at how much code we cut out of Harmony with the switch from MySQL to Mongo. We’re also really excited about the features mentioned above and how they are going to help us grow our first product, Harmony. I can’t imagine building some of the flexibility we’ve built into Harmony or some of the ideas we have planned for the future with a relational database.

I am truly as excited about the future of Mongo as I once was (and still am) about the future of Rails.

Config So Simple Your Mama Could Use It

In which I clog a bit of code for simple application configuration.

Tonight, Kastner asked me if I had anything to do some simple configuration for something he was working on. I’ve got a simple module and yaml file that I’ve been using so I gist’d it. It then occurred to me that I might as well share it here too.

The Yaml

Below is an example of the yaml file. Basically, I set up some defaults and then customize each environment as needed.

DEFAULTS: &DEFAULTS
  email: no-reply@harmonyapp.com
  email_signature: |
    Regards,
    The Harmony Team

development:
  domain: harmonyapp.local
  <<: *DEFAULTS

test:
  domain: harmonyapp.com
  <<: *DEFAULTS

production:
  domain: harmonyapp.com
  <<: *DEFAULTS

The Module

The module can read and write to the config and even loads the Yaml file the first time you try to read a configuration key.

module Harmony
  # Allows accessing config variables from harmony.yml like so:
  # Harmony[:domain] => harmonyapp.com
  def self.[](key)
    unless @config
      raw_config = File.read(RAILS_ROOT + "/config/harmony.yml")
      @config = YAML.load(raw_config)[RAILS_ENV].symbolize_keys
    end
    @config[key]
  end

  def self.[]=(key, value)
    @config[key.to_sym] = value
  end
end

If I wanted to get the domain, I would do the following:

Harmony[:domain]

Nothing fancy, but it gets the job done. I just drop the yaml file in config/ and the module in lib/. Obviously, you would rename the module and yaml file to whatever constant you want, such as App or something related to your application’s name. I know there are gems and plugins to do configuration, but when something this simple gets the job done, I figure why bother.

What do you all use for app configuration? What do you like about what you use?

You’re An Idiot For Not Using Heroku

In which I discuss my first experience with Heroku and my second. And how awesome it is.

It is true. You are. Go try it now. That is an order. I can wait for you to come back and finish reading this post. I could end the post now, but I suppose I’ll go on and tell you a bit about my experience with Heroku yesterday.

Formerly a Toy in the Cloud

Wynn and I were talking yesterday about how, back in the day, Heroku seemed like a toy in the cloud. They had a rich code editor and you could magically create and deploy applications that sometimes worked. It was neat, but nothing you would use for anything serious.

A toy they are no more. So what is Heroku? According to their site, Heroku is “fast, frictionless, and maintenance free.” After giving it another look yesterday, I would have to agree.

Heroku

The App

I have a tiny note application that my wife and I use. I use it to mark things to read later and save plain text notes. She uses it to keep track of recipes, tagged with ingredients and whether or not she has made the recipe before. It is nothing fancy, but it serves a purpose for both of us.

Textual

The app formerly ran on Dreamhost (how to deploy rails on DH) and used MySQL. Since I decided not to attend the Notre Dame game yesterday, I had some free time, so I watched football on TV all day and worked on converting this project from MySQL to MongoDB (which is awesome).

Once I finished the conversion, which didn’t take long, I exported the MySQL database as XML using PHPMyAdmin (shudder) and then wrote an import rake task that recreated the data in MongoDB (which is awesome).

MongoHQ

I have had a MongoHQ invite for a while now, but hadn’t kicked the tires, so I decided now was as good a time as any. Then it occurred to me. Why use Dreamhost when Heroku has a free account and I’m already hosting my database in the sky? Why not go cloud to the max and see how things end up?

Heroku

I logged in with my old Heroku account and did some reading through their amazing docs.

  • I gem installed heroku.
  • heroku created my app using the command line tool.
  • git pushed to heroku remote.
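
In terms of actual commands, those three steps boil down to this:

gem install heroku
heroku create
git push heroku master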

Boom. In less than a minute my app was created and deployed on Heroku. Impressive. Now that isn’t where the story ended. Hosting on Heroku is a bit different.

Config Vars

The first thing I ran into was some config file issues. I found Heroku’s article on config vars and switched my app to work like that. git push and my app was deployed again.

Gems

Now I was missing gems. Back to the docs I went, this time to read about managing gems. I created my .gems manifest and git pushed again. Just like that my app was up and running in the sky.
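The manifest is just a file named .gems in the root of your app with one gem per line; the gems and versions below are only an example:

rails --version 2.3.4
mongo_mapper
mime-types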

Conclusion

I made a few more changes to my app over the next few hours and deployed after each one with a simple git push heroku master. Each time, I almost giggled as the normal git messages happened and then out of nowhere, Heroku stepped in and informed me that it was deploying my app and…wait for it…wait for it…that the deploy was finished.

Now that I’ve used it for a tiny app, I’m curious to see what it can do with something larger. I’ll definitely be using Heroku a lot in the future, that much I know for sure. Combined with a hosted MongoDB service, it is absolute glory. MongoDB having its GridFS file store means that not having write access to a file system on Heroku is no big deal. You don’t even have to set up S3.

I’ll leave you with my tweet from yesterday, summing up my experience.

Created and deployed a MongoDB backed Rails app to Heroku and MongoHQ today. I have witnessed the future.

Anyone else out there using Heroku? What kind of apps have you deployed on it? What have your experiences been? Curious to hear from others.

Know When to Fold ‘Em

In which I relinquish the day to day maintenance of a few of my projects.

I have a lot of projects. Each time I feel pain or inspiration, I’ll whip together a new library and release it as a gem. It is fun and I love it. It is even more fun when people come along and use those projects to do cool stuff. This in turn, inspires me to write more code and release more projects. It is a vicious cycle.

A while back, I caught myself making jokes about how I don’t even use my projects. I can barely remember the last time I actually used HTTParty, HappyMapper, or the Twitter gem. Not too long ago, I came across Dr. Nic’s Future Ruby talk on Living with 1000 open source projects.

[Embedded slides: Living with 1000 Open Source Projects]

In the presentation, he says that you should maintain the projects you use everyday and abandon the rest. Good advice. Over the past few months, I have been ceding maintenance and new features to other talented developers for several of my projects.

HTTParty

The first to go was HTTParty. I believe it was the Ruby Hoedown where I ran into Sandro. He mentioned some HTTParty bugs and I asked him if he was interested in taking over. He accepted and the last release (0.4.5) was all him.

HappyMapper

Brandon Keepers, a good friend of mine, has a client project that uses HappyMapper, so the fact that he actually uses it made him a logical choice to help with the maintenance of it. He did a bunch of namespace work for the 0.3 release and now has commit rights.

The Twitter Gem

The last gem that was beginning to feel like a burden was the Twitter gem. Wynn Netherland has built several apps that rely on the Twitter gem, so I gave him commit rights and he recently added lists to it.

Conclusion

I can’t say that I am abandoning these projects, as I am sure from time to time I’ll feel inspired and spend some time on them. I just know that I am no good for them if I am not using them. I can’t feel the pain or know what is needed if I am not using the code.

I’m posting about this for two reasons. First and foremost to give some credit to the people who are doing the work now. Second, just setting some expectations that I probably won’t be snappy in responses for these projects as I’m not actively working on them anymore.

More MongoMapper Awesomeness

In which I dish on the latest MongoMapper features like dirty attributes, time zone support, custom data types and dynamic finders.

September was a month of craziness and for the first month in quite a while I did not post here. I promise it hurt me as much as it hurt you. In an effort to get back in the rhythm, I am going to start with an easy article. MongoMapper has been getting a lot of love lately and I thought I would mention some of the awesomeness.

Dynamic Finders

Dynamic finders are so darn handy in ActiveRecord. How many times have you used User.find_by_email and the like? Thankfully David Cuadrado took a stab at it. I took what he started, tested it a bit harder and added it onto document associations as well. This means when you have a document with a many documents association, you can now use dynamic finders that are scoped to that association.

class User
  include MongoMapper::Document

  many :posts
end

class Post
  include MongoMapper::Document
  key :user_id, String
  key :title, String
end

user = User.create
user.posts.create(:title => 'Foo')

# would return post we just created
user.posts.find_by_title('Foo')

Document associations now also have all the normal Rails association methods such as build, create, find, etc.

Logging

The mongo ruby driver added logging support, so a few days ago I added some basic support for accessing and using that logger from within MongoMapper. When you pass a logger instance to the ruby driver, you can access that connection’s logger instance from MongoMapper.logger like so:

logger = Logger.new('test.log')
MongoMapper.connection = Mongo::Connection.new('127.0.0.1', 27017, :logger => logger)
MongoMapper.logger # would be equal to logger

Tailing the log would give you output like the following:

MONGODB db.$cmd.find({"count"=>"statuses", "query"=>{"project_id"=>"4aceaabed072c4745f0003ca"}, "fields"=>nil})
MONGODB db.$cmd.find({"count"=>"statuses", "query"=>{"project_id"=>"4aceaabed072c4745f0003ce"}, "fields"=>nil})

The nifty part about this is you can set up your Mongo::Connection to use Rails.logger and then all your mongo queries show up in your Rails logs if you have your log level set low enough. This has been very handy for me working on MongoMapper because I can see exactly what MM is sending to Mongo behind the scenes.

Because of this addition, I noticed that every find(:first) was using :order => ’$natural’ which doesn’t allow using indexes and leads to slow queries. I removed the default order so instead it is just a find with a limit of 1, which should help make a few parts perform better.

Dirty Attributes

ActiveRecord’s dirty attributes is such a cool feature that yesterday, I spent a few hours porting it to MongoMapper::Document. Now you can do things like:

class Foo
  include MongoMapper::Document
  key :phrase, String
end

foo = Foo.new
foo.changed? # false
foo.phrase_changed? # false

foo.phrase = 'Dirty!'

foo.changed? # true
foo.phrase_changed? # true
foo.phrase_change # [nil, 'Dirty!']

I’m sure there will be edge cases, but as we find them we can fortify the tests and go from there.

Custom Data Types

With the 0.4 release came the transition from typecasting to custom data types. Now, instead of natively defining typecasting for “allowed” data types, you can have any data type that you like. You just have to do the conversion to and from mongo yourself. Making your own data types is as simple as:

class Foo
  def self.to_mongo(value)
    # convert value to a mongo safe data type
  end

  def self.from_mongo(value)
    # convert value from a mongo safe data type to your custom data type
  end
end

class Thing
  include MongoMapper::Document
  key :name, Foo
end

This means each time the name of Thing is saved to mongo or pulled out of mongo, it will be run through Foo.to_mongo and Foo.from_mongo to make sure it is exactly what you want it to be.

Out of the box, MongoMapper supports Array, Binary, Boolean, Date, Float, Hash, Integer, String, and Time. You can check out the support file and tests to see how this works.
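As a concrete example, here is a hypothetical DowncasedString type that normalizes email addresses on their way in and out of the database:

class DowncasedString
  def self.to_mongo(value)
    value.nil? ? nil : value.to_s.downcase
  end

  def self.from_mongo(value)
    value.nil? ? nil : value.to_s.downcase
  end
end

class User
  include MongoMapper::Document
  key :email, DowncasedString
end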

Time Zones

One note on times, since I mentioned them above: all times are now stored in the database as UTC. Also, if you have Time.zone set, all times are converted to the current time zone going to and from the database. This actually turned out to be really easy. We’ll see if I did it all correctly once people start pounding on it, I guess. 🙂

Lazy Loading

One thing that I’ve been working on in between other features is making MongoMapper more lazy. I have already made connection, database and collection lazy, so MM doesn’t actually create the connection or connect to the database until needed, which makes MM work a lot better with Rails.

I still need to make indexes lazy, so that is the next thing to tackle. I’m thinking once that is in, I’ll have something like MongoMapper.ensure_indexes!, similar to DataMapper.auto_migrate!, which actually ensures the indexes exist rather than doing that the second a class loads.

Internal Improvements

Along with all the public features, I have been working on the internals of MM whenever I get a chance. They still need cleaning up, but things are getting better. Along with some refactoring, I did some work to speed the tests up.

The tests were starting to creep up to around 40 seconds which was driving me nuts. I did a bit of work and realized that clearing every collection before every test was causing most of the slowdown so I pruned the functional tests to only clear the collections that were actually used in that test. This cut the time from around 40 seconds to 10. Yep, huge!

Conclusion

There are still rough parts and I would not recommend MongoMapper for beginners, but if you can troubleshoot not only your own code but others’ as well, MM is in a good place for you. Up until now, I’ve been working on adding features that I needed, similar to ActiveRecord, but I am almost to a place where I am going to start adding features to MM that can literally only exist because of MongoDB.

The next month is going to see some really cool things like upserts, modifiers ($set, $inc, $dec, $push, $pull, etc.) and the like make their way into MM. I also have some plans for an identity map implementation. Oooohs and aaaaaahs abound!

Lookin’ on Up…To the East Side

In which I provide an enormous amount of examples to explain Ruby’s method lookup path.

I am currently reading the Well-Grounded Rubyist by David Black. It is a great book and reading it reminds me of things I was confused on when I started in Ruby. One of those things was the path Ruby uses to figure out which method to call when inheritance and mixins are in play.

As I read it last night, I thought I should post about it, so here it goes. Let’s start with a simple class.

class A
  def foo
    puts 'foo in A'
  end
end

A.new.foo 
# foo in A

Inheritance

That was pretty straightforward. Next up, let’s look at inheritance.

class A
  def foo
    puts 'foo in A'
  end
end

class B < A
end

B.new.foo
# foo in A

Again, straightforward. And if I define foo in B, it calls foo in B, as that is first in the lookup path.

class A
  def foo
    puts 'foo in A'
  end
end

class B < A
  def foo
    puts 'foo in B'
  end
end

B.new.foo
# foo in B

What if I wanted to call both foo in B and foo in A? That is where super comes in. It allows you to go up the chain and call methods.

class A
  def foo
    puts 'foo in A'
  end
end

class B < A
  def foo
    super
    puts 'foo in B'
  end
end

B.new.foo
# foo in A
# foo in B

Notice how when we call super, both foo in A and foo in B show up in the output, as A’s foo was called and then B’s. You can call super at any point in the method. It doesn’t really matter.

Super

One note on super: if you call super without parentheses, it will call the next method up the chain with the same arguments that were passed in. If, however, you call super with parentheses, like super(), you have to pass in the arguments you would like to send. This will make more sense with a simple example.

class A
  def foo(message)
    puts 'foo in A'
    puts "#{message} in A" 
  end
end

class B < A
  def foo(message)
    super
    puts 'foo in B'
    puts "#{message} in B" 
  end
end

B.new.foo('heyyooo! ')
# foo in A
# heyyooo!  in A
# foo in B
# heyyooo!  in B

Note that foo has the same signature in A and B, so calling super automatically passed the message argument in B’s foo on to A. What if we wanted to take another argument in B?

class A
  def foo(message)
    puts 'foo in A'
    puts "#{message} in A" 
  end
end

class B < A
  def foo(message, bar)
    super
    puts 'foo in B'
    puts "#{message} in B" 
    puts bar
  end
end

B.new.foo('heyyooo! ', 'baz')
# ArgumentError: wrong number of arguments (2 for 1)

No dice! Remember what I mentioned about super with parentheses. Let’s put that in action.

class A
  def foo(message)
    puts 'foo in A'
    puts "#{message} in A" 
  end
end

class B < A
  def foo(message, bar)
    super(message)
    puts 'foo in B'
    puts "#{message} in B" 
    puts "#{bar} in B" 
  end
end

B.new.foo('heyyooo! ', 'baz')
# foo in A
# heyyooo!  in A
# foo in B
# heyyooo!  in B
# baz in B

Botta bing bang boom! That is more like what we want.

Mixins

Ok, now that we have a grasp on looking up methods in normal classes and classes that have a superclass, let’s throw mixins into the mix.

module Fooish
  def foo
    puts 'foo in Fooish'
  end
end

class A
  include Fooish
end

A.new.foo
# foo in Fooish

So, the foo method was not defined in A but was defined in Fooish, and it worked just as we expected. What if we define the method both in A and Fooish? Let’s give it a try.

module Fooish
  def foo
    puts 'foo in Fooish'
  end
end

class A
  include Fooish

  def foo
    puts 'foo in A'
  end
end

A.new.foo
# foo in A

Groovy. That is pretty straightforward as well. Now, remember super? Yeah, super is our friend. Let’s say you want to call the method in A and then in Fooish.

module Fooish
  def foo
    puts 'foo in Fooish'
  end
end

class A
  include Fooish

  def foo
    super
    puts 'foo in A'
  end
end

A.new.foo
# foo in Fooish
# foo in A

Ding! Pretty cool, right? So mixins are just that, mixins. They get mixed in between your class and its superclass. Speaking of superclass, let’s check out inheritance AND mixins.

module Fooish
  def foo
    puts 'foo in Fooish'
  end
end

class A
  include Fooish
end

class B < A
end

B.new.foo
# foo in Fooish

B gets foo from Fooish, which is mixed into A, B’s superclass. Cool. Let’s make it a bit crazier.

module Fooish
  def foo
    puts 'foo in Fooish'
  end
end

class A
  include Fooish
end

class B < A
  def foo
    super
    puts 'foo in B'
  end
end

B.new.foo
# foo in Fooish
# foo in B

Again, things are working like we would expect based on what we have learned above. Let’s put some methods all over the place so we can see the order of what is happening.

module Fooish
  def foo
    puts 'foo in Fooish'
  end
end

class A
  include Fooish

  def foo
    super
    puts 'foo in A'
  end
end

class B < A
  def foo
    super
    puts 'foo in B'
  end
end

B.new.foo
# foo in Fooish
# foo in A
# foo in B

Again, this was a bit more complex, but we get the order we would expect. What if we have 2 mixins?

module Fooish
  def foo
    puts 'foo in Fooish'
  end
end

module Barish
  def foo
    puts 'foo in Barish'
  end
end

class A
  include Fooish
  include Barish
end

A.new.foo
# foo in Barish

Ok, so we got foo in Barish, which means the lookup is in reverse order of how the modules were included. Maybe the more straightforward way of saying that is that the last module included is going to ding first. The kind of interesting thing is that you can even use super in your modules.

module Fooish
  def foo
    puts 'foo in Fooish'
  end
end

module Barish
  def foo
    super
    puts 'foo in Barish'
  end
end

class A
  include Fooish
  include Barish
end

A.new.foo
# foo in Fooish
# foo in Barish

Pretty cool. I would not really recommend doing this, though: if Barish were the only module included and you called foo, you would get an error. Let’s do it just so we can see.

module Fooish
  def foo
    puts 'foo in Fooish'
  end
end

module Barish
  def foo
    super
    puts 'foo in Barish'
  end
end

class A
  include Barish
end

A.new.foo
# NoMethodError: super: no superclass method ‘foo’

Still, pretty neat how it works. What if we mix in the same module twice?

module Fooish
  def foo
    puts 'foo in Fooish'
  end
end

module Barish
  def foo
    puts 'foo in Barish'
  end
end

class A
  include Fooish
  include Barish
  include Fooish
end

A.new.foo
# foo in Barish

As you can see, it had no effect. If the module gets included again, it doesn’t change the original lookup order. Let’s review the method lookup path.

  1. the class
  2. modules mixed in, in reverse order
  3. superclass (inheritance)
  4. modules mixed in to superclass, in reverse order
  5. rinse and repeat all the way to Object in the Ruby 1.8 series and BasicObject in Ruby 1.9
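
You don’t have to take my word for any of this either. Ruby will show you the path itself via Module#ancestors, which lists the lookup order from first to last:

module Fooish; end
module Barish; end

class A
  include Fooish
end

class B < A
  include Barish
end

puts B.ancestors.inspect
# [B, Barish, A, Fooish, Object, Kernel]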

Conclusion

Ruby’s method lookup path is very straightforward, but confusing at first. Once you learn how it works, you can really take advantage of it to write cleaner, more structured code. Hope this was helpful. Go buy David’s great book for even more goodies like this.

MongoMapper Indy.rb Presentation

In which I post slides and audio from my last MongoDB presentation.

Last Wednesday I was invited to present on MongoDB and MongoMapper at the Indianapolis ruby group. I promised them I would post the slides so they could get to the links and such.

Slides

The slides are very similar to my Grand Rapids presentation on MongoDB, but the actual talk was different.

[Embedded slides: MongoDB at Indy.rb]

Audio

On top of the slides, there is also some rough audio, thanks to David Jones. You can listen to it here. I think in total it is around an hour. I’m too lazy to sync the slides and the audio, so you’ll have to do that yourself. 🙂

I made a joke that CouchDB sucks. It was a joke. I actually like Couch a lot, I just like Mongo better. 🙂

Also, there is an interview with Mike Dirolf on the strange loop conference blog about MongoDB with Python and Ruby that may be of interest to you.

That is all for now. Enjoy!

Patterns Are Not Scary: Method Missing Proxy

In which I show how to create a method missing proxy and provide some example uses in the wild.

Method missing proxy? Ooooh! Sounds scary, right? I got news for you, Walter Cronkite: it’s not. Let’s start with the definition of proxy, according to Wikipedia.

Definition

A proxy, in its most general form, is a class functioning as an interface to something else.

An interface to something else. That sounds easy enough. You might be thinking that you have never used a proxy, but if you are reading this blog, you are wrong. Chances are you have used Rails, and if you have used Rails, chances are you have used has_many or some other ActiveRecord association, all of which are implemented using proxies under the hood.

Creating Your Own

Now that we have the definition out of the way and have confirmed your use of proxies, let’s make one! Yay! The people rejoice! The basic idea of a proxy is a class that is an interface to something else. Let’s call the something else the subject from now on. In order to get started, we’ll make a new proxy that has a subject.

class Proxy
  def initialize(subject)
    @subject = subject
  end
end

proxied_array = Proxy.new([1,2,3])
puts proxied_array.size
# NoMethodError: undefined method ‘size’

FAIL! Our proxy has a subject (the array), but does not proxy anything yet. In order to actually proxy calls to the subject, let’s throw in some method missing magic.

class Proxy
  def initialize(subject)
    @subject = subject
  end

  private
    def method_missing(method, *args, &block)
      @subject.send(method, *args, &block)
    end
end

proxied_array = Proxy.new([1,2,3])
puts proxied_array.size # 3

Method missing takes 3 arguments: the method called, the arguments passed to the method and a block if one is given. With just that tiny method missing addition, we can now do fun things like this:

proxied_array = Proxy.new([1,2,3])
puts proxied_array.size # 3
puts proxied_array[0] # 1
puts proxied_array[1] # 2
puts proxied_array[2] # 3
puts proxied_array.select { |a| a > 1 }.inspect # [2, 3]
proxied_array << 4
puts proxied_array.size # 4
puts proxied_array[3] # 4

Just like that our proxied array behaves just like the original array. Well, almost like the original array.

puts proxied_array.class # Proxy

BlankSlate and BasicObject

Hmm, that is not quite what you would expect. We told the proxy to send everything to the subject, so it should output Array, not Proxy as the class, right? The problem is that any new class automatically has some methods included with it. In order for our Proxy class to be a true proxy, we need to remove those methods as well. In the Ruby 1.8 series, this is often done by defining a BlankSlate object which removes those methods and then have our Proxy inherit from BlankSlate.

class BlankSlate #:nodoc:
  instance_methods.each { |m| undef_method m unless m =~ /^__|instance_eval|object_id/ }
end

class Proxy < BlankSlate
  def initialize(subject)
    @subject = subject
  end

  private
    def method_missing(method, *args, &block)
      @subject.send(method, *args, &block)
    end
end

proxied_array = Proxy.new([1,2,3])
puts proxied_array.class # Array

Yay! Now we in fact get Array as one would expect. The great news is that Ruby 1.9 comes with a class like this already named BasicObject. The easy way to make this work with Ruby 1.8 and Ruby 1.9 is to just define BasicObject if it does not exist and then inherit from BasicObject, instead of dealing with BlankSlate.

class BasicObject #:nodoc:
  instance_methods.each { |m| undef_method m unless m =~ /^__|instance_eval/ }
end unless defined?(BasicObject)

class Proxy < BasicObject
  def initialize(subject)
    @subject = subject
  end

  private
    def method_missing(method, *args, &block)
      @subject.send(method, *args, &block)
    end
end

proxied_array = Proxy.new([1,2,3])
puts proxied_array.class # Array

Just like that our proxy is a full fledged proxy and it works with Ruby 1.8 and 1.9.

Example: MongoMapper Pagination

So other than ActiveRecord, where else can you check out some proxies in the wild? In MongoMapper, pagination uses a method missing proxy. When someone uses paginate instead of find, I wanted the result that was returned to also function much like WillPaginate::Collection does, but I didn’t want to inherit from Array.

You can view the pagination proxy on github. The paginate method that uses it looks like this:

def paginate(options)        
  per_page      = options.delete(:per_page)
  page          = options.delete(:page)
  total_entries = count(options[:conditions] || {})

  collection = Pagination::PaginationProxy.new(total_entries, page, per_page)

  options[:limit] = collection.limit
  options[:offset]  = collection.offset

  collection.subject = find_every(options)
  collection
end

Just like that, paginate returns results just like find, but also includes methods for total_pages, previous and next pages, total_entries and the like.
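Usage then looks something like this (Post here is just an illustrative model):

posts = Post.paginate(:page => 2, :per_page => 25)

posts.total_entries # count across all pages
posts.total_pages   # number of pages
posts.each { |post| puts post.title } # proxies straight to the underlying array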

Example: HTTParty Response

In HTTParty, at first I just returned a ruby hash that was the parsed xml or json. Then, people started begging for response codes and headers, so I went with a Response proxy that looks like this:

module HTTParty
  class Response < BlankSlate #:nodoc:
    attr_accessor :body, :code, :message, :headers
    attr_reader :delegate

    def initialize(delegate, body, code, message, headers={})
      @delegate = delegate
      @body = body
      @code = code.to_i
      @message = message
      @headers = headers
    end

    def method_missing(name, *args, &block)
      @delegate.send(name, *args, &block)
    end
  end
end

Now I just pass the parsed response, the code, headers, and such to Response.new, and the people who want that information get it, while those who don’t have no API change to wrestle with.

Conclusion

Hope this little primer on the Proxy pattern, specifically using Ruby’s method missing, is helpful. I also hope that because of this you check out some of the other great patterns that are out there. I know I avoided them for far too long. When applied correctly, they really lead to elegant solutions.

Getting Started With MongoMapper and Rails

In which I show how to get up and running with MongoMapper and Rails in both text and video formats.

Warning: This is currently out of date as it was based on an older version of MongoMapper.

I have had a few requests for tips on getting started with MongoDB and Rails so I thought I would drop some quick knowledge. It is actually really easy to get going, but Rails always does everything for you so when something comes along that doesn’t, you sometimes feel lost.

rails mongomapper_demo
cd mongomapper_demo

Now that your Rails app is created, let’s add MongoMapper to the mix. Open up environment.rb. First, configure it as a gem, then remove ActiveRecord from the mix, and lastly, point it at a database.

Rails::Initializer.run do |config|
  config.gem 'mongomapper', :version => '>= 0.2.1'
  config.frameworks -= [:active_record]
end

MongoMapper.database = "myappname-#{Rails.env}"

As of now there are no fancy generators for your models (gasp!), so you can just create a new file in app/models and an accompanying file in test/unit. Or, if you like, you can use script/generate like so and just adjust your model after it is created (until I or some kind soul gets around to generators).

script/generate model Note --skip-migration

Then you can just change your app/models/note.rb file to be something like this:

class Note
  include MongoMapper::Document

  key :title, String
  key :body, String
end

I’d like to say there is more to it, but there isn’t. 🙂 Don’t worry about creating your database or migrating it. It all happens on the fly.

Video

For those that prefer visual learning, I’ve even whipped together a short screencast where I install MongoDB on OSX, fire it up, and build a really basic note app. Enjoy!

MongoMapper Demo

Uploadify and Rails 2.3

In which I show how to reach the promised land of multiple file uploads using Uploadify, a spot of rack middleware and Rails 2.3.

A few weeks back we (Steve and I) added multiple asset upload to Harmony using Uploadify. If you are thinking that sounds easy, you would be sorely mistaken. Uploadify uses flash to send the files to Rails. This isn’t a big deal except that we are using cookie sessions on Harmony and flash wasn’t sending the session information with the files, so to Rails the files appeared as unauthenticated.

We found multiple articles online showing how to get this working, but none of them worked as promised. At the time Harmony was running on Rails 2.2. Knowing that rack was probably the best way to solve our issue, we updated to 2.3, which was pretty painless, and started hacking. Be sure to check out a quick screencast of the finished product at some point as well.

Add Uploadify

First, we added the uploadify files and the following js to the assets/index view. We actually set many more options, but these are the ones pertinent to this article. script is the url to post the files to. fileDataName is the name of the file field you would like to use. scriptData is any additional data you would like to post to the url.

<%- session_key_name = ActionController::Base.session_options[:key] -%>
<script type="text/javascript">
  $('#upload_files').fileUpload({  
      script          : '/admin/assets',
      fileDataName    : 'asset[file]',
      scriptData      : {
        '<%= session_key_name %>' : '<%= u cookies[session_key_name] %>',
        'authenticity_token'  : '<%= u form_authenticity_token if protect_against_forgery? %>'
      }
  });
</script>

As you can see, it adds the session key and the cookie value along with the authenticity token as data that gets sent with the file. We then use a piece of rack middleware to intercept the upload and properly set the Rails session cookie.

Add Some Middleware

We created an app/middleware directory and added it to the load path in environment.rb.

%w(observers sweepers mailers middleware).each do |dir|
  config.load_paths << "#{RAILS_ROOT}/app/#{dir}" 
end

Next, we dropped flash_session_cookie_middleware.rb in the app/middleware directory.

require 'rack/utils'

class FlashSessionCookieMiddleware
  def initialize(app, session_key = '_session_id')
    @app = app
    @session_key = session_key
  end

  def call(env)
    if env['HTTP_USER_AGENT'] =~ /^(Adobe|Shockwave) Flash/
      params = ::Rack::Utils.parse_query(env['QUERY_STRING'])

      unless params[@session_key].nil?
        env['HTTP_COOKIE'] = "#{@session_key}=#{params[@session_key]}".freeze
      end
    end

    @app.call(env)
  end
end

And, finally, we added the following to our session_store.rb initializer.

ActionController::Dispatcher.middleware.insert_before(
  ActionController::Session::CookieStore, 
  FlashSessionCookieMiddleware, 
  ActionController::Base.session_options[:key]
)

This inserts our middleware before ActionController’s CookieStore so that everything will just work as expected.

Assign the Content Type

The only other thing we needed to do was manually set the content type of the file. We were using paperclip (which is awesome) to do uploads, so something like this did the trick:

@asset.file_content_type = MIME::Types.type_for(@asset.original_filename).to_s

Be sure to add the mime-types gem to your environment.rb file as well.

config.gem 'mime-types', :lib => 'mime/types'

But Why?

So why did we go through all this trouble to allow multiple uploads at once? Taking a quick look at the finished product might help. I didn’t record the entire screen in the video, as we haven’t actually released Harmony yet (ooooh secrets!), but I did capture enough that you can see the awesome uploads in action.

Harmony Multi-Uploading of Assets

Hope this spares some other poor soul attempting the same thing some time.

Code Review: Weary

In which I provide critique on Mark Wunsch’s new gem Weary.

Let me start with the fact that I’m not picking on Weary. Mark Wunsch, the author of Weary, emailed me just over a month ago and asked if I could take a look at the code and provide any tips or pointers. I had never performed a code review for someone I don’t know, but I thought, what the heck.

I spent about 30 minutes or so looking through his code and typing suggestions into an email. When I was done it was one of the longer emails I’ve written, but I sent it to Mark anyway. He liked the suggestions and has already implemented a few of them so I asked him if I could turn it into a post here. He obliged and you all shall now suffer through it. Muhahahahaha!

I’ll try to post snippets of the code or link to the file before each of my comments (which I’ll cut straight from the email I sent him). Please note that what I suggest are just that, suggestions. They aren’t rules by any means and I’ve been wrong once or twice in my life. Maybe. Let’s get started.

Don’t Repeat Yourself

Weary’s declare, post, put and delete methods are very similar. I’d maybe abstract them out into a builder method and make them one-line calls that just pass on name, verb and block (see the sketch after the code). Below are the methods; you can see the repetition pretty quickly. The only difference between them is the verb (:get, :post, :put, :delete).

module Weary
  def declare(name)
    resource = prepare_resource(name,:get)
    yield resource if block_given?
    form_resource(resource)
    return resource
  end
  alias get declare

  def post(name)
    resource = prepare_resource(name,:post)
    yield resource if block_given?
    form_resource(resource)
    return resource
  end

  def put(name)
    resource = prepare_resource(name,:put)
    yield resource if block_given?
    form_resource(resource)
    return resource
  end

  def delete(name)
    resource = prepare_resource(name,:delete)
    yield resource if block_given?
    form_resource(resource)
    return resource
  end
end
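
Something like this rough sketch is what I had in mind (untested, and note that block parameters in define_method want Ruby 1.9+):

module Weary
  # one pass over the verbs instead of four near-identical methods
  [:get, :post, :put, :delete].each do |verb|
    define_method(verb) do |name, &block|
      resource = prepare_resource(name, verb)
      block.call(resource) if block
      form_resource(resource)
      resource
    end
  end
  alias declare get
end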

Weary::Request#request repeats itself a bit too. Each option in the case statement is instantiating a class with a request uri. You could wrap the class lookup up in another method, like request_class or something, and then just do request_class.new(@uri.request_uri) in the actual request method. Not sure why I like this; it just makes methods smaller, and again, smaller methods are easier to test.

def request
  prepare = case @http_verb
    when :get
      Net::HTTP::Get.new(@uri.request_uri)
    when :post
      Net::HTTP::Post.new(@uri.request_uri)
    when :put
      Net::HTTP::Put.new(@uri.request_uri)
    when :delete
      Net::HTTP::Delete.new(@uri.request_uri)
    when :head
      Net::HTTP::Head.new(@uri.request_uri)
  end
  prepare.body = options[:body].is_a?(Hash) ? options[:body].to_params : options[:body] if options[:body]
  prepare.basic_auth(options[:basic_auth][:username], options[:basic_auth][:password]) if options[:basic_auth]
  if options[:headers]
    options[:headers].each_pair do |key, value|
      prepare[key] = value
    end
  end
  prepare
end
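
For what it is worth, here is a rough cut of that idea. The request_class name is mine, and the const_get trick assumes the verb symbols line up with the Net::HTTP class names (which they do for get, post, put, delete and head):

# maps :get to Net::HTTP::Get, :post to Net::HTTP::Post, and so on
def request_class
  Net::HTTP.const_get(@http_verb.to_s.capitalize)
end

def request
  prepare = request_class.new(@uri.request_uri)
  # ... body, basic auth and header handling stay exactly the same ...
  prepare
end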

Weary::Request#method= seems like it is doing a little bit too much work. Maybe I overlooked something but why not just do http_verb.to_s.strip.downcase.intern or something to get the verb? Also, Weary::Resource#via= seems to do the same thing. Maybe you need another class for this logic or a shared method somewhere? You could have something like this:

HTTPVerb.new(http_verb).normalize

HTTPVerb#normalize would then figure out which symbol to return and could be reused in the places you perform that normalization. Also, you can test it separately and then not worry about testing the different verb mutations in the method= tests.

Here are the two methods I was talking about from the Request and Resource classes.

# Request#method=
def method=(http_verb)
  @http_verb = case http_verb
    when *Methods[:get]
      :get
    when *Methods[:post]
      :post
    when *Methods[:put]
      :put
    when *Methods[:delete]
      :delete
    when *Methods[:head]
      :head
    else
      raise ArgumentError, "Only GET, POST, PUT, DELETE, and HEAD methods are supported" 
  end
end

# Resource#via=
def via=(http_verb)
  @via = case http_verb
    when *Methods[:get]
      :get
    when *Methods[:post]
      :post
    when *Methods[:put]
      :put
    when *Methods[:delete]
      :delete
    else
      raise ArgumentError, "#{http_verb} is not a supported method" 
  end
end
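
To make the idea concrete, HTTPVerb could look something like this (just a sketch; the class and constant names are mine, not Weary’s):

class HTTPVerb
  SUPPORTED = [:get, :post, :put, :delete, :head]

  def initialize(verb)
    @verb = verb
  end

  # turns 'GET ', :Post, 'put' and friends into a clean symbol
  def normalize
    normalized = @verb.to_s.strip.downcase.intern
    unless SUPPORTED.include?(normalized)
      raise ArgumentError, "#{@verb} is not a supported method"
    end
    normalized
  end
end

Request#method= then shrinks to a one-liner: @http_verb = HTTPVerb.new(http_verb).normalize.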

Weary::Response#format= looks just like Weary::Resource#format=. I’d recommend the same treatment as the HTTP verbs above.

# Response#format=
def format=(type)
  @format = case type
    when *ContentTypes[:json]
      :json
    when *ContentTypes[:xml]
      :xml
    when *ContentTypes[:html]
      :html
    when *ContentTypes[:yaml]
      :yaml
    when *ContentTypes[:plain]
      :plain
    else
      nil
  end
end

# Resource#format=
def format=(type)
  type = type.downcase if type.is_a?(String)
  @format = case type
    when *ContentTypes[:json]
      :json
    when *ContentTypes[:xml]
      :xml
    when *ContentTypes[:html]
      :html
    when *ContentTypes[:yaml]
      :yaml
    when *ContentTypes[:plain]
      :plain
    else
      raise ArgumentError, "#{type} is not a recognized format." 
  end
end
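
The shared piece could be as small as a lookup against ContentTypes, assuming it is a hash of format symbol to an array of type strings, which is what the case statements suggest:

# returns the format symbol for a given content type, or nil if unknown
def self.format_for(type)
  type = type.downcase if type.is_a?(String)
  match = ContentTypes.find { |format, types| types.include?(type) }
  match && match.first
end

Response#format= could then default to nil while Resource#format= raises when format_for comes up empty.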

Break Big Methods into Classes with Tiny Methods

Weary#craft_methods is doing a lot. I understand generally what you are trying to do, but without digging in, it is hard to tell. I’d break that out into another class, maybe MethodCrafter. Then, each of those if and unless statements could be moved into their own methods and would be easier to test. MethodCrafter.code could return the code to be eval’d. I used to write long methods, but lately I’ve found breaking them out into classes makes things easier to digest and test.

I have talked about tiny methods before as well. Here is the code for the craft_methods method that I recommended moving to a class.

def craft_methods(resource)
  code = %Q{
    def #{resource.name}(params={})
      options ||= {}
      url = "#{resource.url}" 
  }
  if resource.with.is_a?(Hash)
    hash_string = "" 
    resource.with.each_pair {|k,v| 
      if k.is_a?(Symbol)
        k_string = ":#{k}" 
      else
        k_string = "'#{k}'" 
      end
      hash_string << "#{k_string} => '#{v}'," 
    }
    code << %Q{
      params = {#{hash_string.chop}}.delete_if {|key,value| value.empty? }.merge(params)
    }
  end
  unless resource.requires.nil?
    if resource.requires.is_a?(Array)
      resource.requires.each do |required|
        code << %Q{  raise ArgumentError, "This resource requires parameter: ':#{required}'" unless params.has_key?(:#{required}) \n}
      end
    else
      resource.requires.each_key do |required|
        code << %Q{  raise ArgumentError, "This resource requires parameter: ':#{required}'" unless params.has_key?(:#{required}) \n}
      end
    end
  end
  unless resource.with.empty?
    if resource.with.is_a?(Array)
      with = %Q{[#{resource.with.collect {|x| x.is_a?(Symbol) ? ":#{x}" : "'#{x}'" }.join(',')}]}
    else
      with = %Q{[#{resource.with.keys.collect {|x| x.is_a?(Symbol) ? ":#{x}" : "'#{x}'"}.join(',')}]}
    end
    code << %Q{ 
      unnecessary = params.keys - #{with} 
      unnecessary.each { |x| params.delete(x) } 
    }
  end
  if resource.via == (:post || :put)
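    # side note: (:post || :put) always evaluates to :post, so :put requests never take this branch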
    code << %Q{options[:body] = params unless params.empty? \n}
  else
    code << %Q{
      options[:query] = params unless params.empty?
      url << "?" + options[:query].to_params unless options[:query].nil?
    }
  end
  unless (resource.headers.nil? || resource.headers.empty?)
    header_hash = "" 
    resource.headers.each_pair {|k,v|
      header_hash << "'#{k}' => '#{v}'," 
    }
    code << %Q{ options[:headers] = {#{header_hash.chop}} \n}
  end
  if resource.authenticates?
    code << %Q{options[:basic_auth] = {:username => "#{@username}", :password => "#{@password}"} \n}
  end
  unless resource.follows_redirects?
    code << %Q{options[:no_follow] = true \n}
  end
  code << %Q{
      Weary::Request.new(url, :#{resource.via}, options).perform
    end
  }
  class_eval code
  return code
end

As you can see, that method bears a heavy burden. Also, the method is actually declared as private which means it is even harder to test (I won’t get into testing private methods right now). If this was broken out into an object, you could heavily unit test that object and then craft_methods could look more like this:

def craft_methods
  code = MethodCrafter.new(resource).to_code
  class_eval code
  return code
end

Unless vs. If

The unless in Weary::Resource#with= is kind of a brain twister. If you have an else, just use if and reverse the conditionals. I have talked about unless before.

def with=(params)
  if params.is_a?(Hash)
    @requires.each { |key| params[key] = nil unless params.has_key?(key) }
    @with = params
  else
    unless @requires.nil?
      @with = params.collect {|x| x.to_sym} | @requires
    else
      @with = params.collect {|x| x.to_sym}
    end
  end
end
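
For comparison, here is how it might read flipped around (same behavior, assuming @requires is either nil or an array of symbols):

def with=(params)
  if params.is_a?(Hash)
    @requires.each { |key| params[key] = nil unless params.has_key?(key) }
    @with = params
  else
    @with = params.collect { |x| x.to_sym }
    @with = @with | @requires if @requires
  end
end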

Overall Reactions

So those are the specifics. Now to the more general reactions. You seem to care about your code and that is important. I see a bit of HTTParty in there and I think that is a good call. One of the best ways to learn in coding is to copy. I’ve stolen from lots of projects. 🙂

As far as the API for Weary goes, I find it a bit over the top. When you are creating a code API that another programmer will use, you have to balance readability and verbosity. on_domain and as_format read nicely, but they could be just as effective named domain and format, which saves a few characters, an underscore, and having to remember which is on, as, construct, with, and set. Mark took this advice already and changed the API.

I think the method builders that take a block (get, post, etc.) are interesting and I’m sure you learned a lot creating the project, which is the most important thing. I’m betting some people will like this better than HTTParty as everyone has different brains. Great work.

Conclusion

I found reviewing the code fun and was surprised by how many comments I had for Mark. I guess I have messed up a lot over the years and that has given me an opinion on this stuff. Hope others find it helpful. Let me know if you would like to see more posts like this.

MongoMapper, The Rad Mongo Wrapper

In which I formally release MongoMapper, a high level wrapper similar to ActiveRecord, but for MongoDB.

A few weeks ago, I wrote about Mongo and how awesome it is. Towards the end of the article (and in the slideshow) I mentioned MongoMapper, a project I’ve been working on.

Over the past few weeks my buddies at Squeejee and Collective Idea have started using MongoMapper and they’ve helped me squash a few bugs and add a few features.

Despite the fact that I would call it far from finished, I’ve decided to release it in hopes that people can start playing with it, finding bugs, adding features and submitting pull requests. The documentation is sparse to none, but there are plenty of tests and the code is pretty readable, I believe.

Installation

# from gemcutter
gem install mongo_mapper

Usage

So how do you use this thing? It’s pretty simple. MongoMapper uses a default connection from the Ruby driver. This means if you are using Mongo on the standard port and localhost, you don’t have to give it connection information. If you aren’t, you can do it like this:

MongoMapper.connection = Mongo::Connection.new('hostname')

Connection accepts any valid Mongo Ruby driver connection. The only other setup you need to do is to tell MongoMapper what the default database is. This is pretty much the same as setting up the connection:

MongoMapper.database = 'mydatabasename'

These two operations only define the default connection and database information. Both of these can be overridden on a per model basis so that you can hook up to multiple databases on different servers.

Include Instead of Inherit

To create a new model, I went with the include pattern, instead of inheritance. In ActiveRecord, you would define a new model like this:

class Person < ActiveRecord::Base
end

In MongoMapper, you would do the following:

class Person
  include MongoMapper::Document
end

Just like ActiveRecord, this makes assumptions. It assumes you have a collection named people. Oh, and the good news is you don’t need a migration for it. The first time you try to create a person document, the collection will be created automatically. Heck yeah! I mentioned that you can override the default connection and database on a per document level. If you need to do that, it would look like this:

class Person
  include MongoMapper::Document

  connection Mongo::Connection.new('hostname')
  set_database_name 'otherdatabase'
end

Defining Keys

Each document is made up of keys. Keys are named and typecast so you know your data is stored in the correct format. Let’s fill out our Person document a bit.

class Person
  include MongoMapper::Document

  key :first_name, String
  key :last_name, String
  key :age, Integer
  key :born_at, Time
  key :active, Boolean
  key :fav_colors, Array
end

Now that we have defined our schema, we can create, update and delete documents.

person = Person.create({
  :first_name => 'John',
  :last_name => 'Nunemaker',
  :age => 27,
  :born_at => Time.mktime(1981, 11, 25, 2, 30),
  :active => true,
  :fav_colors => %w(red green blue)
})

person.first_name = 'Johnny'
person.save

person.destroy
# or you could do this to destroy
Person.destroy(person.id)

Looks pretty familiar, eh? Where it made sense, I tried to stay close to the ActiveRecord API.

Validations

But wait, you say, how do I validate my data? Well, you can do it pretty much the same way as ActiveRecord.

class Person
  include MongoMapper::Document

  key :first_name, String
  key :last_name, String
  key :age, Integer
  key :born_at, Time
  key :active, Boolean
  key :fav_colors, Array

  validates_presence_of :first_name
  validates_presence_of :last_name
  validates_numericality_of :age
  # etc, etc
end

But, if you find that a bit tedious as I do, you can use some shortcuts that I’ve added in.

class Person
  include MongoMapper::Document

  key :first_name, String, :required => true
  key :last_name, String, :required => true
  key :age, Integer, :numeric => true
  key :born_at, Time
  key :active, Boolean
  key :fav_colors, Array
end

Most of the validations from Rails are supported. I still need to build in support for validates_uniqueness_of, and some of the options that Rails supports might not be in place yet, but it is a good first pass.

Callbacks

Did you hear that? I swear I just heard someone whisper about callbacks. Umm, yeah, we got that too. The good news? I just used ActiveSupport’s callbacks, so they are identical to Rails, and most of the callbacks Rails defines, such as before_save and the like, are supported.
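
For example, a before_save hook might look like this (a made-up model, but the mechanics are the same as ActiveRecord’s):

class Person
  include MongoMapper::Document

  key :first_name, String

  before_save :capitalize_first_name

  private
    def capitalize_first_name
      self.first_name = first_name.capitalize if first_name
    end
end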

Embedded Documents

So the cool thing about Mongo is that you can embed documents in other documents. Let’s say our person has multiple addresses. To handle that, we would create an embedded address document to go along with our person document.

class Address
  include MongoMapper::EmbeddedDocument

  key :address, String
  key :city,    String
  key :state,   String
  key :zip,     Integer
end

class Person
  include MongoMapper::Document

  many :addresses
end

Now we can add addresses to the person like so:

person = Person.new
person.addresses << Address.new(:city => 'South Bend', :state => 'IN')
person.addresses << Address.new(:city => 'Chicago', :state => 'IL')
person.save

Doing this actually saves the address right inside the person document. Yep, no joins. Yay! Cheers resound from the heavens! You can even query for documents based on these embedded documents. For example, if you wanted to find all people that are in the city Chicago, you could do this:

Person.all(:conditions => {'addresses.city' => 'Chicago'})

Finding Documents

The find API is very similar to AR as well. Below are a bunch of other examples:

Person.find(1)
Person.find(1,2,3,4)
Person.find(:first)
Person.first
Person.find(:last)
Person.last
Person.find(:all)
Person.all
Person.all(:last_name => 'Nunemaker', :order => 'first_name')

For more information about how to provide criteria to find, you can see the stuff covered in the finder options. If you need to, you can even throw custom Mongo stuff into the mix and it just gets passed through to the Mongo Ruby driver (e.g. $gt, $gte, $lt, $lte, etc.).
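
For instance, something like this should pass straight through to the driver (hedging a bit here, as the exact conditions syntax may still shift before a release):

# everyone over 21 whose favorite colors include red
Person.all(:conditions => {:age => {'$gt' => 21}, :fav_colors => 'red'})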

Conclusion

We take ActiveRecord for granted. It really has a lot of handy features and does a pretty good job at modeling our applications. I never realized how much it does, until I decided to create MongoMapper. That said, the experience has been fun thus far and I’m excited to see what people use it for.

There is a ton more I could talk about, but frankly, this article is long enough. Rest assured that I think Mongo is cool and that MongoMapper is headed in the right direction, but far from complete. I haven’t actually built anything with MongoMapper yet, but I will be soon. I’m sure that will lead to a lot of handy new features.

Any general discussion can happen in the comments below while they are open, or over at the Google group. If you find a bug or have a feature idea, create an issue at GitHub.

JSONQuerying Your Rails Responses

In which I show how to use a Ruby implementation of JSONQuery to test JSON in Rails apps.

I’m writing an application right now that is really JSON heavy. Some of the functional tests are Cucumber and some of them are just Rails functional tests using shoulda.

I hit a point today where I wanted to verify that the JSON getting output was generally what I want. I could have just JSON parsed the response body and compared that with what I was looking for, but a little part of me thought this might be a cool application of JSONQuery.

JSONQuery provides a comprehensive set of data querying tools including filtering, recursive search, sorting, mapping, range selection, and flexible expressions with wildcard string comparisons and various operators.

The quote above is fancy and can be boiled down to “a query language for JSON”. If you want to read more about JSONPath and JSONQuery here are some posts:

Finding a Ruby JSONQuery Implementation

I knew there was a JavaScript implementation of JSONQuery and that Jon Crosby has been doing some cool stuff with it in CloudKit, but I couldn’t find a Ruby implementation that didn’t require johnson.

After some googling and GitHub searching, I came across Siren. Siren was pretty much what I wanted, so I started playing around with it. I forked it, gem’d it and wrapped it with some shoulda goodness.

What I ended up with was pretty specific to my needs at the moment, but I post it here in hopes that it sparks some ideas.

Bringing It All Together

First, I added the following to my environments/test.rb file.

config.gem 'jnunemaker-siren',
            :lib     => 'siren',
            :version => '0.1.1',
            :source  => 'http://gems.github.com'

Then I added the following to my test helper (I actually put it in a separate module and file and included it, but I’m going for simplicity in this post).

class ActiveSupport::TestCase
  def self.should_query_json(expression, string_or_regex)
    should "have json response matching #{expression}" do
      assert_jsonquery expression, string_or_regex
    end
  end

  def assert_jsonquery(expression, string_or_regex)
    json = ActiveSupport::JSON.decode(@response.body)
    query = Siren.query(expression, json)

    if string_or_regex.kind_of?(Regexp)
      assert_match string_or_regex, query, "JSONQuery expression #{expression} value did not match regex" 
    else
      assert_equal string_or_regex, query, "Expression #{expression} value #{query} did not equal #{string_or_regex}" 
    end
  end
end

The code is quick and dirty. The first thing you’ll notice is that assert_jsonquery actually uses @response.body, which means it can only be used in a controller test. I could easily expand it, but, like I said above, I just got it working for what I needed right now. The cool part is that now in my functional tests, I can do stuff like this:

context "on POST to :create" do
  setup { post :create, :status => {'action' => 'In', 'body' => 'Working on PB'} }

  # ... code removed for brevity ...
  should_query_json "$['replace']['#current_status']", /Working on PB/
end

That is a really basic query. Trust me, you can do a heck of a lot more. Check out the Siren tests if you don’t believe me.
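
For a small taste, JSONQuery supports filter expressions as well. Something like this should work, though I am going off the JSONQuery docs here and have not run this exact query through Siren:

json = ActiveSupport::JSON.decode('{"users":[{"name":"John","age":27},{"name":"Steve","age":17}]}')

# select only the users over 21
Siren.query("$.users[?age > 21]", json)
# => [{"name" => "John", "age" => 27}]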

Conclusion

Overall, working with Siren was a little rough because I wasn’t familiar with JSONQuery syntax. Also, Siren tends to return nil instead of a more helpful error message when my expression fails to compile, but I’m kind of excited to see how this works out in the long run.

What are you doing to test JSON in your apps? Does something like this seem cool or overkill? Just kind of curious.

What Is The Simplest Thing That Could Possibly Work?

In which I summarize my favorite points from an old, but awesome article.

I am always amazed when I read an article from 2004 and find interesting goodies. I’m probably late to the game on a lot of these articles, as I didn’t really dive into programming as a career until 2005, but I just read The Simplest Thing that Could Possibly Work, a conversation with Ward Cunningham by Bill Venners. The article was published on January 19, 2004, but it is truly timeless.

The Shortest Path

Simplicity is the shortest path to a solution.

“Shortest” doesn’t necessarily refer to lines of code or number of characters, but I see it more as the path that requires the least amount of complexity. As he mentions in the article, if someone releases a 20 page proof to a math problem and then later on, someone releases a 10 page proof for the same problem, the 10 page proof is not necessarily more simple.

The 10 page proof could use some form of mathematics that is not widely used in the community and takes some time to comprehend. This means the 10 page version could be less simple, as it requires learning to understand, whereas the 20 page version uses generally understood concepts.

I think this is a balance that we always fight with as programmers. What is simple? I can usually say simple or not simple when I look at code, but it is hard to define the rules for simplicity.

Work Today Makes You Better Tomorrow

The effort you expend today to understand the code will make you a more powerful programmer tomorrow.

This is one of the concepts that has made the biggest difference in my programming knowledge over the past few years. The first time that I really did this was when I wrote about class and instance variables a few years back. Ever since then, when I come across something that I don’t understand but feel I should, I spend the time to understand it. I have grown immensely because of this and would recommend that you do the same if you aren’t already.

Narrow What You Think About

We had been thinking about too much at once, trying to achieve too complicated a goal, trying to code it too well.

This is something that I have been practicing a lot lately. You know how sometimes you just feel overwhelmed and don’t want to start a feature or project? What I’ve found is that when I feel this way it is because I’m trying to think about too much at once.

Ward encourages over and over in the article: think about the simplest possible thing that could work. Notice he did not say the simplest thing that would work, but rather what could work.

This is something that I’ve noticed recently while pairing with Brandon Keepers. Both of us almost apologize for some of the code we first implement, as we are afraid the other will think that is all we are capable of. What is funny is that we both realize you have to start with something, and thus never judge. It is far easier to incrementally work towards a brilliant solution than to think it up in your head and instantly code it.

Start with a test. Make the test pass. Rinse and repeat. Small, tested changes that solve only the immediate problem at hand always end up with a more simple solution than trying to do it all in one fell swoop. I’ve also found I’m more productive this way as I have less moments of wondering what to do next. The failing test tells me.

Anyway, I thought the article was interesting enough that I would post some of the highlights here and encourage you all to read it. If you know of some oldie, but goodie articles, link them up in the comments below.

What If A Key/Value Store Mated With A Relational Database System?

In which I provide an intro to MongoDB and its awesomeness.

Last night, the folks from the Grand Rapids ruby group were kind enough to allow me to present on MongoDB. The talk went great. I’ve been excited about Mongo for a couple weeks now, so it was cool to see that it wasn’t just me.

The funny thing is, at nearly the same time, Wynn Netherland presented on MongoDB to the Dallas ruby group. We discovered that he wrote part 1 and I wrote part 2 of the presentation despite not working together on it so we ended up showing each other’s slides as well.

I figured since I spent the time to throw some slides together, I might as well put an intro up here too. First, the slides (they probably won’t mean a lot as they were mostly outlines for me to speak from).

(Slides: Intro to MongoDB, Grand Rapids RUG, originally embedded from SlideShare.)

Intro to MongoDB

Ok, so what the crap is Mongo? I find the best way to describe Mongo is the best features of key/value stores, document databases and RDBMS in one. No way, you say. That sounds perfect. Well, Mongo is not perfect, but I think it brings something kind of new to the database table.

Mongo is built for speed. Anything that would slow it down (aka transactions) has been left on the chopping block. Instead of REST, they chose sockets and have written drivers for several languages (of course one for Ruby).

Collections

It is collection/document oriented. Collections are like tables in MySQL (they are even grouped into databases) and serve to break up the top level entities in your application (User, Article, Account, etc.) by type, and thus into smaller query sets, which makes queries faster.

Documents

Inside of each collection, you store documents. Documents are basically objects that have no schema. The lack of schema may be scary to some, but I look at it this way: you have to know your application’s schema at the app level, so why put the schema in both the database and your app? Why not just put the schema in your app and have the database store whatever you put in it? This way, your database schema is kind of versioned with your application code. I think that is pretty cool.

Documents are stored in BSON (blog post), which is binary encoded JSON that is built to be more efficient and also to include a few more data types than JSON. This means that if you send Mongo a document that has values of different types, such as String, Integer, Date, Array, Hash, etc., Mongo knows exactly how to deal with those types and actually stores them in the database as that type. This differs from traditional key/value stores, which just give you a key and a string value and leave you to handle serialization yourself.

Object Relationships

There are two ways to relate documents in Mongo. The first, is to simply embed a document into another document. An example of this would be tags embedded in article. Let’s take a look.

{
  title: 'Mongolicious', 
  body: 'I could teach you, but I would have to charge...', 
  tags: ['mongo', 'databases', 'awesome']
}

As you can see, tags are just a key in the article document. The benefits of this are that you never have to do any joins when you show the article and its tags, as they are all stored in the same place. The other cool thing is that Mongo can index the tags and understands indexing keys that have multiple values (such as arrays and hashes). This means if you index tags, you can find all documents tagged with ‘foo’ and it will be performant. Embedded documents work great for some things, but other things wouldn’t make sense embedded.

Let’s imagine that you have a client document and you want the client to have multiple contacts. If you embedded the client’s contacts in its document, it would be inefficient to have a page that listed all the contacts. To build a contact list, you would have to pull out every client, collect all the contacts and then sort them. Also, if a contact should be associated with multiple clients, you would have to duplicate their information for each client.

In SQL, you would have a clients table and a contacts table and then a join model between them so that any contact would be in the system once and could be associated with one or more clients without duplication. So how would you do this in Mongo? The same way…kind of.

In Mongo, you’d have a client collection and a contact collection. To associate a contact with a client, you just create a db reference to the contact from the client.

Dynamic Queries

Yep, Mongo has dynamic queries. It actually has a kind of quirky, yet lovable syntax for defining criteria. Below are a few examples from my presentation which are mostly self-explanatory. These are examples of what you would run in Mongo’s JavaScript shell.

// finds all Johns
db.collection.find({'first_name': 'John'})

// finds all documents with first_name
// starting with J using a regex
db.collection.find({'first_name': /^J/})

// finds first with _id of 1
db.collection.findOne({'_id': 1})

// finds possible drinkers (age > 21)
db.collection.find({'age': {'$gt': 21}})

// searches in embedded document author for
// author that has first name of John
db.collection.find({'author.first_name': 'John'})

// worst case scenario, or if you need "or"
// queries you can drop down to JavaScript
db.collection.find({$where: 'this.age >= 6 && this.age <= 18'})

You can also sort by one or more keys, limit the number of results, offset a number of results (for pagination), and define which keys you want to select. The other thing that is slick is that Mongo supports count and group. Count is the same idea as MySQL’s count: it returns the number of documents that match the provided criteria. Group is the same concept as MySQL’s group by, but is accomplished with map/reduce.

To really get a feel for all that you can do with queries, check out Mongo’s advanced query documentation.

Random Awesomeness

  • Capped collections (blog post): Think memcache. You can set a limit for a collection to a certain number of documents or size of space. When the number or size goes over the limit, the oldest documents get pushed out. For more info, see MongoDB and Caching
  • Upserts: Think find or create in one call. You provide criteria and the document details and Mongo determines if the document exists or not and either inserts or updates it. You can also do special things like incrementers with $inc. For more, read Using mongo for real time analytics
  • Multikeys: for indexing arrays of keys. Think tagging.
  • GridFS and auto-sharding: Storing files in the database in a way that doesn’t suck. They have mentioned in IRC that they might even make Apache/Nginx modules that serve files straight from GridFS, so requests can go straight from the web server to Mongo instead of traveling through your app server. For more, read You don’t need a file system

How do I use it with Ruby?

If you have made it this far, you are probably intrigued and are wondering how you can use Mongo with Ruby. There is an official mongo-ruby-driver on GitHub for starters. It supports most of Mongo’s features, if not all, and gets the job done, but it is really low level. It would be like writing an application using the MySQL gem. You can do it, but it won’t be fun. I’ve even started giving back to the driver.

There are two “ORMs” for Mongo and both are on GitHub. The first is an ActiveRecord adapter and the second is MongoRecord. I took a look at both of these, and decided to write my own. Why?

  • Mongo is not a RDBMS (like MySQL) so why use RDBMS wrappers (like the AR adapter)?
  • I think the DSL for modeling your application should teach you Mongo.
  • Mongo is perfect for the website management system I’m building and I just didn’t like the other wrappers. Why would I want to build something with something that I didn’t like?
  • It sounded fun!

MongoMapper

I started the Friday of Memorial weekend and was able to crank out most of the functionality. Since then, I’ve been working on it whenever I get time and it is really close to being ready for a first release. That said, it is not public yet. Don’t worry, as soon as it is ready for prime time, I’ll be posting more here. So what features does MongoMapper have built in?

  • Typecasting
  • Callbacks (uses ActiveSupport callbacks)
  • Validations (uses my fork of validatable)
  • Connection and database can differ per document
  • Create, update, delete, delete_all, destroy, destroy_all that work just like ActiveRecord
  • Find with id, multiple ids, :all, :first, :last, etc. Also supports Mongo specific find criteria like $gt, $lt, $in, $nin, etc.
  • Associations
  • Drop in Rails compatibility

So out of the features listed above, all are complete but the last two at the time of this post. I’m currently working through associations and then I’m going to start making a Rails app with MongoMapper to figure out what I need for “drop in and forget” Rails compatibility. I have a few other smart people helping me so my guess is that it will be out in the next two weeks.

Let me know with a comment below what you like and don’t like about Mongo. I’m very curious what other Rails developers think after reading this intro and the articles I’ve linked to. I’m stoked, but I’m sure it is not for everyone.

Swine Flu and the Twitter Gem

In which I wax poetic about the trendy new addition to the Twitter gem.

I had some extra time today and I’ve been spotty on open source work over the past few weeks, so I decided to add support for the Twitter trends API to my Twitter gem.

Using HTTParty, the code for this turned out to be insanely simple, so short, in fact, that I’ll just put it inline here so you don’t even have to go over to Github. Aww, I’m so nice.

module Twitter
  class Trends
    include HTTParty
    base_uri 'search.twitter.com/trends'
    format :json

    # :exclude => 'hashtags' to exclude hashtags
    def self.current(options={})
      mashup(get('/current.json', :query => options))
    end

    # :exclude => 'hashtags' to exclude hashtags
    # :date => yyyy-mm-dd for specific date
    def self.daily(options={})
      mashup(get('/daily.json', :query => options))
    end

    # :exclude => 'hashtags' to exclude hashtags
    # :date => yyyy-mm-dd for specific date
    def self.weekly(options={})
      mashup(get('/weekly.json', :query => options))
    end

    private
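      # note: private does not actually hide def self. methods; private_class_method would, but mashup is only used internally anyway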
      def self.mashup(response)
        response['trends'].values.flatten.map { |t| Mash.new(t) }
      end
  end
end

Pure TDD

I am most definitely a tester, but I’ll admit I usually write code and then write the test. Of late, I’ve been reversing this trend and actually practicing TDD in full force by writing a small test, then only enough code to make it pass, followed by another test or more code for the existing test, finished with just enough code to make the new addition pass.

It is a different mindset to code this way, compared to my code-first-and-then-make-sure-my-butt-is-covered method, and I’m loving it. I thought I would find pure TDD tedious, but on the contrary, I think I’m coding faster and cleaner.

The Tests

So how did I test the code above? Again, inline for your viewing pleasure, are the tests I added to make sure I don’t break something in the future and get yelled at. Feel free to take a gander and I’ll meet back up with you at the bottom of it.

require File.dirname(__FILE__) + '/../test_helper'

class TrendsTest < Test::Unit::TestCase
  include Twitter

  context "Getting current trends" do
    should "work" do
      stub_get('http://search.twitter.com:80/trends/current.json', 'trends_current.json')
      trends = Trends.current
      trends.size.should == 10
      trends[0].name.should == '#musicmonday'
      trends[0].query.should == '#musicmonday'
      trends[1].name.should == '#newdivide'
      trends[1].query.should == '#newdivide'
    end

    should "be able to exclude hashtags" do
      stub_get('http://search.twitter.com:80/trends/current.json?exclude=hashtags', 'trends_current_exclude.json')
      trends = Trends.current(:exclude => 'hashtags')
      trends.size.should == 10
      trends[0].name.should == 'New Divide'
      trends[0].query.should == %Q(\"New Divide\")
      trends[1].name.should == 'Star Trek'
      trends[1].query.should == %Q(\"Star Trek\")
    end
  end

  context "Getting daily trends" do
    should "work" do
      stub_get('http://search.twitter.com:80/trends/daily.json?', 'trends_daily.json')
      trends = Trends.daily
      trends.size.should == 480
      trends[0].name.should == '#3turnoffwords'
      trends[0].query.should == '#3turnoffwords'
    end

    should "be able to exclude hastags" do
      stub_get('http://search.twitter.com:80/trends/daily.json?exclude=hashtags', 'trends_daily_exclude.json')
      trends = Trends.daily(:exclude => 'hashtags')
      trends.size.should == 480
      trends[0].name.should == 'Star Trek'
      trends[0].query.should == %Q(\"Star Trek\")
    end

    should "be able to get for specific date (with date string)" do
      stub_get 'http://search.twitter.com:80/trends/daily.json?date=2009-05-01', 'trends_daily_date.json'
      trends = Trends.daily(:date => '2009-05-01')
      trends.size.should == 440
      trends[0].name.should == 'Swine Flu'
      trends[0].query.should == %Q(\"Swine Flu\")
    end

    should "be able to get for specific date (with date object)" do
      stub_get 'http://search.twitter.com:80/trends/daily.json?date=2009-05-01', 'trends_daily_date.json'
      trends = Trends.daily(:date => Date.new(2009, 5, 1))
      trends.size.should == 440
      trends[0].name.should == 'Swine Flu'
      trends[0].query.should == %Q(\"Swine Flu\")
    end
  end

  context "Getting weekly trends" do
    should "work" do
      stub_get('http://search.twitter.com:80/trends/weekly.json?', 'trends_weekly.json')
      trends = Trends.weekly
      trends.size.should == 210
      trends[0].name.should == 'Happy Mothers Day'
      trends[0].query.should == %Q(\"Happy Mothers Day\" OR \"Mothers Day\")
    end

    should "be able to exclude hastags" do
      stub_get('http://search.twitter.com:80/trends/weekly.json?exclude=hashtags', 'trends_weekly_exclude.json')
      trends = Trends.weekly(:exclude => 'hashtags')
      trends.size.should == 210
      trends[0].name.should == 'Happy Mothers Day'
      trends[0].query.should == %Q(\"Happy Mothers Day\" OR \"Mothers Day\")
    end

    should "be able to get for specific date (with date string)" do
      stub_get 'http://search.twitter.com:80/trends/weekly.json?date=2009-05-01', 'trends_weekly_date.json'
      trends = Trends.weekly(:date => '2009-05-01')
      trends.size.should == 210
      trends[0].name.should == 'TGIF'
      trends[0].query.should == 'TGIF'
    end

    should "be able to get for specific date (with date object)" do
      stub_get 'http://search.twitter.com:80/trends/weekly.json?date=2009-05-01', 'trends_weekly_date.json'
      trends = Trends.weekly(:date => Date.new(2009, 5, 1))
      trends.size.should == 210
      trends[0].name.should == 'TGIF'
      trends[0].query.should == 'TGIF'
    end
  end
end

So, yeah, nothing earth shattering. It feels a bit repetitive, but I don’t mind some amount of repetition in my tests. The fixture files were created quite simply using curl.

cd test/fixtures
curl http://search.twitter.com:80/trends/weekly.json?date=2009-05-01 > trends_weekly_date.json
# rinse and repeat for each file

The stub_get method is a simple wrapper around FakeWeb and looks something like this:

def stub_get(url, filename, status=nil)
  options = {:string => fixture_file(filename)}
  options.merge!({:status => status}) unless status.nil?
  FakeWeb.register_uri(:get, url, options)
end

def fixture_file(filename)
  file_path = File.expand_path(File.dirname(__FILE__) + '/fixtures/' + filename)
  File.read(file_path)
end

I’m lazy and find that stub_get is much shorter than FakeWeb.register_uri blah, blah, blah. The tests use FakeWeb, shoulda and my fork of matchy, in case you are curious.

Example Uses

So what can you do with the new trends addition? Below are some examples of how you can obtain trend information.

Twitter::Trends.current
Twitter::Trends.current(:exclude => 'hashtags')

Twitter::Trends.daily # current day
Twitter::Trends.daily(:exclude => 'hashtags')
Twitter::Trends.daily(:date => Date.new(2009, 5, 1))

Twitter::Trends.weekly # current week
Twitter::Trends.weekly(:exclude => 'hashtags')
Twitter::Trends.weekly(:date => Date.new(2009, 5, 1))

That’s all for now. Enjoy the new trends and build something cool. Oh, and if you want to play with trends, but don’t have an idea, I have one and most likely won’t have time to build it. I’d be happy to collaborate.