Generate an Apple Pay CSR with OpenSSL

When creating an Apple Pay certificate signing request, Apple specifies that you need to use a 256-bit elliptic curve key pair. To generate both the private key and the CSR using the openssl command-line utility, do the following:


$ openssl ecparam -out private.key -name secp256k1 -genkey
$ openssl req -new -sha256 -key private.key -nodes -out request.csr -subj '/CN=yourdomain.com/O=Your Name or Company/C=US'

The result will be a private key file at private.key and the CSR in request.csr. They should look something like this.

private.key

-----BEGIN EC PARAMETERS-----
BgUrgQQACg==
-----END EC PARAMETERS-----
-----BEGIN EC PRIVATE KEY-----
MHQCAQEEID0Y+YLOz3ed+dMlh047WSwgxl3a0WVI4en3tjntAdwooAcGBSuBBAAK
oUQDQgAEwHGnT+kCI+oqFK8ALEZzBcqHC+QNwmCLQHx51zCT51TpZEIufTFpac3a
E5sNqznV2Dp39N0wVCAV7QPGI6SXvg==
-----END EC PRIVATE KEY-----

request.csr

-----BEGIN CERTIFICATE REQUEST-----
MIH3MIGgAgEAMEExGTAXBgNVBAMTEHd3dy5zcHJlZWRseS5jb20xFzAVBgNVBAoT
DlNwcmVlZGx5LCBJbmMuMQswCQYDVQQGEwJVUzBWMBAGByqGSM49AgEGBSuBBAAK
A0IABMBxp0/pAiPqKhSvACxGcwXKhwvkDcJgi0B8edcwk+dU6WRCLn0xaWnN2hOb
Das51dg6d/TdMFQgFe0DxiOkl76gADAJBgcqhkjOPQQBA0cAMEQCIBGy+OBbsjey
lQhqezpSRt+IKfMMLdA78Pnck3fWIVxcAiBOYX1hmOREEysFQq0eX309iY0uZ3dm
MRDa/83lW8GcZQ==
-----END CERTIFICATE REQUEST-----

You will need the private key to decrypt Apple Pay cryptograms, so keep it in a secure place. The CSR is what you upload to Apple, which will then generate a certificate for you to use in your iOS app.
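
If you want to sanity-check both files before uploading, openssl can decode them; the CSR’s subject should match what you passed via -subj:


$ openssl ec -in private.key -noout -text
$ openssl req -in request.csr -noout -text -verify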

Normally, your payment provider will generate the private key (which they will retain) and CSR for you, so doing this yourself won’t be necessary. However, if you’re a payment provider like Spreedly or are interested in managing more of the payment flow, this may be useful.

Exposing a JavaScript API in a Web Page with Browserify

When using Browserify to build and resolve your JavaScript library dependencies, it’s very easy to end up with a single bundle.js file that you can include in a browser with a simple script tag.


<script src="/assets/bundle.js"></script>

Including the bundle in this way will execute the code in your entrypoint (often main.js), which is where most online tutorials end and which might be all you need. But how do you create a bundle that exposes an API to the including page, so that you can do something like this?


<script src="/assets/bundle.js"></script>
<script>
var lib = new MyLibrary();
</script>

You need to specify the standalone option to the browserify command (or API), which will export the API you expose in your entrypoint file in the given namespace. Here’s an example.

Given a library file where your main library functionality is defined:

lib.js

var _ = require('underscore');

module.exports = MyLibrary;

function MyLibrary() {
  this.aSetting = true;
}

MyLibrary.prototype.doWork = function() {
  console.log(this.aSetting);
};

And your browserify entrypoint which exposes your library API:

main.js

module.exports = require('./lib');

At this point, if you were to build a bundle with browserify and include it in a web page, you wouldn’t be able to access MyLibrary. Browserify scopes everything locally, and the web page doesn’t know how to deal with your top-level export, so it’s effectively hidden. The solution is to tell browserify to expose your exports with the standalone option.

As a command it looks like this:


$ browserify main.js --standalone MyLibrary > bundle.js

And if you’re using Gulp or similar, the API options look like this:


browserify('./main.js', {
  standalone: 'MyLibrary'
})
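
For reference, here’s a minimal plain-Node build script using the same option – a sketch only; the file names are just examples and it assumes browserify is installed locally via npm:


var browserify = require('browserify');
var fs = require('fs');

// Bundle main.js, exporting its module.exports as the global `MyLibrary`
browserify('./main.js', { standalone: 'MyLibrary' })
  .bundle()                                    // returns a readable stream of the bundled source
  .pipe(fs.createWriteStream('./bundle.js'));  // write the stream out as bundle.js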

With this resulting bundle you can now reference MyLibrary from the including HTML page.

index.html

<script src="/assets/bundle.js"></script>
<script>
var lib = new MyLibrary();
lib.doWork();
</script>

Time-Series Database Design with InfluxDB

Here at Spreedly we’ve recently started using the time series database InfluxDB to store a variety of customer activity metrics. As with any special purpose database, using and designing for a time-series database is quite different from what you may be used to with structured (SQL) databases. I’d like to describe our experience designing our InfluxDB schema, the mistakes we made, and the conclusions we’ve come to based on those experiences.

The mark

Consider the following scenario, closely resembling Spreedly’s: You run a service that lets your customers transact against a variety of payment gateways. You charge for this service on two axes – by the number of gateways provisioned and the number of credit cards stored. For any point in time you want to know how many of each a customer has on their account.

Initially we set up two series (InfluxDB’s term for a collection of measurements, organizationally similar to a SQL database table) to store the total number of each item per account:

  • gateway.account.sample
  • payment-method.account.sample

On some regular interval we’d collect the number of gateways and payment methods (credit cards) for each account and store it in the respective series. Each measurement looked like:

gateway.account.sample

{
  "time": 1400803300,
  "value": 2,
  "account_key": "abc123"
}

time is required by InfluxDB and is the epoch time of the measurement. value is the measurement’s value at that time, and account_key is an additional property of that measurement.
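
For reference, writing such a measurement through InfluxDB’s (0.8-era) HTTP API looks roughly like this – the host, database name and credentials are placeholders:


$ curl -X POST 'http://localhost:8086/db/mydb/series?u=root&p=root&time_precision=s' \
    -d '[{"name": "gateway.account.sample",
          "columns": ["time", "value", "account_key"],
          "points": [[1400803300, 2, "abc123"]]}]'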

Simple enough. This approach felt good and we went to production with this schema. That’s when we learned our first lesson…

Time-scoped queries

The first app that used the data in InfluxDB was our customer Dashboard product. It displays all your transactions and a simple view of your current billing counts (number of gateways and number of stored payment methods). Dashboard simply queried for the most recent measurement from each series for the current account:


select value
  from gateway.account.sample
  where account_key = 'abc123'
  limit 1

Since InfluxDB orders results most recent first by default, the limit 1 clause ensures only the most recent measurement is returned for that customer (account).

All was fine initially, but as our dataset grew into the hundreds of thousands of entries for each series, we noticed our queries were taking quite some time to complete – a constant 5s or so for every account. It turns out these queries were incurring a full table scan, hence the constant (poor) performance.

Avoid a full table scan by always time-scoping your queries

In InfluxDB, the non-time fields aren’t indexed, meaning any queries that filter on them require a full table scan (even if you’re only fetching a single result). The way to avoid a full table scan is to always time-scope your queries. Knowing this we modified our queries to only look at the previous 2 days’ worth of data (enough time to capture the most recent input):


select value
  from gateway.account.sample
  where time > now() - 2d
    and account_key = 'abc123'
  limit 1

Adding the where time > now() - 2d clause ensures that the query operates against a manageable set of data and avoids a full table scan. This dropped our query times from 5s (and growing) down to a steady 100ms – 200ms. (Keep in mind this is a remote instance of InfluxDB, meaning the bulk of that is in connection setup and network latency.)

InfluxDB response time reduction using time-scoped queries. Y-axis truncated for maximum obfuscation.

Obviously your use-case may differ wildly from ours. If your data is collected at unknown intervals, or in real-time, you don’t have the luxury of limiting your queries to a known window of time. In these situations it is wise to think about how to segment your data into series for optimal performance.

Series granularity

How many series should you have? How much data should you store in each series? When should you break out queries into their own series? These are all common questions when designing your time-series schema and, unfortunately, there is no concrete right or wrong answer. However, there are some good rules of thumb to keep in mind when structuring your data.

Continuing from our previous example: We were now using time-scoped queries to get the total number of gateways and cards for each account. While we were seeing good performance, each query was operating against a single series that contained data for all accounts. The query’s account_key condition was responsible for filtering the data by account:


select value
  from gateway.account.sample
  where time > now() - 2d
    and account_key = 'abc123'
  limit 1

As even this already time-scoped set of data grows, querying against a non-indexed field will start to become an issue. Queries whose conditions eliminate a large percentage of the data within the series should be extracted out into their own series. E.g., in our case we have a query that gets a single account’s count of stored gateways to the exclusion of all the other accounts. This is an example of a query that filters out the majority of the data in a series and should be extracted so each account has its own series.

Series are cheap. Use them liberally to isolate highly conditional data access.

If you’re coming from a SQL-based mindset, the thought of creating one series per account might seem egregious. However, it’s perfectly acceptable in time-series land. So that’s what we did – we started writing data from each account into its own series (with each series’ name including the account key). Now, when querying for an account’s total number of stored gateways we do:


select value
  from account-abc123.gateway.sample
  where time > now() - 2d
    ...

Since you have to know the key in question to access the right series, this type of design is most common with primary (or other well-known) keys. But not only can series be segmented by key – segmenting by time period is also possible. While not useful in our specific situation, you can imagine segmenting data into monthly series, e.g., 201407.gateway.sample, or some other period, depending on your access pattern.


Multi-purpose data

At this point your series are lean and efficient, well-suited to a single type of query. However, sometimes life isn’t that clean and you have one set of data that needs to be accessed in many different ways.

For instance, at Spreedly, we’d like to have a business-level set of metrics that shows the total number of gateways and payment methods across all customers. We could just dump summary-level data into a new series (not a terrible idea), but we’re already collecting this data at the customer level. It’d be nice not to have to do two writes per measurement.

Use continuous queries to re-purpose broad series by access pattern

Fortunately, InfluxDB has a feature called continuous queries that lets you modify and isolate data from one series into one or more dependent series. Continuous queries are useful when you want to “roll up” time-series data by time period (e.g., get the 99th percentile service times across 5, 10 and 15 minute periods) and also to isolate a subset of data for more efficient access. This latter application is perfect for our use-case.

To use continuous queries to support both summary and account-specific stats, we need a parent series that contains measurements for every account.

gateway.account.sample

{
  "time": 1400803300,
  "value": 2,
  "account_key": "abc123"
},
{
  "time": 1400803300,
  "value": 7,
  "account_key": "def456"
}

We can access this series directly to obtain the business-level stats we need across all customers:


select sum(value)
  from gateway.account.sample
  where time > now() - 1d

With continuous queries we can also use this parent series to spawn several “fanout” queries that isolate the data by account (replicating the account-specific series naming scheme from earlier):


select value
  from gateway.account.sample
  into account-[account_key].gateway.sample;

Notice the [account_key] interpolation syntax? This creates one series per account and stores the value field from each measurement into the new account-specific series (retaining the original measurement’s time):

account-abc123.gateway.sample

{
  "time": 1400803300,
  "value": 2
}

account-def456.gateway.sample

{
  "time": 1400803300,
  "value": 7
}

With this structure we:

  • Only write the data one time into the parent series gateway.account.sample
  • Can perform summary level queries against this parent series
  • Have access to highly efficient, constantly updated, account-specific data series (account-def456.gateway.sample, etc.)

This is a great use of fanout continuous queries. Also available are regular continuous queries which operate by precomputing expensive group by queries. I’ll skip over them for now since we’re not yet using them at Spreedly, but I encourage you to look at them for your use cases.
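
For the curious, a regular (downsampling) continuous query in this version of InfluxDB looks something like the following – the series names here are hypothetical, mirroring the 99th-percentile example above:


select percentile(value, 99)
  from response-times.api.measure
  group by time(5m)
  into response-times.api.measure.5m.99p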

Naming and structure

Series naming and packet structure is a tough topic due to personal preferences, differences in client languages and highly varied access patterns. I’m not going to label the following as best practices; instead I’ll present what we’ve found at Spreedly and our motivations, and let you decide whether it makes sense to apply them yourself.

  • Come up with a naming structure that conveys both the purpose of the series and the type of data contained within. At Spreedly it’s something like (and still evolving): [key].measured-item.[grouping].measurement-type. For instance, the series that contains the count of all gateways stored by account is gateway.account.sample. The account-specific version is: account-abc123.gateway.sample. The measurement-type component is highly influenced by the l2met logging conventions and deserves further discussion.

    • count series record, as an integer, the number of times something happened in a specific period of time. Counts can be summed with other counts in the same series to perform time-based aggregations (rollups). The number of requests or transactions per minute is an example of a count series.
    • sample series take a point-in-time measurement of some metric that supersedes all previous samples in the same series. Running totals are a good example of this type of series, e.g., total revenue to date or total number of payment methods. With each new measurement in the series, previous measurements are no longer relevant, though they may still be used to track trends over time.
    • measure series are similar to count series except that, instead of being a simple count of the number of times something happened, they can represent any unit of measure such as ms, MB, etc. Measurements are mathematically operable and can be summed, percentiled, averaged and so on. CPU load and response times are examples of measure series.
  • Often there is a single value that represents the thing being measured, with the rest of the fields being metadata or conditions. To facilitate reusable client parsing we’ve found it helpful to use the same field name across all series to represent the value of the measurement. Unsurprisingly, we chose value. Every measurement we write contains a value field holding the measured value. This makes it easy to retrieve, which is especially useful in queries that select across multiple series or even merge results from multiple series into a single result set.
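
As an illustration, a query that merges two of the account-specific series from earlier – which works cleanly only because both expose the same value field – might look like this:


select value
  from account-abc123.gateway.sample
  merge account-def456.gateway.sample
  where time > now() - 1d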

There’s a lot of subjectivity that goes into database design, independent of the storage paradigm. While SQL has been around for a while and has well-known patterns, alternative databases, including time-series databases, are a bit more of a wild west. I’m hoping that by sharing our experiences we can prevent some common mistakes, freeing you up to create all new ones of your own!

Many thanks to Paul and Todd and the rest of the InfluxDB team for their tireless guidance on the subject.

The New Gist: What It Is and What It Could Be

Gist is an incredible tool by GitHub for quickly sharing code, text and files. It has syntax highlighting and rendering for a huge number of programming languages, including Markdown for text. For many techies, myself included, Gist is an indispensable tool for quickly sharing code and content with coworkers.

Gist has been around for several years now and, when compared with the pace of development on the main GitHub.com property, has been relatively neglected. Thankfully, GitHub recently updated Gist with a fresh new codebase and UI. As a heavy user of Gist I have some thoughts on this update, where it hits the mark and where it’s still lacking.

Search has long been sorely needed in Gist. It is not uncommon for power users to have several hundred to several thousand gists, and the previous linear list view based on creation date was inadequate. Immediate recall of a gist based on a search query was the primary use-case I had in mind when creating Gisted – a tool to quickly search and access all your gists. So seeing a native Gist search feature was very welcome.

Unfortunately, it leaves a bit to be desired. Firstly, the search is case sensitive, so searching for proposal is not the same as searching for Proposal. When searching my gists I never care about the case and just want to quickly find the most relevant gist containing that term. Fortunately, I imagine this is a very easy fix on GitHub’s end and expect it will be remedied shortly (based on nothing but my intuition).

Indexing

However, more fundamentally, search seems to apply only to the description of your gist (gists don’t have titles – the closest thing is what is labeled as the description). While I try to be very conscious of writing meaningful descriptions, when I search for a gist I often use some distinct term from within its content. Searching only on the description is like searching Google only on the titles of web pages.

Consider the results from the new gist search for a term I know exists: dev

Only one result? I think not. Now against descriptions and file contents:

There are lots of relevant results missed by Gist’s search, mostly due to the lack of content indexing. Search can be a powerful utility for gists, but it still needs some indexing refinement.

Advanced operators

Relevant but basic search is a must-have for most users. Search with filtering and other operators is a must-have for power users. For instance, filtering by owner is a great way to quickly list the gists by others that you’ve starred. I like having this implemented with the @ prefix notation and use it frequently:

The new Gist doesn’t seem to have filtering or any advanced features like phrases ("exactly this") or operators (AND, -). These tend to be built-in features of any search index, so I imagine GitHub is easing into search and will turn these on once they feel comfortable with the infrastructure.

Lists

In the old Gist you really only had one way to view your gists: a list of your created or starred gists, ordered by when they were created. This was incredibly limiting. The new Gist ushers in several refinements, including:

  • The ability to sort gists by their updated date. However, this value is not sticky so I find myself always having to select it when I just want it as my default.
  • A much better partial rendering of each gist in the list, allowing you to see more of the gist and tell which one you’re looking for.

These are just a few of the things that make using Gist a little more enjoyable than the old interface.

Revisions

Although gists have always been backed by a full git repo, you didn’t see much of that benefit in the web UI. You had to clone the repo locally to see version diffs and manually fetch other remotes to compare and merge forked versions.

The new Gist takes a small step to solving this puzzle, allowing you to view diffs between your gist’s revisions.

However, it still doesn’t provide an easy way to view diffs or perform merges across forks. These are key collaboration features that would remove significant friction from working with gists. I can only hope GitHub is seriously thinking about enhancing this aspect of the product.

Content focus

The new Gist has a somewhat awkward focus on the gist files rather than the descriptions. I say awkward because, while I appreciate the direction of putting the content front and center, there are some artifacts that betray the intent.

For instance, when viewing a gist in a list it’s the filename of the gist that gets top billing:

However, gists can have multiple files, making it an odd decision to choose only one (the first) to key off of.

Additionally, if files/content are the focus, they should be a first-class citizen in search but are instead ignored (as previously discussed).

Still missing

Some miscellaneous features I was hoping would be added in the new Gist include:

  • A resurrection of comment notifications! Gisted will do this for you, but it really should be a natively supported feature.
  • Markdown editing and rendering parity with GitHub proper. If you look at a standard project README on GitHub you’ll notice you can edit the Markdown inline with decent highlighting, preview your content and, when it’s rendered, section headings are automatically anchored. Markdown is such a core feature of text-based collaboration that parity here is essential.
  • A de-emphasis of the public gist-stream (now called “Discover Gists”). I just don’t see the value of randomly browsing new gists and think that real estate could be better used.

Summary

The new Gist is definitely an improvement over the old one. However, I find it mostly just polishes existing features and doesn’t directly address some of the larger issues.

I would encourage GitHub to focus on the main uses of Gist. From my perspective, gists are used mainly as a collaboration tool. While they’re backed by a full git repo, that is mostly an implementation detail. Commenting, managing collaborator modifications, and finding gists across several sources should be well-supported use-cases.

I suspect we’ll see the pace of development on Gist quicken now that a new codebase is in place. Removing technical debt often removes roadblocks that may have prevented a product from evolving. I can only hope if I revisit this post several months from now I’ll have to significantly edit some of my more critical points.

GitHub has been incredibly supportive of my use of the Gist API, and my work with them on this front only reinforces their developer-focused reputation. They’ll get Gist right; it’s just a matter of time.

Given my dependence on Gist for work I have a vested interest in its success. Any critical points made here were done so only in hopes of seeing it evolve into a better product.

Configuring CloudFlare DNS for a Heroku App

CloudFlare is a popular, and accessible, CDN and website optimizer. If you’ve heard of Akamai then you know basically what CloudFlare does – they sit between your site and your users and accelerate your site’s content by edge-caching and other nifty on-the-fly techniques. CloudFlare also offers additional availability and security features to automatically handle DDoS and other real-world problems.

Heroku needs no introduction other than to say it’s the best place to deploy your applications.

So, how do you get the benefits of CloudFlare for your Heroku application? Until CloudFlare provides a Heroku add-on there’s a bit of manual configuration that needs to occur.

For the purpose of this post I’m assuming you already have a CloudFlare account and an existing Heroku app.

Initial setup

Since CloudFlare needs to be able to handle DDoS and other traffic-related events on your behalf, it must serve as your DNS provider. When you add a new website to CloudFlare it will scan your existing DNS records and duplicate them in the CloudFlare DNS.

Initial DNS configuration

While this is a great way to quickly bootstrap your DNS, it implements the DNS anti-pattern of using A-records to resolve to a dynamically determined IP address.

Under the covers Heroku uses multiple IP addresses. Choosing just one to bind to is a dangerous practice that can adversely affect your app’s availability. In short, you should never use A-records in your DNS on Heroku because those static IP addresses can change at any time and represent a single point of failure.

Avoid the use of A-records and root domains (ryandaigle.com is a root domain whereas www.ryandaigle.com is not) by redirecting all root domain ryandaigle.com requests to www.ryandaigle.com.

Root domain redirect

Setting up a URL redirect (or “forward” in many DNS providers’ parlance) on CloudFlare requires that you go into the “Page rules” for your site.

From your CloudFlare websites list click on the gears icon for your site and select “Page rules”.

Website settings

If you don’t see “Page rules” as an option, your site may not be fully configured. Complete the CloudFlare setup first, or go to the CloudFlare settings page where, under “Cache Purge”, you will see a link to “Page rules”.

Enter the root domain for your site and the www (or other) sub-domain to redirect to. Append a * wildcard pattern to the root domain and the $1 placeholder to the sub-domain so all requests made to the root domain are properly forwarded (e.g. ryandaigle.com/a/mypage will get forwarded to www.ryandaigle.com/a/mypage). You’ll need to turn “on” the forwarding toggle to see the sub-domain field.
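
Using this site’s domains as an example, the resulting rule maps the wildcard pattern to the forwarding target like so (the arrow is just shorthand separating the two fields):


ryandaigle.com/*  ->  http://www.ryandaigle.com/$1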

Forwarding rule

Make sure you include the http:// part of the sub-domain URL. Click “Add rule” to save the forward.

Sub-domains

If CloudFlare wasn’t able to retrieve your existing DNS settings, or you have a new Heroku app, you’ll need to make sure you have the proper CNAME DNS entries.

Map the www sub-domain to your Heroku app URL (appname.herokuapp.com) using a CNAME record.

CNAME entry
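
In plain text, the record being added is simply this (appname is a placeholder for your Heroku app’s name):


CNAME    www    appname.herokuapp.com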

Confirmation

To confirm your setup, first verify that your root domain redirects to the sub-domain. The curl utility makes this easy:


$ curl -I ryandaigle.com
HTTP/1.1 301 Moved Permanently
...
Location: http://www.ryandaigle.com/

You should see a 301 Moved Permanently response code and the proper sub-domain URL in the Location header.

As you may already know, troubleshooting DNS is notoriously difficult given the propagation lag. In my testing it took about an hour for new CloudFlare DNS settings to take effect (and this is after CloudFlare’s name servers are active for your site).
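
If you want to check on propagation yourself, the dig utility is handy (substitute your own domain). The first command shows which name servers are answering for the domain, and the second shows what the www host currently resolves to:


$ dig ryandaigle.com NS +short
$ dig www.ryandaigle.com +short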

After confirming the redirect you should also confirm that a sub-domain request passes through the CloudFlare system. Do this with a curl against the www sub-domain.


$ curl -I www.ryandaigle.com
HTTP/1.1 200 OK
Server: cloudflare-nginx
...
Set-Cookie: __cfduid=askdjfalk8a98a9sd8fa9sda9jkar8; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.ryandaigle.com

You can identify a CloudFlare-handled request by two response headers: the Server being set to cloudflare-nginx and a __cfduid cookie. If you see these two headers in the response then CloudFlare is properly handling your request. For my domain it took several hours to see these headers appear, so sleep on it if you’re not seeing this after configuring the DNS.

Conclusion

Once these DNS changes have propagated you might think it would be safe to remove the A-record. However, in my testing, you still need to keep the A-record listed in your CloudFlare DNS config or your hostname won’t resolve. The forwarding rule still works as desired and bypasses the A-record IP address but there must be an A-record listed.

If you don’t have an A-record already, add one from @ (the root domain notation) to one of the following IPs: 75.101.163.44, 75.101.145.87, 174.129.212.2.

At this point all requests to your root domain will be forwarded to their www equivalent which properly resolves to one of Heroku’s dynamically determined IP addresses. This is the appropriate setup for Heroku, and is the most robust configuration for any cloud-based environment.

Now that your DNS is properly configured I’d suggest browsing the CloudFlare app store and your site’s CloudFlare settings for all the cool toys and switches you now have at your disposal. They also have a good blog post for new users.

Using `heroku pg:transfer` to Migrate Postgres Databases

Development of most applications takes place in several disparate environments with the most common pattern being dev-staging-production. While it’s necessary for the source versions in each environment to differ it is quite useful to retain some level of data synchronicity between the environments (for example, to populate your local database with production data to diagnose a bug).

When managing environments on Heroku the recommendation has been to use Taps and heroku db:pull/heroku db:push to transfer data to and from the remote Postgres database. While Taps aimed to be database-agnostic (allowing you to import/export between different database vendors), this came at the expense of robustness and maintainability. The fragility of the tool is evident on Stack Overflow.

The Heroku pg:transfer CLI plugin is a more stable and predictable tool that automatically transfers data between two Postgres databases using native pg tools and protocols. You should consider heroku pg:transfer an immediate replacement for heroku db:pull.

Install

As more developers adopt the Twelve-Factor tenet of environment parity, the need to perform data migrations across database vendors is eliminated. This makes it possible to use a database’s native import/export tools, resulting in a much more predictable data migration process.
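
For context, such a native-tool transfer is roughly a pg_dump piped into pg_restore. A hand-rolled sketch might look like the following, where both connection URLs are placeholders and URL support assumes a reasonably recent Postgres client:


$ pg_dump -Fc "$REMOTE_DATABASE_URL" | pg_restore --clean --no-owner --no-acl -d "$LOCAL_DATABASE_URL"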

Most apps on Heroku are already using the incredible Heroku Postgres service and should be running Postgres locally as well. If not, the Postgres.app project will get you up and running on OSX in minutes.

Install the plugin by running the following from the terminal:


$ heroku plugins:install https://github.com/ddollar/heroku-pg-transfer

Confirm the plugin installation by running the pg:transfer --help command:


$ heroku pg:transfer --help
Usage: heroku pg:transfer

 transfer data between databases

 -f, --from DATABASE  # source database, defaults to DATABASE_URL on the app
 -t, --to   DATABASE  # target database, defaults to local $DATABASE_URL

Download

pg:transfer has a very simple purpose – to transfer data from one Postgres db to another. As such it only requires two arguments, the source and target database locations (in its vernacular the “to” and “from”).

If you’re in the root folder of an app already deployed to Heroku, the “from” and “to” will be assumed to be the location specified by the DATABASE_URL config var on the remote app and the local DATABASE_URL environment variable, respectively. In other words, by default, heroku pg:transfer will export from your Heroku database and import into your local development database.


$ heroku pg:transfer
Source database: HEROKU_POSTGRESQL_BLACK (DATABASE_URL) on someapp.herokuapp.com
Target database: someapp on localhost:5432

 !    WARNING: Destructive Action
 !    This command will affect the app: someapp
 !    To proceed, type "someapp" or re-run this command with --confirm someapp

> someapp
pg_dump: reading schemas
pg_dump: reading user-defined tables
...

If the local database you want to import to isn’t set in your environment, you can quickly set it just for the pg:transfer command with the env utility:


$ env DATABASE_URL=postgres://localhost/someapp-dev heroku pg:transfer

Apps using Foreman for local process management can quickly provision the correct environment variables from the .env file with: source .env && heroku pg:transfer.

Upload

If you want to push data from your local environment to your Heroku database you’ll need to specify the to and from flags to reverse the default direction of the transfer. For added convenience pg:transfer is aware of the Heroku Postgres COLOR naming scheme.


$ heroku config | grep POSTGRES
HEROKU_POSTGRESQL_JADE_URL: postgres://ads8a8d9asd:al82kdau78kja@ec2-23-23-237-0.compute-1.amazonaws.com:5432/resource123

$ heroku pg:transfer --from $DATABASE_URL --to jade --confirm someapp
...

Transfer

While pushing data from a local db to a remote one is of limited usefulness, a more common use-case is to transfer between two remote databases – for instance, populating a staging or test environment with production data.

To transfer data between databases on different applications specify the full connection info of the target database in the --to flag.


$ heroku pg:transfer --to `heroku config:get DATABASE_URL -a app-staging` --confirm someapp
Source database: HEROKU_POSTGRESQL_JADE on someapp.herokuapp.com
Target database: kai89akdkaoa on ec2-23-21-45-234.compute-1.amazonaws.com:5742

pg_dump: reading schemas
pg_dump: reading user-defined tables
...

Here the heroku config:get command is used to fetch the full PG connection info for the target app.
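
Run on its own, config:get simply prints the raw value of the named config var (the credentials below are placeholders):


$ heroku config:get DATABASE_URL -a app-staging
postgres://user:password@ec2-00-00-00-00.compute-1.amazonaws.com:5432/dbname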

Outside Heroku

Somewhat unintuitively for a Heroku CLI plugin, you can also use pg:transfer to transfer data between two databases that are not associated with Heroku at all. Since the plugin accepts raw connection URLs for both the --from and the --to locations, you’re really not limited in how you use it.

As Postgres’ popularity grows, more and more platforms are supporting the database. Transferring data between Engine Yard Postgres databases, between dbs on EY and Heroku, or from EY to your local database is simple. Though you need the heroku command on your path, the tool itself is really quite agnostic.

To transfer from EY PG to your local database at the env var DATABASE_URL:


$ heroku pg:transfer --from "postgres://deploy:password@127.0.0.1:5433/dbname" --to $DATABASE_URL

To gain ingress to an EY Postgres database, follow the SSH tunnel instructions here. Hence the local 127.0.0.1 connection URL in this example.

The Heroku pg:transfer CLI plugin is a much more stable and flexible tool for migrating data between Postgres databases than the Taps gem and the corresponding db:pull or db:push commands. Use it to robustly manage data transfer between Postgres databases.

Five Interesting Things You Can Do with Heroku Buildpacks

  • Static content
  • Deploy elsewhere
  • The null buildpack
  • Multiple buildpacks


Using Vulcan to Build Binary Application Dependencies for Heroku

Managing an application’s code dependencies, once a source of constant pain and conflict, is now a solved problem in most modern languages. Ruby has Bundler, Node.js has npm, Python has pip, Clojure has Leiningen… the list continues.

What remains unsolved is how to declare and manage system-level dependencies – external binaries on which your application is dependent. This article explores explicitly managing and building system-level dependencies using the Vulcan build server.

Problem

It’s common for an application to shell out to a local executable to perform some computationally intense work where compiled libraries are more performant or robust. Examples include Ghostscript, which your application might use to manipulate PDF or PostScript files, and ImageMagick for resizing and cropping uploaded images.

Many deployment services attempt to fill this need by providing a set of common binaries bundled into every environment. This is a fragile approach that boxes you into outdated or incompatible library versions.

Having the ability to install system-wide binaries in your deployment environment is also a poor solution. It merely shifts the burden of dependency management from the service provider to you, the application developer.

Twelve-factor is firm in its stance on system-level dependencies:

Twelve-factor apps also do not rely on the implicit existence of any system tools… While these tools may exist on many or even most systems, there is no guarantee that they will exist on all systems where the app may run in the future, or whether the version found on a future system will be compatible with the app. If the app needs to shell out to a system tool, that tool should be vendored into the app.

Solution

Vendoring a binary dependency requires that the binary be built specifically for the remote environment’s operating system and processor architecture. Even with the benefits of virtual machine software this is a non-trivial task for developers.

A better solution is to use the remote environment of your service provider to compile and build the required dependencies. Such a process loosely adheres to the following steps:

  1. Specify the required library’s source
  2. Get a remote shell to your production environment
  3. Download the library source in the remote shell
  4. Compile the library remotely
  5. Download the compiled library for use in your application’s source-tree

While this process is not a lengthy one, your eye should spot that steps 2-5 are ripe for automation. That’s where Vulcan comes in.

Vulcan is a utility that bundles and uploads a source-tree to a remote environment, runs the necessary compilation commands on the source tree, and downloads the resulting binary – all in a single command. Vulcan consists of a Ruby-based CLI and Node.js server and, though built by and for Heroku, is platform-agnostic.

Setup

Installing the Vulcan CLI is simply a matter of installing the vulcan gem:

These steps assume you have git and ruby available from the command line and have already signed up for a Heroku account. The Heroku Toolbelt can get you up and running if you’re missing any components.


$ gem install vulcan
Please run 'vulcan update' to update your build server.
Successfully installed vulcan-0.7.1
1 gem installed

Vulcan’s build server is a simple Node.js app that runs in the same target environment where your application will be deployed. If your app runs on Heroku, Vulcan can deploy itself to Heroku with vulcan create appname.

You will need a verified Heroku account before running this command as Vulcan requires the use of a (free) add-on.


$ vulcan create buildserver-you
Creating buildserver-you... done, stack is cedar
http://buildserver-you.herokuapp.com/ | git@heroku.com:buildserver-you.git
Initialized empty Git repository in /private/var/folders/Uz/UzRCgjzkGIi7Iqz9QNi6NUDrHf6/-Tmp-/d20120614-31875-1qjw1d6/.git/
Counting objects: 883, done.
...

-----> Heroku receiving push
-----> Node.js app detected
...
       Dependencies installed
-----> Discovering process types
       Procfile declares types -> web
-----> Compiled slug size is 4.1MB
-----> Launching... done, v3
       http://buildserver-you.herokuapp.com deployed to Heroku

The Vulcan build server is now running on Heroku at http://buildserver-you.herokuapp.com as a (free) single-dyno app. It’s important to note there’s nothing blessed about Vulcan running on Heroku. It’s just a normal application running in user-space, giving you all the visibility and management tools you’re used to.

If you’ve manually deployed the build server to another provider you’ll need to set its location so the CLI knows where to send build tasks. This is done by setting the VULCAN_HOST env var: $ export VULCAN_HOST=http://myserver.domain.com.

Build

The Ghostscript library utilized in the processing PDF files tutorial on the Heroku Dev Center makes for a good example of building a binary application dependency.

Download source

First, download and expand the source for the Ghostscript project. At the time of this writing it is located at http://downloads.ghostscript.com/public/ghostscript-9.05.tar.gz.

For the purpose of this article Heroku will be assumed to be the target environment. Heroku’s dynos run on a 64-bit Linux kernel. If your target environment differs you will need to select the correct source distribution.


$ wget http://downloads.ghostscript.com/public/ghostscript-9.05.tar.gz
$ tar -xvzf ghostscript-9.05.tar.gz 
x ghostscript-9.05/
x ghostscript-9.05/base/
x ghostscript-9.05/base/szlibxx.h
...

You now have a Ghostscript source directory at ./ghostscript-9.05.

Remote compilation

Next, use the Vulcan CLI to initiate a build task on the build server with vulcan build. This will send the source to the target environment for compilation. The only required argument is -s, the location of the Ghostscript source directory. The -v flag (verbose) will show the output from the compilation process and is recommended as, depending on the library, compilation can take some time and it’s useful to see its progress.


$ vulcan build -v -s ./ghostscript-9.05
Packaging local directory... done
Uploading source package... done
Building with: ./configure --prefix /app/vendor/ghostscript-9 && make install
checking for gcc... gcc
checking whether the C compiler works... yes
...
>> Downloading build artifacts to: /tmp/ghostscript-9.tgz
   (available at http://buildserver-you.herokuapp.com/output/45380fa6-e02a-479b-a7af-d9afb089b81f)

On completion of the build process the resulting binaries are packaged and downloaded for you. In this example the binary package can be found locally at /tmp/ghostscript-9.tgz and remotely at http://buildserver-you.herokuapp.com/output/45380fa6-e02a-479b-a7af-d9afb089b81f.

Although the Vulcan source indicates the ability to fetch source packages directly from a URL (e.g. $ vulcan build -s http://downloads.ghostscript.com/public/ghostscript-9.05.tar.gz), this tends to result in build errors. At this time the most reliable method is to manually download and expand the source before passing it to vulcan.

Customization

Looking at the output from the Ghostscript example you can see that a sensible autoconf-based command is chosen for you: ./configure --prefix /app/vendor/ghostscript-9 && make install. If you need to specify a non-default build command you can do so with the -c flag. Here is an example adding the --without-ssl configure option when building wget.


$ vulcan build -v -s ./wget-1.13 -c "./configure --prefix /app/vendor/wget-1.13 --without-ssl && make install" -p /app/vendor/wget-1.13
Packaging local directory... done
Uploading source package... done
Building with: ./configure --without-ssl && make install
configure: configuring for GNU Wget 1.13
...
>> Downloading build artifacts to: /tmp/wget-1.tgz
   (available at http://buildserver-you.herokuapp.com/output/66ba3e2b-77ef-4409-acc2-fca70650c318)

The -p (prefix) flag is also used here to tell Vulcan where to look on the build server for the compiled artifacts (/app/vendor/wget-1.13). To avoid ambiguities it’s best to specify this value and to set it to the same value as the --prefix flag passed to ./configure.

Vendoring

Once you have a binary appropriate for use on Heroku you need to vendor it within your application. Though conventions vary by language, the approach is similar across them all. Vendoring the wget executable is a straightforward process.

Create a vendor directory that will house the wget binary and copy in the executable from the vulcan build results.


$ mkdir -p vendor/wget/bin
$ cp /tmp/wget-1/bin/wget vendor/wget/bin/

The null buildpack can be used to test wget in isolation. Create a Heroku app that consists only of this vendor directory and specify the null buildpack.


$ git init
$ git add .
$ git commit -m "Vendored wget"

$ heroku create --buildpack https://github.com/ryandotsmith/null-buildpack
Creating severe-water-5643... done, stack is cedar
BUILDPACK_URL=https://github.com/ryandotsmith/null-buildpack
http://severe-water-5643.herokuapp.com/ | git@heroku.com:severe-water-5643.git
Git remote heroku added

Add vendor/wget/bin to the app’s PATH and deploy it to Heroku.


$ heroku config:add PATH=vendor/wget/bin:/usr/bin:/bin
Setting config vars and restarting severe-water-5643... done, v6
PATH: vendor/wget/bin:/usr/bin:/bin

$ git push heroku master
...
-----> Heroku receiving push
-----> Fetching custom buildpack... done
-----> Null app detected
-----> Nothing to do.
-----> Discovering process types
       Procfile declares types -> (none)
-----> Compiled slug size is 172K
-----> Launching... done, v4
       http://severe-water-5643.herokuapp.com deployed to Heroku

To git@heroku.com:severe-water-5643.git
 * [new branch]      master -> master

To test that the compiled version of wget works use a one-off dyno to run a test command:


$ heroku run wget http://www.google.com/images/logo_sm.gif
Running wget http://www.google.com/images/logo_sm.gif attached to terminal... up, run.1
...
HTTP request sent, awaiting response... 200 OK
Length: 3972 (3.9K) [image/gif]
Saving to: 'logo_sm.gif'

100%[=============================================================>] 3,972       --.-K/s   in 0s

2012-07-10 14:48:48 (200 MB/s) - 'logo_sm.gif' saved [3972/3972]

The command was successfully invoked on Heroku using the vulcan-compiled wget executable and the output (but not the fetched file) was streamed to your local shell.

Visibility

There are several utilities available to you to introspect the remote compilation process.

Logging

Outside of using the -v flag to see the build output, you can also use the logs of the Vulcan build server to gain better visibility into the process. Since the Vulcan build server is just a Node.js app running in your target environment this is a trivial task. On Heroku, use heroku logs with the -a flag and the name of your build server app.


$ heroku logs -t -a buildserver-you
2012-07-04T15:29:20+00:00 app[web.1]: [7f3a7510-400a-44d4-9132-66a2e6c878a5] spawning build
2012-07-04T15:29:20+00:00 app[web.1]: valid socket
2012-07-04T15:29:21+00:00 heroku[run.1]: Awaiting client
2012-07-04T15:29:21+00:00 heroku[run.1]: Starting process with command `bin/make "7f3a7510-400a-44d4-9132-66a2e6c878a5"`
2012-07-04T15:29:22+00:00 heroku[run.1]: State changed from starting to up
2012-07-04T15:30:10+00:00 heroku[run.1]: Process exited with status 1
2012-07-04T15:30:10+00:00 heroku[run.1]: State changed from up to complete

By default Vulcan will spawn a one-off dyno to perform the build command. This is evident from the run.1 dyno in the log output. In some circumstances this can result in a billable event if your combined web and one-off dyno usage exceeds 750 hours in any one month.

If this small overage is meaningful to you, you can sacrifice compilation concurrency and force Vulcan to execute the compilation command in-process with the SPAWN_ENV config var.


$ heroku config:add SPAWN_ENV=local -a buildserver-you
Setting config vars and restarting buildserver-you... done, v11
SPAWN_ENV: local

The build command will then execute within the web process:


$ heroku logs -t -a buildserver-you
2012-07-04T15:35:43+00:00 app[web.1]: [bbeab2b8-3941-4d41-94af-24c4f0fa65c0] spawning build
2012-07-04T15:36:33+00:00 app[web.1]: 10.125.41.68 - - [Wed, 04 Jul 2012 15:36:33 GMT] "GET /output/bbeab2b8-3941-4d41-94af-24c4f0fa65c0 HTTP/1.1" 200 - "-" "Ruby"
2012-07-04T15:36:33+00:00 heroku[router]: GET buildserver-you.herokuapp.com/output/bbeab2b8-3941-4d41-94af-24c4f0fa65c0 dyno=web.1 queue=0 wait=0ms service=35ms status=200 bytes=75

While this fails to adhere to the background job pattern, it may be acceptable for your use-case.

Remote shell

If a build fails it is often useful to be able to view the failed artifacts. Since Vulcan performs its work in temporary directories, their contents are cleaned up after each build. However, a remote shell can be used to manually invoke the build command and navigate the build results.

At the start of every build request Vulcan outputs log statements resembling the following:


2012-07-04T18:57:10+00:00 app[web.1]: [6869e68a-492d-4a7e-8b27-64352811d7dc] saving to couchdb
2012-07-04T18:57:10+00:00 app[web.1]: [6869e68a-492d-4a7e-8b27-64352811d7dc] saving attachment - [id:6869e68a-492d-4a7e-8b27-64352811d7dc rev:1-722a96f6734a3511efd73b7cfb9a2aed]

The attachment id, here 6869e68a-492d-4a7e-8b27-64352811d7dc, is all that’s needed to manually invoke the build yourself. Establish a remote shell to the Vulcan build server environment. On Heroku you can use heroku run bash:


$ heroku run bash -a buildserver-you
~ $

Then invoke the bin/make command with the attachment id. You will see the output of the build process and can then browse the output directory yourself.


$ bin/make "6869e68a-492d-4a7e-8b27-64352811d7dc"
configure: configuring for GNU Wget 1.13
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
...
$ cd /app/vendor/wget-1.13

Updates

Updating Vulcan is simple. Update the CLI using RubyGems:


$ gem install vulcan
Please run 'vulcan update' to update your build server.
Successfully installed vulcan-0.8.0
1 gem installed

And use the vulcan update command to update the build server.

Be aware that updating the build server while active compilations are running will cause them to be aborted.


$ vulcan update
Initialized empty Git repository in /private/var/folders/tt/7f38d4b14qq5xglpj3yl0smr0000gn/T/d20120704-53816-m4n0rn/.git/
Counting objects: 883, done.
...

-----> Heroku receiving push
-----> Node.js app detected
-----> Resolving engine versions
       Using Node.js version: 0.6.18
       Using npm version: 1.1.4
...
-----> Launching... done, v15
       http://buildserver-you.herokuapp.com deployed to Heroku

To git@heroku.com:buildserver-you.git
 + a5f27be...a934704 master -> master (forced update)

Troubleshooting

Invalid secret

If you’re working across multiple development environments or using some other non-default workflow, you may see the following build server error logged when attempting to invoke a build:


2012-07-04T17:39:47+00:00 app[web.1]: [672b5df6-ad8c-49ed-9831-515207e2dc4f] ERROR: invalid secret
2012-07-04T17:39:47+00:00 app[web.1]: invalid secret

This occurs when the CLI secret hash, created when the build server was created with vulcan create, either doesn’t exist or doesn’t match the server-side secret. The most common cause is that the ~/.vulcan configuration file doesn’t exist in your environment. You can create it with the following contents:

~/.vulcan

--- 
:app: buildserver-you
:host: buildserver-you.herokuapp.com
:secret: reallylonghash12df

If you don’t have access to your original .vulcan file you can find your secret on Heroku using heroku config:


$ heroku config:get SECRET -a buildserver-you
reallylonghash12df

Copy your ~/.vulcan file to each development machine from which you wish to invoke builds.
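
Something as simple as scp works for this (the host name is a placeholder):


$ scp ~/.vulcan you@other-dev-machine:~/.vulcan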

Heroku binaries

During the course of writing this article the following binaries were compiled for use on Heroku.

If you’d like to list a Heroku binary here, please send a pull request.

  • GNU Wget v1.13 – build command: vulcan build -v -s ./wget-1.13 -c "./configure --prefix /app/vendor/wget-1.13 --without-ssl && make install" -p /app/vendor/wget-1.13 (binary: download, contributor: @rwdaigle)
  • ImageMagick v6.7.8-1 – build command: vulcan build -v -s ./ImageMagick-6.7.8-1 (binary: download, contributor: @rwdaigle)
  • Ghostscript v9.05 – build command: vulcan build -v -s ./ghostscript-9.05 (binary: download, contributor: @rwdaigle)

Deploying a Nesta CMS Blog with Pygments Syntax Highlighting to Heroku

Blogging and/or setting up a simple site should be a simple proposition. There are a lot of great frameworks out there that handle the software portion of running such a site. However, you don’t just want a stock setup. You have to take into account proper asset caching for performance, slick syntax highlighting, an aesthetically pleasing theme, app instrumentation, feed redirection and production deployment.

I’ve gone ahead and boiled down all these concerns into just a few steps based on the Nesta CMS framework.

If you’re not much for foreplay, a fully deployable starter template of this site can be found here on GitHub and seen running here on Heroku.

Background

Having wrestled with quite a few blogging engines in the past I had several requirements of a new setup. Firstly it had to support a workflow that lets me write on my local machine using the tools I prefer, namely markdown formatted articles composed with IA Writer or a basic text editor.

Second, it had to support a git-based workflow. My content is going to live in git and there’s no reason the publishing platform shouldn’t build on top of that as well. This also plays well with Heroku deployments.

Static site generators are all the rage and fulfill the first two requirements. However, I’ve found them to be rather rigid and obtrusive for the very incremental edit-view-edit workflow I assume when writing. My last requirement was that I could write and immediately refresh my browser to see the fully rendered site running locally. Waiting for the whole site to generate on every minor edit proved to be far too slow for me in the past.

Fortunately, there’s a better way.

Landscape

The list of dynamic file-backed Heroku-friendly blog engines isn’t particularly long. I investigated both Toto and Nesta CMS and, after a brief wrestle trying to get Toto’s HTTP request headers to play nice with rack-cache, settled on Nesta. Nesta is under active development and is written with Sinatra, the very simple and hackable web framework for Ruby.

For deployment Heroku is the obvious choice given its seamless git-based workflow and variety of add-ons. I also work there.

These steps assume you have git and ruby available from the command line and have already signed up for a Heroku account. The Heroku Toolbelt can get you up and running if you’re missing any components.

Template

Though the Nesta quick-start is solid, as are all their docs, we can skip ahead by using an app template. I’ve created one on GitHub that’s already set up for syntax highlighting with Pygments, the “clean” theme you see running this site, and the minimal artifacts needed to quickly deploy and provision a full-featured Heroku app.

Fork the starter template using the “Fork” button on the template GitHub page.

Fork starter template screenshot

This will fork it to your GitHub account. From there you can clone your fork locally. Find the repository URL for your fork and copy it (your URL will differ from the one shown below).

Repository URL screenshot

Clone the app template to your local environment using git. Use the domain name of your site instead of mysite.com.


$ git clone git@github.com:rwdaigle/nesta-app-template.git mysite.com
Cloning into mysite.com...
remote: Counting objects: 72, done.
remote: Compressing objects: 100% (38/38), done.
remote: Total 72 (delta 29), reused 63 (delta 20)
Receiving objects: 100% (72/72), 11.69 KiB, done.
Resolving deltas: 100% (29/29), done.

The application’s source is now installed locally in the mysite.com directory.

Run

Now that the site template is present in the local environment you can install required dependencies and render the site locally before deploying to a remote server environment. A bootstrap.sh script is provided for your convenience.

The bootstrap.sh script does not use sudo or run any destructive commands. However, please review the script source before executing it.

$ cat bootstrap.sh
$ ./bootstrap.sh 
Using RedCloth (4.2.9) 
Using addressable (2.2.7) 
# ...
Submodule path 'themes/clean': checked out '889e094749008d2bf4ecf901555fce44c7f7bc87'

Once bootstrap has finished start the app using the foreman utility.


$ foreman start
14:25:47 web.1     | started with pid 59647

Opening http://localhost:5000 should display the site running with a single getting started article listed on the home page. Any errors that occur will be shown in the terminal where you entered the foreman start command.

Deploy

Assuming you have a Heroku account and have successfully installed the Heroku Toolbelt, you can use the provided helper script to quickly deploy the site, install any dependencies and set up the appropriate configuration.

The app deployed to Heroku will not incur any charges on Heroku.

$ cat deploy.sh
# ... review script source ...

$ ./deploy.sh 
Creating vivid-sword-9170... done, stack is cedar
Adding memcache to vivid-sword-9170... done
# ...
Opening http://vivid-sword-9170.herokuapp.com/

Next

You’ve forked your own copy of the app template, got it running locally and deployed it for free to Heroku. Not bad for a few minutes of your time! To customize the site, set up analytics and write your first post, go ahead and read the welcome post included in your new site (a copy can be found here).

Welcome post screenshot

My hope is this template and theme eliminates many of the sticking points associated with taking a great framework like Nesta and turning it into a running, usable and deployed site. Let me know if you run into any issues (or better yet, submit a pull request to the template or theme projects on GitHub).


RyanDaigle.com 2012-02-12 17:00:00

As seems to be tradition amongst the Nerderati, it’s common to explain the particulars of your blogging setup anytime there’s a change. Today is such an occasion. After a significant hiatus from any meaningful writing I’ve decided it’s time to get back in the saddle.

If you’re not much for foreplay, a fully deployable version of this site can be found at http://github.com/rwdaigle.

Requirements

Having wrestled with quite a few blogging engines in the past, I had several requirements for a new setup. First, it had to support a workflow that lets me write on my local machine using the tools I prefer, namely markdown-formatted articles composed with IA Writer or a basic text editor.

Second, it had to support a git-based workflow. My content is going to live in git and there’s no reason the publishing platform shouldn’t build on top of that as well. This also plays well with Heroku deployments.

Static site generators are all the rage and fulfill the first two requirements. However, I’ve found them to be rather rigid and obtrusive for the very incremental edit-view-edit workflow I assume when writing. My last requirement was that I could write and immediately refresh my browser to see the fully rendered site running locally. Waiting for the whole site to generate on every minor edit proved to be far too slow for me in the past.

Fortunately, there’s a better way.

Choices

The list of dynamic file-backed Heroku-friendly blog engines isn’t particularly long. I investigated both Toto and Nesta CMS and, after a brief wrestle trying to get Toto’s HTTP request headers to play nice with rack-cache, settled on Nesta. Nesta is under active development and is written with Sinatra, the very simple and hackable web framework for Ruby.

For deployment Heroku is the obvious choice given its seamless git-based workflow and variety of add-ons. I also work there.

Overview

Deploying a Nesta-backed blog to Heroku is quite simple. However, you don’t just want a stock setup. You have to take into account proper asset caching for performance, slick syntax highlighting, an aesthetically pleasing theme, app instrumentation, and production deployment. I’ve gone ahead and boiled down all these concerns into just a few steps.

These steps assume you have git and ruby available from the command line and have already signed up for a Heroku account. The Heroku Toolbelt can get you up and running if you’re missing any components.

Clone Nesta CMS starter app

Though the Nesta quick-start is solid, as are all their docs, we can skip ahead by using an app template. I’ve created one on GitHub that’s already set up for syntax highlighting with Pygments, the “clean” theme you see running this site and the minimal artifacts needed to quickly deploy and provision a full-featured Heroku app.

Clone the app template to your local environment using git. Use the domain name of your site instead of mysite.com.


$ git clone XXX mysite.com

Install the required Ruby dependencies using Bundler.


$ bundle install

That’s all it takes to get the app installed locally. Start the app with:


$ foreman start

and view it in your browser at http://localhost:5000.

Provision Heroku app

Deploy

Write

Options

cache timeout heroku addons nesta themes pygments CSS


RyanDaigle.com 2012-02-11 17:00:00

This is a test of the various markup capabilities of Pygments running within Ruby on Heroku styling a Nesta CMS backed blog.

Languages

ruby

def greeting
  'Hello World!'
end
lib/heroku.js

var request = require('request'),
  fs = require('fs'),
  spawn = require('child_process').spawn,
  Hash = require('hashish');

var version = JSON.parse(fs.readFileSync('package.json','utf8')).version;

Command prompts


$ curl "http://gist.github.com/raw/13212qw" > test.txt


Site Relaunch

I’m currently in the middle of putting a fresh coat of paint on RyanDaigle.com, the previous home of the What’s new in Edge Rails series. All old links to http://ryandaigle.com/articles will be redirected to an archived version of the site at http://archives.ryandaigle.com. I doubt all links will make the change, but I’ve given it my best effort.

I don’t have any big plans for the site beyond a renewed effort at writing more. Stay tuned.

What’s New in Edge Rails: Skinny on Scopes

I go into a detailed explanation of using ActiveRecord scopes in Rails 3 over on EdgeRails.info.

I won’t be cross-posting for too much longer, so update your feed to the new EdgeRails feed to keep abreast of the latest and greatest!



‘What’s New in Edge Rails’ Moves to EdgeRails.info

For a while I’ve wanted to move the “What’s New in Edge Rails” series to its own site to reflect the fact that it is an independent and self-sustaining series and not some small figment of my mind anymore. I started writing the What’s New series about four years ago and it’s clear it needs to be treated like a first-class citizen. While the move is still a work in progress, I’m proud to say that EdgeRails.info is now live and is where all future What’s New in Edge Rails content will be published (including some Rails 3 updates).

EdgeRails.info logo

I won’t repeat too much here, but one of the big changes is that I want to take a much more community driven approach to bringing you the latest in updates to the framework and will be harnessing a GitHub-centric process towards letting you both contribute and update posts.

So update your feed and head over to EdgeRails.info. I’m all ears, so flame away if you’re feeling so inclined. And thanks for all your contributions, comments and feedback over the past four years; they’ve made the work worthwhile and I hope I can continue the momentum on the new site.

I’ll probably give EdgeRails.info a few weeks to stand on its own before flipping the DNS switch, at which point all links to articles here will be redirected to EdgeRails.

See you on the flip side, home-slice.



What’s New in Edge Rails: Set Flash in redirect_to


This feature is scheduled for: Rails v2.3 stable


Rails’ flash is a convenient way of passing objects (though mostly used for message strings) across http redirects. In fact, every time you set a flash parameter, the very next step is often to perform your redirect with redirect_to:

class UsersController < ApplicationController
  def create
    @user = User.create(params[:user])
    flash[:notice] = "The user was successfully created"
    redirect_to user_path(@user)
  end
end

I know I hate to see two lines of code where one makes sense – in this case what you’re saying is to “redirect to the new user page with the given notice message” – something that seems to make more sense as a singular command.

DHH seems to agree and has added :notice, :alert and :flash options to redirect_to to consolidate commands. :notice and :alert automatically set the flash parameters of the same name, and :flash lets you get as specific as you want. For instance, to rewrite the above example:

class UsersController < ApplicationController
  def create
    @user = User.create(params[:user])
    redirect_to user_path(@user), :notice =>"The user was successfully created"
  end
end

Or to set a non :alert/:notice flash:

class UsersController < ApplicationController
  def create
    @user = User.create(params[:user])
    redirect_to user_path(@user), :flash => { :info => "The user was successfully created" }
  end
end

I’ve become accustomed to setting my flash messages in :error, :info and sometimes :notice, so the choice to provide only :alert and :notice accessors feels somewhat constraining to me, but maybe I’m loopy in my choice of flash param names.

Whatever your naming scheme, enjoy the new one-line redirect!

tags: ruby,
rubyonrails



What’s New in Edge Rails: Independent Model Validators


This feature is scheduled for: Rails v3.0


ActiveRecord validations, ground zero for anybody learning about Rails, got a lil’ bit of decoupling mojo today with the introduction of validator classes. Until today, the only options you had for defining a custom validation were overriding the validate method or using validates_each, both of which pollute your models with gobs of validation logic.
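
For contrast, here’s roughly what the old, in-model approach looks like. This is just a sketch using the long-standing validates_each API; the attribute name and regex are illustrative only:


class User < ActiveRecord::Base
  # All of the validation logic lives inside the model itself.
  validates_each :email do |record, attr, value|
    record.errors.add(attr, "is not valid") unless
      value =~ /^([^@\s]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})$/i
  end
end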

ActiveRecord Validators

Validators remedy this by containing granular levels of validation logic that can be reused across your models. For instance, for that classic email validation example we can create a single validator:

class EmailValidator < ActiveRecord::Validator
  def validate()
    record.errors[:email] << "is not valid" unless
      record.email =~ /^([^@\s]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})$/i
  end
end

Each validator should implement a validate method, within which it has access to the model instance in question as record. Validation errors can then be recorded on the model by appending to its errors collection, as in this example.

So how do you tell a validator to operate on a model? With validates_with, which takes the class of the validator:

class User < ActiveRecord::Base
  validates_with EmailValidator
end

Validation Arguments

This is all well and good, but it’s a pretty brittle solution in this example, as the validator assumes an email field. We need a way to pass in the name of the field to validate against for a model class that is unknown until runtime. We can do this by passing options to validates_with, which are then made available to the validator at runtime as the options hash. So let’s update our email validator to operate on an email field that can be set by the model requiring validation:

class EmailValidator < ActiveRecord::Validator
  def validate()
    email_field = options[:attr]
    record.errors[email_field] << "is not valid" unless
      record.send(email_field) =~ /^([^@\s]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})$/i
  end
end

And to wire it up from the user model:

class User < ActiveRecord::Base
  validates_with EmailValidator, :attr => :email_address
end

Any arguments can be passed into your validators by hitching a ride onto this options hash of validates_with.

Options & Notes

There are also some built-in options that you’ll be very familiar with, namely :on, :if and :unless, which define when the validation will occur. They work the same as the options to built-in validations like validates_presence_of.

class User < ActiveRecord::Base
  validates_with EmailValidator, :if => Proc.new  { |u| u.signup_step > 2 },
    :attr => :email_address
end

It’s also possible to specify more than one validator with validates_with:

class User < ActiveRecord::Base
  validates_with EmailValidator, ZipCodeValidator, :on => :create
end

While this might seem like a pretty minor update, it allows for far better reusability of custom validation logic than what’s available now. So enjoy.

tags: ruby,
rubyonrails



What’s New in Edge Rails: Default RESTful Rendering


This feature is scheduled for: Rails v3.0


A few days ago I wrote about the new respond_with functionality of Rails 3. It’s basically a clean way to specify the resource to send back in response to a RESTful request. This works wonders for successful :xml and :json requests where the default response is to send back the serialized form of the resource, but it still leaves a lot of cruft when handling user-invoked :html requests (i.e. ‘navigational’ requests) and requests where error handling is required. For instance, consider your standard create action:

class UsersController < ApplicationController

  respond_to :html, :xml, :json

  def create

    @user = User.new(params[:user])

    # Have to always override the html format to properly
    # handle the redirect
    if @user.save
      flash[:notice] = "User was created successfully."
      respond_with(@user, :status => :created, :location => @user) do |format|
        format.html { redirect_to @user }
      end

    # Have to send back the errors collection if they exist for xml, json and
    # redirect back to new for html.
    else
      respond_with(@user.errors, :status => :unprocessable_entity) do |format|
        format.html { render :action => :new }
      end
    end

  end
end

Even with the heavy lifting of respond_with you can see that there’s still a lot of plumbing left for you to do – plumbing that is mostly the same for all RESTful requests. Well José and the Rails team have a solution to this and have introduced controller responders.

Controller Responders

Controller responders handle the chore of matching the HTTP request method and the resource format type to determine what type of response should be sent. And since REST is so well-defined it’s very easy to establish a default responder to handle the basics.

Here’s what a controller utilizing responder support (now baked into respond_with) looks like:

class UsersController < ApplicationController

  respond_to :html, :xml, :json

  def index
    respond_with(@users = User.all)
  end

  def new
    respond_with(@user = User.new)
  end

  def create
    respond_with(@user = User.create(params[:user]))
  end

  def edit
    respond_with(@user = User.find(params[:id]))
  end

  def update
    @user = User.find(params[:id])
    @user.update_attributes(params[:user])
    respond_with(@user)
  end
end

The built-in responder performs the following logic for each action (sketched as code after this list):

  • If the :html format was requested:
    • If it was a GET request, invoke render (which will display the view template for the current action)
    • If it was a POST request and the resource has validation errors, render :new (so the user can fix their errors)
    • If it was a PUT request and the resource has validation errors, render :edit (so the user can fix their errors)
    • Else, redirect to the resource location (i.e. user_url)
  • If another format was requested (i.e. :xml or :json)
    • If it was a GET request, invoke the :to_format method on the resource and send that back
    • If the resource has validation errors, send back the errors in the requested format with the :unprocessable_entity status code
    • If it was a POST request, invoke the :to_format method on the resource and send that back with the :created status and the :location of the newly created resource
    • Else, send back the :ok response with no body
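
Expressed as code, the decision tree above looks roughly like the following. This is only an illustrative sketch of the behavior described here, not the actual Rails responder source; assume the surrounding responder object provides request, resource and format helpers:


def respond
  if format == :html
    if request.get?
      render                                  # render the current action's view template
    elsif request.post? && resource.errors.any?
      render :new                             # let the user fix their errors
    elsif request.put? && resource.errors.any?
      render :edit
    else
      redirect_to resource                    # i.e. redirect to the resource location
    end
  else # :xml, :json, etc.
    if request.get?
      render format => resource               # invokes to_xml / to_json on the resource
    elsif resource.errors.any?
      render format => resource.errors, :status => :unprocessable_entity
    elsif request.post?
      render format => resource, :status => :created, :location => resource
    else
      head :ok                                # empty body with an :ok status
    end
  end
end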

Wading through this logic tree you can see that the default logic for each RESTful action is appropriately handled, letting your controller actions focus exclusively on resource retrieval and modification. And with that cruft out of the way your controllers will start to look even more similar – I suspect we’ll be seeing a solution for this coming around the bend shortly as well…?

So, just to recap the basics, here are a few action implementations side by side (the first being before responders and the latter being after):

# Old
def index
  @users = User.all
  respond_to do |format|
    format.html
    format.xml { render :xml => @users }
    format.json { render :json => @users }
  end
end

# New
def index
  respond_with(@users = User.all)
end
# Old
def create
  @user = User.new(params[:user])
  if @user.save
    flash[:notice] = "User successfully created"
    respond_to do |format|
      format.html { redirect_to @user }
      format.xml { render :xml => @user, :status => :created,
        :location => user_url(@user) }
    format.json { render :json => @user, :status => :created,
        :location => user_url(@user) }
    end
  else
    respond_to do |format|
      format.html { render :new }
      format.xml { render :xml => @user.errors, :status => :unprocessable_entity }
      format.json { render :json => @user.errors, :status => :unprocessable_entity }
    end
  end
end

# New
def create
  @user = User.new(params[:user])
  flash[:notice] = "User successfully created" if @user.save
  respond_with(@user)
end

Oh yeah, that’s getting real lean.

Overriding Default Behavior

If you need to override the default behavior of a particular format you can do so by passing a block to respond_with (as I wrote about in the original article):

class UsersController < ApplicationController

  respond_to :html, :xml, :json

  # Override html format since we want to redirect to the collections page
  # instead of the user page.
  def create
    @user = User.new(params[:user])
    flash[:notice] = "User successfully created" if @user.save
    respond_with(@user) do |format|
      format.html { redirect_to users_url }
    end
  end
end

Nested Resources

It’s quite common to operate on resources within a nested resource graph (though I prefer to go one level deep, at most). For such cases you need to let respond_with know of the object hierarchy (using the same parameters as polymorphic_url):

class UsersController < ApplicationController

  respond_to :html, :xml, :json

  # In this case, users exist within a company
  def create
    @company = Company.find(params[:company_id])
    @user = @company.users.build(params[:user])
    flash[:notice] = "User successfully created" if @user.save

    # Ensure that the new user location is nested within @company,
    # for html format (/companies/1/users/2.html) as well as
    # resource formats (/companies/1/users/2)
    respond_with(@company, @user)
  end
end

If you have a namespace (or singleton resource) within your resource graph, just use a symbol instead of an actual object instance. So to get /admin/users/1 you would invoke respond_with(:admin, @user).
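
Here’s a minimal sketch of that, assuming users are routed under an /admin namespace (the controller and route names are just for illustration):


class Admin::UsersController < ApplicationController

  respond_to :html, :xml, :json

  def create
    @user = User.create(params[:user])
    # The leading :admin symbol keeps the generated location nested,
    # e.g. /admin/users/1 rather than /users/1.
    respond_with(:admin, @user)
  end
end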

Custom Responders

While there’s no facility to provide your own responder classes yet, it will no doubt be added shortly. If you look at the current responder class definition, it’s a very simple API, essentially only requiring a call method (more intuitively, take a look at the :to_html and :to_format methods).

Stay tuned here for further refinements to this very handy functionality – you’re going to see a lot more tightening in the coming weeks.

tags: ruby,
rubyonrails



What’s New in Edge Rails: Cleaner RESTful Controllers w/ respond_with


This feature is scheduled for: Rails v3.0


REST is a first-class citizen in the Rails world, though most of the hard work is done at the routing level. The controller stack has some niceties revolving around mime type handling with the respond_to facility but, to date, there’s not been a lot built into actionpack to handle the serving of resources. The addition of respond_with (and this follow-up) takes one step towards more robust RESTful support with an easy way to specify how resources are delivered. Here’s how it works:

Basic Usage

In your controller you can specify what resource formats are supported with the class method respond_to. Then, within your individual actions, you tell the controller the resource or resources to be delivered using respond_with:

class UsersController < ApplicationController

  respond_to :html, :xml, :json

  def index
    respond_with(@users = User.all)
  end

  def create
    @user = User.create(params[:user])
    respond_with(@user, :location => users_url)
  end
end

This will match each supported format with an appropriate response. For instance, if the request is for /users.xml then the controller will look for a /users/index.xml.erb view template to render. If such a view template doesn’t exist then it tries to directly render the resource in the :xml format by invoking to_xml (if it exists). Lastly, if respond_with was invoked with a :location option the request will be redirected to that location (as in the case of the create action in the above example).

So here’s the equivalent implementation without the use of respond_with (assuming no index view templates):

class UsersController < ApplicationController

  def index
    @users = User.all
    respond_to do |format|
      format.html
      format.xml { render :xml => @users }
      format.json { render :json => @users }
    end
  end

  def create
    @user = User.create(params[:user])
    respond_to do |format|
      format.html { redirect_to users_url }
      format.xml { render :xml => @user }
      format.json { render :json => @user }
    end
  end
    
end

You can see how much boilerplate response handling is now done for you, especially when it’s multiplied across the other default actions. You can pass :status and :head options to respond_with as well if you need to send these headers back on resources rendered directly (i.e. via to_xml):

class UsersController < ApplicationController

  respond_to :html, :xml, :json

  def index
    respond_with(@users = User.all, :status => :ok)
  end
end

Per-Action Overriding

It’s also possible to override standard resource handling by passing in a block to respond_with specifying which formats to override for that action:

class UsersController < ApplicationController

  respond_to :html, :xml, :json

  # Override html format since we want to redirect to a different page,
  # not just serve back the new resource
  def create
    @user = User.create(params[:user])
    respond_with(@user) do |format|
      format.html { redirect_to users_path }
    end
  end
end

:except And :only Options

You can also pass in :except and :only options to only support formats for specific actions (as you do with before_filter):

class UsersController < ApplicationController
  respond_to :html, :only => :index
  respond_to :xml, :json, :except => :show
  ...
end

The :any Format

If you still want to use respond_to within your individual actions, this update also bundles the :any resource format, which can be used as a wildcard match against any unspecified formats:

class UsersController < ApplicationController

  def index

    @users = User.all

    respond_to do |format|
      format.html
      format.any(:xml, :json) { render request.format.to_sym => @users }
    end
  end
end

So all in all this is a small, but meaningful, step towards robust controller-level REST support. I should point out that the contributor of this patch is José Valim, who has authored the very robust inherited_resources framework that already supports respond_with-like functionality and many more goodies. If you’re in search of a solid RESTful controller framework to accompany Rails’ native RESTful routing support, I would suggest you take a look at his fine work.

tags: ruby,
rubyonrails



Rubyists, Learn Some iPhone Skillz

If any of you Rubyists are going to be attending FutureRuby in Toronto this July and have an interest in learning how to work some magic on the iPhone, I encourage you to check out Mobile Orchard’s Dan Grigsby and his Beginning iPhone Programming For Rubyists course taking place before FutureRuby. In addition to the iPhone basics, he’ll be covering our ObjectiveResource framework.

You can get a discount on the course if you register before June 9th so head on over and give it a peek.



What’s New in Edge Rails: Database Seeding


This feature is scheduled for: Rails v3.0


I’m not sure if this was ever stated explicitly as a preferred practice or not, but for the longest time many of us have recognized that using migrations as a way to populate the database with a base configuration dataset is wrong. Migrations are for manipulating the structure of your database, not for the data within it and certainly not for simple population tasks.

Well, this practice now has formal support in Rails with the addition of the database seeding feature. Quite simply, this is a rake task that sucks in the data specified in a db/seeds.rb file. Here are the details:
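
Under the hood there isn’t much to it. Roughly speaking (this is just a sketch of the idea, not the actual Rails source), the task simply loads that file inside your app’s environment:


# A sketch of what a db:seed-style rake task boils down to.
namespace :db do
  desc "Load the seed data from db/seeds.rb"
  task :seed => :environment do
    seed_file = File.join(Rails.root.to_s, 'db', 'seeds.rb')
    load(seed_file) if File.exist?(seed_file)
  end
end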

Specify Seed Data

Add or open the db/seeds.rb file and put in model creation statements (or any ruby code) for the data you need to be present in order for your application to run, i.e. configuration and default data (and nothing more):

[:admin, :user].each { |r| Role.create(:name => r) }
User.create(:login => 'admin', :role => Role.find_by_name('admin'))
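
Since you may end up running the seed task more than once against the same database, it can be worth making these statements idempotent. A quick sketch (the model and attribute names are just carried over from the example above):


# Safe to run repeatedly without duplicating rows.
[:admin, :user].each { |r| Role.find_or_create_by_name(r.to_s) }

unless User.find_by_login('admin')
  User.create(:login => 'admin', :role => Role.find_by_name('admin'))
end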

Load the Data

Once that is in place you can run one of two rake tasks that will populate the database with this data:
rake db:seed, which will only populate the db with this data, and rake db:setup, which will create the db, load the schema and then load the seed data. The latter is the task you’ll want to use if you’re starting in a fresh environment.

So quit overloading your migrations with seed data and use this new facility. But don’t go overboard and use seeds.rb for test or staging datasets; it should only be used for the base data that is necessary for your app to run.

tags: ruby,
rubyonrails