Routing and Web Performance on Heroku: a FAQ

Hi. I'm Adam Wiggins, cofounder and CTO of Heroku.

Heroku has been my life’s work. Millions of apps depend on us, and I take that responsibility very personally.

Recently, Heroku has faced criticism from the hacker community about how our HTTP router works, and about web performance on the platform in general. I’ve read all the public discussions, and have spent a lot of time over the past month talking with our customers about this subject.

The concerns I've heard from you span past, present, and future.

The past: some customers have hit serious problems with poor web performance and insufficient visibility on their apps, and have been left very frustrated as a result. What happened here? The present: how do you know if your app is affected, and if so what should you do? And the future: what is Heroku doing about this? Is Heroku a good place to run and scale an app over the long term?

To answer these questions, we’ve written a FAQ, found below. It covers what happened, why the router works the way that it does, whether your app is affected by excessive queue time, and what the solution is.

As to the future, here’s what we’re doing. We’re ramping up hands-on migration assistance for all users running on our older stack, Bamboo, or running a non-concurrent backend on our new stack, Cedar. (See the FAQ for why this is the fix.) We’re adding new features such as 2X dynos to make it easier to run concurrent backends for large Rails apps. And we're making performance and visibility a bigger area of product attention, starting with some tools we've already released in the last month.

If you have a question not answered by this FAQ, post it as a comment here, on Hacker News, or on Twitter. I’ll attempt to answer all such questions posted in the next 24 hours.

To all our customers who experienced real pain from this: we're truly sorry. After reading this FAQ, I hope you feel we're taking every reasonable step to set things right, but if not, please let us know.

Adam


Overview

Q. Is Heroku’s router broken?

A. No. While hundreds of pages could be written on this topic, we’ll address some of this in Routing technology. Summary: the current version of the router was designed to provide the optimum combination of uptime, throughput, and support for modern concurrent backends. It works as designed.

Q. So what’s this whole thing about then?

A. Since early 2011, high-volume Rails apps that run on Heroku and use single-threaded web servers have sometimes experienced severe tail latencies and poor utilization of web backends (dynos). Lack of visibility into app performance, including incorrect queue time reporting prior to the New Relic update in February 2013, made diagnosing these latencies very difficult, both for customers and for Heroku’s own support team.

Q. What types of apps are affected?

A. Rails apps running on Thin, with six or more dynos, and serving 1k reqs/min or more are the most likely to be affected. The impact becomes more pronounced as such apps use more dynos, serve more traffic, or have large request time variances.

Q. How can I tell if my app is affected?

A. Add the free version of New Relic (heroku addons:add newrelic) and install the latest version of the newrelic_rpm gem, then watch your queue time. Average queue times above 40ms are usually indicative of a problem.

Some apps with lower request volume may be affected if they have extremely high request time variances (e.g., HTTP requests lasting 10+ seconds) or make callbacks like this OAuth example.
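Under the hood, queue time is derived from the X-Request-Start timestamp the Heroku router adds to each request. As a rough sketch of how such a measurement works (the header name is real; the middleware, the parsing, and the logfmt output are illustrative assumptions, shown here in Python/WSGI rather than Rails for brevity):

```python
import time

def queue_time_middleware(app):
    """Illustrative WSGI middleware that logs router queue time.

    Heroku's router stamps requests with an X-Request-Start header
    (treated here as milliseconds since the epoch; the exact format has
    varied, so treat this parsing as a sketch). The gap between that
    stamp and the moment the dyno picks the request up is queue time.
    """
    def wrapped(environ, start_response):
        stamp = environ.get("HTTP_X_REQUEST_START", "").lstrip("t=")
        if stamp:
            queue_ms = time.time() * 1000.0 - float(stamp)
            print("measure=queue_time val=%.1f units=ms" % queue_ms)
        return app(environ, start_response)
    return wrapped
```

The same idea is what New Relic's queue time metric is based on: compare the router's timestamp against the dyno's clock when the request is finally served.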

Q. What’s the fix?

A. Switch to a concurrent web backend like Unicorn or Puma on JRuby, which allows the dyno to manage its own request queue and avoid blocking on long requests.

This requires that your app be on our most current stack, Cedar.
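For a Rails app, the switch is mostly configuration. A minimal sketch (the worker count and timeout are illustrative starting points to tune against your dyno’s memory, not recommendations):

```ruby
# config/unicorn.rb -- minimal sketch; tune worker count to your dyno's memory
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)
timeout 30
preload_app true
```

Then point your Procfile at it:

```
web: bundle exec unicorn -p $PORT -c ./config/unicorn.rb
```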

Q. Can you give me some help with this?

A. Certainly. We’ve already emailed all customers running Thin apps on more than six dynos, with self-migration instructions and a way to reach us for direct assistance.

If you haven’t received the email and want help making the switch, contact us for migrating to Cedar or migrating to Unicorn.

Routing technology

Q. Why does the router work the way that it does?

A. The Cedar router was built with two goals in mind: (1) to support the new world of concurrent web backends which have become the standard in Ruby and all other language communities; and (2) to handle the throughput and availability needs of high-traffic apps.

Read detailed documentation of Heroku’s HTTP routing.

Q. Even with concurrent web backends, wouldn’t a single global request queue still use web dynos more efficiently?

A. Probably, but it comes with trade-offs for availability and performance. The Heroku router favors availability, stateless horizontal scaling, and low latency through individual routing nodes. Per-app global request queues require a sacrifice on one or more of these fronts. See Kyle Kingsbury’s post on the CAP theorem implications for global request queueing.

After extensive research and experimentation, we have yet to find either a theoretical model or a practical implementation that beats the simplicity and robustness of random routing to web backends that can support multiple concurrent connections.
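The trade-off is easy to see in a toy simulation. The sketch below compares the two routing policies on identical traffic; every number in it (arrival rate, request mix, dyno count) is invented for illustration and is not a measurement of Heroku:

```python
import random

def mean_queue_wait(n, k, policy, seed=7):
    """Mean time (s) requests spend queued, for n requests on k
    single-threaded dynos.

    policy "random": the router picks a dyno uniformly at random and the
    request waits in that dyno's private FIFO backlog.
    policy "global": one shared FIFO queue; the next request goes to
    whichever dyno frees up first.
    """
    rng = random.Random(seed)
    t, jobs = 0.0, []
    for _ in range(n):
        t += rng.expovariate(50.0)                  # ~50 requests/sec
        svc = 5.0 if rng.random() < 0.02 else 0.05  # 2% take 5 s, rest 50 ms
        jobs.append((t, svc))
    free_at = [0.0] * k                             # when each dyno frees up
    total_wait = 0.0
    for arrived, svc in jobs:
        if policy == "random":
            d = rng.randrange(k)
        else:
            d = min(range(k), key=free_at.__getitem__)
        start = max(arrived, free_at[d])
        total_wait += start - arrived
        free_at[d] = start + svc
    return total_wait / n

# With high request-time variance, random routing queues requests behind
# slow ones far more than a shared queue would:
# mean_queue_wait(20000, 10, "random") >> mean_queue_wait(20000, 10, "global")
```

Note what changes the picture: if each dyno can serve many requests concurrently, a slow request no longer blocks the others routed to the same dyno, which is exactly why concurrent backends are the fix.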

Q. So does that mean you aren’t working on improving HTTP performance?

A. Not at all. We're always looking for new ways to make HTTP requests on Heroku faster, more reliable, and more efficient. For example, we’ve been experimenting with backpressure routing for web dynos to signal to the router that they are overloaded.

You, our customers, have told us that it’s not routing algorithms you ultimately care about, but rather overall web performance. You want to serve HTTP requests as quickly as possible, for fast page loads or API calls for your users. And you want to be able to quickly and easily diagnose performance problems.

Performance and visibility are what matters, and that’s what we’ll work on. This will include ongoing improvements to dynos, the router, visibility tools, and our docs.

Retrospective

Q. Did the Bamboo router degrade?

A. Yes. Our older router was built and designed during the early years of Heroku to support the Aspen and, later, Bamboo stacks. These stacks did not support concurrent backends, and thus the router was designed with a per-app global request queue. It worked as designed originally, but as the router fleet scaled out, its behavior degraded slowly over the course of the next two years.

Q. Were the docs wrong?

A. Yes, for Bamboo. They were correct when written, but fell out of date starting in early 2011. Until February 2013, the documentation described the Bamboo router as sending only one connection at a time to any given web dyno.

Q. Why didn’t you update Bamboo docs in 2011?

A. At the time, our entire product and engineering team was focused on our new product, Cedar. Being so focused on the future meant that we slipped on stewardship of our existing product.

Q. Was the "How It Works" section of the Heroku website wrong?

A. Yes. Like the docs, the "How It Works" section of our website described the router as tracking which dynos were tied up by long HTTP requests. This was accurate when written, but gradually fell out of date starting in early 2011. Unlike the docs, we completely rewrote the homepage in June 2011, and it no longer referenced tracking of long requests.

Q. Was the queue time metric in New Relic wrong?

A. Yes, for the same 2011–2013 period covered in the previous questions. The metric was transmitted to the New Relic instrumentation in the app via a set of HTTP headers set by the Heroku router. The root cause was the same as the Bamboo router degradation: the code didn't change, but scaling out the router nodes caused the data to become increasingly inaccurate and eventually useless. With New Relic's help, we fixed this in February 2013 by calculating queue time using a different method.

Q. Why didn’t Heroku take action on this until Rap Genius went public?

A. We’re sorry that we didn’t take action on this based on the customer complaints via support tickets and other channels sooner. We didn’t understand the magnitude of the confusion and frustration caused by the out-of-date Bamboo docs, incorrect queue time information in New Relic, and the general lack of visibility into web performance on the platform. The huge response to the Rap Genius post showed us that this touched a nerve in our community.

The Future

Q. What is Heroku doing to make things right from here forward?

A. We’ve been working with many of our customers to get their queue times down, get them accurate visibility into their app’s performance, and make sure their app is fast and running on the right number of dynos. So far, the results are good.

Q. What about everyone else?

A. If we haven’t been in touch yet, here’s what we’re doing for you:

  • Migration assistance: We’ll give you hands-on help migrating to a concurrent backend, either individually or in online workshops. This includes the move to Cedar if you’re still on Bamboo. If you’re running a multi-dyno app on a non-concurrent backend and haven’t received an email, drop us a line about Thin to Unicorn or Bamboo to Cedar.
  • 2X dynos: We’re fast-tracking the launch of 2X dynos, which provide double the memory and allow for double (or more) Unicorn concurrency for large Rails apps. This is already in private beta with several hundred customers, and will be available in public beta shortly.
  • New visibility tools: We’re putting more focus on bringing you new performance visibility features, such as the log2viz dashboard, CPU and memory use logging, and HTTP request IDs. We’ll be doing much more on this front to make sure that you can diagnose performance problems when they happen and know what to do about them.

Want something else not mentioned here? Let us know.

log2viz: Logs as Data for Performance Visibility

If you’re building a customer-facing web app or mobile backend, performance is a critical part of user experience. Fast is a feature, and affects everything from conversion rates to your site’s search ranking.

The first step in performance tuning is getting visibility into the app’s web performance in production. For this, we turn to the app’s logs.

Logs as data

There are many ways to collect metrics, the most common being direct instrumentation into the app. New Relic, Librato, and Hosted Graphite are cloud services that use this approach, and there are numerous roll-your-own options like StatsD and Metrics.

Another approach is to send metrics to the logs. Beginning with the idea that logs are event streams, we can use logs for a holistic view of the app: your code, and the infrastructure that surrounds it (such as the Heroku router). Mark McGranaghan’s Logs as Data and Ryan Daigle’s 5 Steps to Better Application Logging offer an overview of the logs-as-data approach.

Put simply, logs as data means writing semi-structured data to your app's logs via STDOUT. The logs can then be consumed by one or more services for dashboards, long-term trending, and threshold alerting.
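In Python, for example, this can be as simple as printing logfmt-style key=value pairs. (The emit helper and metric names below are illustrative, following the measure=/val= convention used by the Heroku router and Labs features.)

```python
import sys, time

def emit(measure, val, units=None):
    """Write one logfmt-style measurement to stdout (illustrative helper)."""
    line = "measure=%s val=%s" % (measure, val)
    if units:
        line += " units=%s" % units
    sys.stdout.write(line + "\n")

# e.g. time a unit of work and emit its duration
start = time.time()
work = sum(range(1000))  # stand-in for real request handling
emit("request_time", round((time.time() - start) * 1000, 2), units="ms")
```

Because the metrics travel through the ordinary log stream, nothing about this helper is coupled to any particular metrics backend.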

The benefits of logs-as-data over direct instrumentation include:

  • No additional library dependencies for your app
  • No CPU cost to your dyno by in-app instrumentation
  • Introspection capability by reading the logs directly
  • Metrics backends can be swapped out without changes to app code
  • Possible to split the log stream and send it to multiple backends, for different views and alerting on the same data

Introducing log2viz, a public experiment

log2viz is an open-source demonstration of the logs-as-data concept for Heroku apps. Log in and select one of your apps to see a live-updating dashboard of its web activity.

For example, here’s a screenshot of log2viz running against the Rubygems Bundler API (written and maintained by Terence Lee, André Arko, and Larry Marburger, and running on Heroku):

log2viz gets all of its data from the Heroku log stream — the same data you see when running heroku logs --tail at the command line. It requires no changes to your app code and works for apps written in any language and web framework, demonstrating some of the benefits of logs as data.

Also introducing: log-runtime-metrics

To get CPU and memory stats for your dynos, we’ve added a new experimental feature to Heroku Labs: log-runtime-metrics.

To enable this for your app (and see memory stats in log2viz), type the following:

$ heroku labs:enable log-runtime-metrics -a myapp
$ heroku restart

This inserts data into your logs like this:

heroku[web.1]: measure=load_avg_5m val=0.0
heroku[web.1]: measure=memory_total val=209.64 units=MB

log2viz reads these stats and displays average and max memory use across your dynos. (Like all Labs features, this is experimental and the format may change in the future.)
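Consuming these lines is straightforward. A minimal parser sketch (the function is illustrative, and since this is a Labs feature the line format may change):

```python
def parse_metrics(line):
    """Parse key=value pairs out of a runtime-metrics log line."""
    fields = dict(tok.split("=", 1) for tok in line.split() if "=" in tok)
    if "val" in fields:
        try:
            fields["val"] = float(fields["val"])  # numeric values where possible
        except ValueError:
            pass
    return fields

parse_metrics("heroku[web.1]: measure=memory_total val=209.64 units=MB")
# -> {'measure': 'memory_total', 'val': 209.64, 'units': 'MB'}
```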

Looking under the hood

log2viz is open source. Let’s look at the code:

You can deploy your own copy of log2viz on Heroku, so fork away! For example, Heroku customer Timehop has experimented with trending graphs via Rickshaw.

Logs-as-data add-ons

log2viz isn't the only way to take advantage of your log stream for visibility on Heroku today. Here are a few add-ons which consume your app's logs.

Loggly offers a web console that lets you search your log history, and graph event types over time. For example, let’s search for status=404, logged by the Heroku router whenever your app serves a page not found:

Papertrail offers search and archival of your log history, and can also alert when events pass a certain threshold. Here’s how you can set up an email alert every time your app experiences more than 10 H12 errors in a 60-second period. Search for the router log line:

Click “Save Search,” then:

Other add-ons that consume logs include Treasure Data and Logentries.

You can also use non-add-on cloud services, as shown in thoughtbot's writeup on using Splunk Storm with Heroku.

Conclusion

Visibility is a vast and challenging problem space. The logs-as-data approach is still young, and log2viz is just an experiment to get us started. We look forward to your feedback on log2viz, log visibility via add-ons, and your own experiments on performance visibility.

The Heroku Toolbelt

The Heroku Toolbelt is a package of the Heroku CLI, Foreman, and Git — all the tools you need to get started using Heroku at the command line. The Toolbelt is available as a native installer for OS X, Windows, and Debian/Ubuntu Linux.

The Toolbelt has been available since last fall as part of our polyglot platform. Since then it’s matured substantially with a huge amount of user testing, and now even has a shiny new landing page. Ruby developers can continue to use gem install heroku, but developers in other languages (Python, Java, Clojure, etc) will probably prefer not to have to install Ruby and RubyGems to use Heroku.

The installer won’t trample your existing install of Git if you have one. Similarly, although the Heroku CLI uses Ruby under the hood, the Toolbelt packaging isolates all of its libraries so it will not interfere with an existing Ruby setup.

The entire Toolbelt is open source. File an issue or, better yet, send a pull request if you see ways that it can be improved.

InfoWorld Names Heroku a 2012 Technology of the Year

InfoWorld has named Heroku as a 2012 Technology of the Year. While we’re not normally much for industry awards, we feel honored to be included alongside past winners such as the iPad, Android, Visual Studio, and Eclipse; and this year’s winners, including Amazon Web Services, Node.js, Hadoop, CloudBees, and Heroku add-on provider Rhomobile.

InfoWorld is a venerable publication in the technology world, and this is the first time they’ve given awards in the cloud space. We see this as another major point of validation for platform-as-a-service, and cloud technologies more generally. 2011 was the year that PaaS came into the greater collective consciousness of the technology industry. We can’t wait to see how things will unfold in 2012.

Scala on Heroku

The sixth official language on the Heroku polyglot platform is Scala, available in public beta on the Cedar stack starting today.

Scala deftly blends object-oriented programming with functional programming. It offers an approachable syntax for Java and C developers, the power of a functional language like Erlang or Clojure, and the conciseness and programmer-friendliness normally found in scripting languages such as Ruby or Python. It has found traction with big-scale companies like Twitter and Foursquare, plus many others. Perhaps most notably, Scala offers a path forward for Java developers who seek a more modern programming language.

More on those points in a moment. But first, let’s see it in action.

Scala on Heroku in Two Minutes

Create a directory. Start with this source file:

src/main/scala/Web.scala

import org.jboss.netty.handler.codec.http.{HttpRequest, HttpResponse}
import com.twitter.finagle.builder.ServerBuilder
import com.twitter.finagle.http.{Http, Response}
import com.twitter.finagle.Service
import com.twitter.util.Future
import java.net.InetSocketAddress
import util.Properties

object Web {
  def main(args: Array[String]) {
    val port = Properties.envOrElse("PORT", "8080").toInt
    println("Starting on port: " + port)
    ServerBuilder()
      .codec(Http())
      .name("hello-server")
      .bindTo(new InetSocketAddress(port))
      .build(new Hello)
  }
}

class Hello extends Service[HttpRequest, HttpResponse] {
  def apply(req: HttpRequest): Future[HttpResponse] = {
    val response = Response()
    response.setStatusCode(200)
    response.setContentString("Hello from Scala!")
    Future(response)
  }
}

Add the following files to declare dependencies and build with sbt, the simple build tool for Scala:

project/build.properties

sbt.version=0.11.0

build.sbt

import com.typesafe.startscript.StartScriptPlugin

seq(StartScriptPlugin.startScriptForClassesSettings: _*)

name := "hello"

version := "1.0"

scalaVersion := "2.8.1"

resolvers += "twitter-repo" at "http://maven.twttr.com"

libraryDependencies ++= Seq("com.twitter" % "finagle-core" % "1.9.0", "com.twitter" % "finagle-http" % "1.9.0")

Declare how the app runs with a start script plugin and Procfile:

project/build.sbt

resolvers += Classpaths.typesafeResolver

addSbtPlugin("com.typesafe.startscript" % "xsbt-start-script-plugin" % "0.3.0")

Procfile

web: target/start Web

Commit to Git:

$ git init
$ git add .
$ git commit -m init

Create an app on the Cedar stack and deploy:

$ heroku create --stack cedar
Creating warm-frost-1289... done, stack is cedar
http://warm-frost-1289.herokuapp.com/ | git@heroku.com:warm-frost-1289.git
Git remote heroku added

$ git push heroku master
Counting objects: 14, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (9/9), done.
Writing objects: 100% (14/14), 1.51 KiB, done.
Total 14 (delta 1), reused 0 (delta 0)

-----> Heroku receiving push
-----> Scala app detected
-----> Building app with sbt v0.11.0
-----> Running: sbt clean compile stage
       Getting net.java.dev.jna jna 3.2.3 ...
       ...
       [success] Total time: 0 s, completed Sep 26, 2011 8:41:10 PM
-----> Discovering process types
       Procfile declares types -> web
-----> Compiled slug size is 43.1MB
-----> Launching... done, v3
       http://warm-frost-1289.herokuapp.com deployed to Heroku

Then view your app on the web!

$ curl http://warm-frost-1289.herokuapp.com
Hello from Scala!

Dev Center: Getting Started with Scala on Heroku/Cedar

Language and Community

Scala is designed as an evolution of Java that addresses the verbosity of Java syntax and adds many powerful language features such as type inference and functional orientation. Java developers who have made the switch to Scala often say that it brings fun back to developing on the JVM. Boilerplate and ceremony are replaced with elegant constructs, to express intent in fewer lines of code. Developers get all the benefits of the JVM — including the huge ecosystem of libraries and tools, and a robust and performant runtime — with a language tailored to developer happiness and productivity.

Scala is strongly- and statically-typed, like Java (and unlike Erlang and Clojure). Its type inference has much in common with Haskell.

Yet Scala achieves much of the ease of use of a dynamically-typed language (such as Ruby or Python). There are many well-established dynamically-typed open source languages, but Scala is one of the few languages with compile-time type safety that is also practical and pleasant to use. The static vs. dynamic typing debate rages on, but if you’re in the type-safe camp, Scala is an obvious choice.

Language creator Martin Odersky’s academic background shines through in the feel of the language and the community. But the language’s design balances academic influence with approachability and pragmatism. The result is that Scala takes many of the best ideas from the computer science research world, and makes them practical in an applied setting.

Members of the Scala community tend to be forward-thinking, expert-level Java programmers; or developers from functional backgrounds (such as Haskell or ML) who see an opportunity to apply the patterns they love in a commercially viable environment.

There is some debate about whether Scala is too hard to learn or too complex. One answer is that the language is still young enough that learning resources aren’t yet fully-baked, although Twitter’s Scala School is one good resource for beginners. But perhaps Scala is simply a sharper tool than Java: in the hands of experts it’s a powerful tool, but copy-paste developers may find themselves with self-inflicted wounds.

Scala Days is the primary Scala conference, although the language is well-represented at cross-community conferences like Strange Loop.

The language community has blossomed, and is now in the process of accumulating more and more mainstream adoption. Community members are enthusiastic about the language’s potential, making for an environment that welcomes and encourages newcomers.

Open Source Projects

Open source is thriving in the Scala world. The Lift web framework is a well-known early mover, but the last two years have seen an explosion of new projects showcasing Scala’s strengths.

Finagle is a networking library coming out of the Twitter engineering department. It’s not a web framework in the sense of Rails or Django, but rather a toolkit for creating network clients and servers. The server builder is in some ways reminiscent of the Node.js stdlib for creating servers, but much more feature-rich: fault tolerance, backpressure (rate-limiting defense against attacks), and service discovery, to name a few. The web is increasingly a world of connected services, and Finagle (and Scala) are a natural fit for that new order.

Spark runs on Mesos (a good example of hooking into the existing JVM ecosystem) to do in-memory dataset processing, such as this impressive demo of loading all of Wikipedia into memory for lightning-fast searches. Two other notable projects are Akka (concurrency middleware) and Play! (web framework), which we’ll look at shortly.

The Path Forward for Java?

Some Java developers have been envious of modern, agile, web-friendly languages like Ruby or Python — but they don’t want to give up type safety, the Java library ecosystem, or the JVM. Leaders in the Java community are aware of this stagnation problem and see alternate JVM languages as the path forward. Scala is the front-runner candidate on this, with support from influential people like Bruce Eckel, Dick Wall and Carl Quinn of the Java Posse, and Bill Venners.

Scala is a natural successor to Java for a few reasons. Its basic syntax is familiar, in contrast with Erlang and Clojure: two other functional, concurrency-focused languages which many developers find inscrutable. Another reason is that Scala’s functional and object-oriented mix allows new developers to build programs in an OO model to start with. Over time, they can learn functional techniques and blend them in where appropriate.

Working with Java libraries from Scala is trivial and practical. You can not only call Java libraries from Scala, but go the other way — provide Scala libraries for Java developers to call. Akka is one example of this.

There’s obvious overlap here between Scala as a reboot of the Java language and toolchain, and the Play! web framework as a reboot of Java web frameworks. Indeed, these trends are converging, with Play! 2.0 putting Scala front-and-center. The fact that Play! can be used in a natural way from both Java and Scala is another testament to JVM interoperability. Play 2.0 will even use sbt as the builder and have native Akka support.

Typesafe and Akka

Typesafe is a new company emerging as a leader in Scala, with language creator Martin Odersky and Akka framework creator Jonas Bonér as co-founders. Their open-source product is the Typesafe Stack, a commercially-supported distribution of Scala and Akka.

Akka is an event-driven middleware framework with emphasis on concurrency and scale-out. Akka uses the actor model with features such as supervision hierarchies and futures.

The Heroku team worked closely with Typesafe on bringing Scala to our platform. This collaboration produced items like the xsbt-start-script-plugin, and coordination around the release of sbt 0.11.

Havoc Pennington of Typesafe built WebWords, an excellent real-world demonstration of using Akka’s concurrency capabilities to scrape and process web pages. Try it out, then dig in on the source code and his epic Dev Center article explaining the app’s architecture in detail. Havoc also gave an educational talk at Dreamforce about Akka, Scala, and Play!.

Typesafe: we enjoyed working with you, and look forward to more productive collaboration in the future. Thanks!

Conclusion

Scala’s explosive growth over the past two years is great news for both Java developers and for functional programming. Scala on Heroku, combined with powerful toolsets like Finagle and Akka, are a great fit for the emerging future of connected web services.

Further reading:

Special thanks to Havoc Pennington, Jeff Smick, Steve Jenson, James Ward, Bruce Eckel, and Alex Payne for alpha-testing and help with this post.

Python and Django on Heroku

Python has joined the growing ranks of officially-supported languages on Heroku’s polyglot platform, going into public beta as of today. Python is the most-requested language for Heroku, and it brings with it the top-notch Django web framework.

As a language, Python has much in common with Ruby, Heroku’s origin language. But the Python community has its own unique character. Python has a culture which finds an ideal balance between fast-moving innovation and diligent caution. It emphasizes readability, minimizes "magic," treats documentation as a first-class concern, and has a tradition of well-tested, backward-compatible releases in both the core language and its ecosystem of libraries. It blends approachability for beginners with maintainability for large projects, which has enabled its presence in fields as diverse as scientific computing, video games, systems automation, and the web.

Let’s take it for a spin on Heroku.

Heroku/Python Quickstart

Make a directory with three files:

app.py

import os
from flask import Flask
app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from Python!"

if __name__ == "__main__":
    port = int(os.environ.get("PORT", 5000))
    app.run(host='0.0.0.0', port=port)

requirements.txt

Flask==0.7.2

Procfile

web: python app.py

Commit to Git:

$ git init
$ git add .
$ git commit -m "init"

Create an app on the Cedar stack and deploy:

$ heroku create --stack cedar
Creating young-fire-2556... done, stack is cedar
http://young-fire-2556.herokuapp.com/ | git@heroku.com:young-fire-2556.git
Git remote heroku added

$ git push heroku master
Counting objects: 5, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (5/5), 495 bytes, done.
Total 5 (delta 0), reused 0 (delta 0)

-----> Heroku receiving push
-----> Python app detected
-----> Preparing virtualenv version 1.6.1
       New python executable in ./bin/python2.7
       Also creating executable in ./bin/python
       Installing setuptools............done.
       Installing pip...............done.
-----> Installing dependencies using pip version 1.0.1
       Downloading/unpacking Flask==0.7.2 (from -r requirements.txt (line 1))
       ...
       Successfully installed Flask Werkzeug Jinja2
       Cleaning up...
-----> Discovering process types
       Procfile declares types -> web
-----> Compiled slug size is 3.5MB
-----> Launching... done, v2
       http://young-fire-2556.herokuapp.com deployed to Heroku

To git@heroku.com:young-fire-2556.git
 * [new branch]      master -> master

Then view your app on the web!

$ curl http://young-fire-2556.herokuapp.com/
Hello from Python!

Dev Center: Getting Started with Python on Heroku/Cedar

All About Python

Created by Guido van Rossum in 1991, Python is one of the world’s most popular programming languages, and finds use in a remarkably broad range of domains.

Cutting-edge communities, like Node.js and Ruby, encourage fast-paced innovation (though sometimes at the cost of application breakage). Conservative communities, like Java, favor a more responsible and predictable approach (though sometimes at the expense of being behind the curve). Python has managed to gracefully navigate a middle path between these extremes, giving it a respected reputation even among non-Python programmers. The Python community is an island of calm in the stormy seas of the programming world.

Python is known for its clearly-stated values, outlined in PEP 20, The Zen of Python. "Explicit is better than implicit" is one example (and a counterpoint to "Convention over configuration" espoused by Rails). "There’s only one way to do it" is another (counterpointing "There’s more than one way to do it" from Perl). See Code Like a Pythonista: Idiomatic Python for more.

The Python Enhancement Proposal (PEP) process brings a structured approach to extending the core language design over time. It captures much of the value of Internet standards bodies’ procedures (like Internet Society RFCs or W3C standards proposals) without being as heavyweight or resistant to change. Again, Python finds a graceful middle path: neither changing unexpectedly at the whim of its lead developers, nor unable to adapt to a changing world due to too many approval committees.

Documentation is one of Python’s strongest areas, and especially notable because docs are often a second-class citizen in other programming languages. Read the Docs is an entire site dedicated to documentation hosting, sponsored by the Python Software Foundation. And the Django book defined a whole new approach to web-based publishing of technical books, imitated by many since its release.

Frameworks and the Web

In some ways, Python was the birthplace of modern web frameworks, with Zope and Plone. Concepts like separation of business and display logic via view templating, ORMs for database interaction, and test-driven development were built into Zope half a decade before Rails was born. Zope never had the impact achieved by the later generation of frameworks, partially due to its excessive complexity and steep learning curve, and partially due to simply being ahead of its time. Nevertheless, modern web frameworks owe much to Zope’s pioneering work.

The legacy of Zope’s checkered history combined with the Python community’s slow recognition of the importance of the web could have been a major obstacle to the language’s ongoing relevance with modern developers, who increasingly wanted to build apps for the web. But in 2005, the Django framework emerged as a Pythonic answer to Rails. (Eventually, even Guido came around.)

Django discarded the legacy of past Python web implementations, creating an approachable framework designed for rapid application development. Django's spirit is perhaps best summarized by its delightful slogan: "the web framework for perfectionists with deadlines." Where Rails specializes in CRUD applications, Django is best known for its CMS capabilities, and it puts a strong emphasis on DRY (Don't Repeat Yourself). The Django community prefers creating reusable components and contributing back to existing projects over building single-use libraries, which helps push the greater Python community forward. While Django is a batteries-included framework, the loose coupling of its components allows flexibility and choice.

Other frameworks have found traction as well. Flask, a Sinatra-like microframework, makes use of Python's decorators for readability. Pyramid emerged from the earlier Pylons and TurboGears projects, and its documentation already offers excellent instructions for deploying to Heroku.
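To illustrate the decorator style that Flask popularized, here is a hypothetical, dependency-free sketch of decorator-based route registration. This is not Flask's actual implementation, just the core idea: a decorator maps a URL path to a handler function.

```python
# Illustrative sketch only (not Flask itself): decorator-based routing.

class App:
    def __init__(self):
        self.routes = {}

    def route(self, path):
        # Returns a decorator that registers `path` -> decorated function.
        def decorator(func):
            self.routes[path] = func
            return func
        return decorator

    def dispatch(self, path):
        # Look up and call the handler registered for `path`.
        handler = self.routes.get(path)
        if handler is None:
            return "404 Not Found"
        return handler()

app = App()

@app.route("/")
def index():
    return "Hello from Python!"

print(app.dispatch("/"))      # Hello from Python!
print(app.dispatch("/nope"))  # 404 Not Found
```

The route table stays next to the handlers it points at, which is much of what makes the style readable.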

Similarly, Python established a pattern for webserver adapters with WSGI. Many other languages have since followed suit, such as Rack for Ruby, Ring for Clojure, and PSGI/Plack for Perl.
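The WSGI contract itself is tiny: an application is just a callable that takes an `environ` dict and a `start_response` function. A minimal sketch, exercised in-process using only the standard library's `wsgiref` test helpers:

```python
from wsgiref.util import setup_testing_defaults

def app(environ, start_response):
    # Any WSGI-compliant server can host this callable unchanged.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello from WSGI!\n"]

# Call the app directly, without a network socket.
environ = {}
setup_testing_defaults(environ)  # fills in the keys a real server would provide
captured = []
body = app(environ, lambda status, headers: captured.append((status, headers)))
print(captured[0][0])                   # 200 OK
print(b"".join(body).decode(), end="")  # Hello from WSGI!
```

Because the interface is this small, servers and frameworks can evolve independently on either side of it.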

In the Wild

Perhaps most striking about Python is the breadth of different realms it has taken root in. A few examples:

  • Science and math computing, evidenced by the SciPy libraries, conferences, and books.
  • Video games, as seen in libraries such as PyGame and Cocos2d.
  • As an embedded scripting / extension language, in software such as Blender3D, Civilization IV, and EVE Online (via Stackless Python).
  • Major Linux distributions use Python for their system tools: yum and the Red Hat Network client on Red Hat and Fedora, and almost all of the GUI configuration and control panels on Ubuntu.
  • It’s one of the three official languages used by Google, alongside Java and C++.
  • And of course, internet startups: Reddit, YouTube, Disqus, Dropbox, and countless others use Python to build their businesses.

Conclusion

We anticipate that Python will be one of the most-used languages on the Heroku platform, and are overjoyed to welcome our Python brothers and sisters into the fold.

Special thanks to all the members of the Python community who helped with alpha testing, feedback, and patches on Heroku's Python support, including: David Cramer, Ben Bangert, Kenneth Love, Armin Ronacher, and Jesse Noller.

We’ll be sponsoring and speaking at PyCodeConf next week. Come chat with us about what you’d like to see out of Python on Heroku!

Further reading:

Heroku for Java

We’re pleased to announce the public beta of Heroku for Java. Java is the fourth official language available on the Cedar stack.

Java is, by many measures, the world’s most popular programming language. In addition to its large and diverse developer base, it offers a huge ecosystem of libraries and tools, an extremely well-tuned VM for fast and reliable runtime performance, and an accessible C-like syntax.

But there are also many criticisms commonly leveled against the language. We’ll take a closer look at Java’s strengths and weaknesses in a moment, but first:

Heroku for Java in 2 minutes

Create a project with three files:

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" 
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.example</groupId>
    <version>1.0-SNAPSHOT</version>
    <artifactId>helloworld</artifactId>
    <dependencies>
        <dependency>
            <groupId>org.eclipse.jetty</groupId>
            <artifactId>jetty-servlet</artifactId>
            <version>7.6.0.v20120127</version>
        </dependency>
        <dependency>
            <groupId>javax.servlet</groupId>
            <artifactId>servlet-api</artifactId>
            <version>2.5</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-dependency-plugin</artifactId>
                <version>2.4</version>
                <executions>
                    <execution>
                        <id>copy-dependencies</id>
                        <phase>package</phase>
                        <goals><goal>copy-dependencies</goal></goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

src/main/java/HelloWorld.java

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.*;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.*;

public class HelloWorld extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        resp.getWriter().print("Hello from Java!\n");
    }

    public static void main(String[] args) throws Exception {
        Server server = new Server(Integer.valueOf(System.getenv("PORT")));
        ServletContextHandler context = new ServletContextHandler(ServletContextHandler.SESSIONS);
        context.setContextPath("/");
        server.setHandler(context);
        context.addServlet(new ServletHolder(new HelloWorld()),"/*");
        server.start();
        server.join();   
    }
}

Procfile

web:    java -cp target/classes:target/dependency/* HelloWorld

Commit these files to Git:

$ git init
$ git add .
$ git commit -m init

Create an app on the Cedar stack and deploy. Your Java program and all its dependencies will be built at slug compile time:

$ heroku create --stack cedar
Creating hollow-dawn-737... done, stack is cedar
http://hollow-dawn-737.herokuapp.com/ | git@heroku.com:hollow-dawn-737.git
Git remote heroku added

$ git push heroku master
Counting objects: 9, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (5/5), done.
Writing objects: 100% (9/9), 1.36 KiB, done.
Total 9 (delta 0), reused 0 (delta 0)

-----> Heroku receiving push
-----> Java app detected
-----> Installing Maven 3.0.3..... done
-----> Installing settings.xml..... done
-----> executing .maven/bin/mvn -B -Duser.home=/tmp/build_yiuhjlk5iqs4 -s .m2/settings.xml -DskipTests=true clean install
       [INFO] Scanning for projects...
       [INFO]                                                                         
       [INFO] ------------------------------------------------------------------------
       [INFO] Building helloworld 1.0-SNAPSHOT
       [INFO] ------------------------------------------------------------------------
       ...
       [INFO] ------------------------------------------------------------------------
       [INFO] BUILD SUCCESS
       [INFO] ------------------------------------------------------------------------
       [INFO] Total time: 5.377s
       [INFO] Finished at: Mon Aug 22 16:35:58 UTC 2011
       [INFO] Final Memory: 12M/290M
       [INFO] ------------------------------------------------------------------------
-----> Discovering process types
       Procfile declares types -> web
-----> Compiled slug size is 13.6MB
-----> Launching... done, v3
       http://hollow-dawn-737.herokuapp.com deployed to Heroku

Then view your app on the web:

$ curl http://hollow-dawn-737.herokuapp.com
Hello from Java!

For more detail see:

Why Java?

Java is a solid language for building web apps:

  • The JVM is one of the best runtime VMs in the world, offering fast performance and a reliable memory footprint over time.
  • Java boasts an estimated population of six million developers, with a vast ecosystem of tools, libraries, frameworks, and literature. It is the most mature and established programming language for building server-side applications today.
  • Born at the beginning of the Internet age, Java began with the goal of "write once, run anywhere." Though it took a long time to get there, this goal has been largely achieved. The universal JVM runtime environment is available on an incredibly wide range of platforms and offers near-perfect portability between those platforms with no changes in application code, and even build artifacts are binary-compatible.

Despite these strengths, Java faces criticism from many sides. Partially, this is an inescapable effect of popularity. But many of these criticisms are valid, and reflect the downside of being a mature community: substantial legacy baggage.

To understand this better, we need to tease apart Java (the programming language) from J2EE (the "enterprise" application platform and its container model).

How J2EE Derailed Java

Java took off as a server-side programming language with the emergence of the JDBC and Servlet APIs in the late 1990s. Since then a vast number of web applications have been built using these basic APIs combined with other technologies like JSP, JSF, Struts, Spring and more. The emergence of J2EE and J2EE application servers boosted Java’s presence in the enterprise and created a lucrative software segment of J2EE middleware vendors. But it also added complexity to applications and deployment processes.

J2EE was built for a world of application distribution — that is, software packaged to be run by others, such as licensed software. But it was put to use in a world of application development and deployment — that is, software-as-a-service. This created a perpetual impedance mismatch between technology and use case. Java applications in the modern era suffer greatly under the burden of this mismatch.

As one illustration, consider the J2EE Development Roles document. It suggests an assembly-line model for development and deployment of apps, with the code passing along a chain of eight different people. This was a fairly complex and bureaucratic model that didn’t match how software was developed a decade ago, let alone today.

In Stop Wasting Money On WebLogic, WebSphere, And JBoss Application Servers, Forrester analyst Mike Gualtieri writes:

Traditional application servers or containers such as Tomcat will fast become legacy methods for deploying Java applications. The next generation is elastic application platforms (EAP) that are containerless.

In recent years, J2EE vendors have attempted to fix these problems (including a rebranding from J2EE to Java EE), but it was too little, too late. The Java space is now ripe for disruptive innovation by cloud application platforms.

Heroku for Java

If you’ve worked with Java before, the content of the hello-world sample app shown above may have surprised you. There is no "application container" in the J2EE sense; the app uses Jetty as an embedded webserver, just as one might use Unicorn for Ruby or Tornado for Python, or Jetty itself for Clojure.

The capabilities promised by J2EE application containers for managing your app include deployment, restart, logging, service binding (config), and clustering (horizontal scaling). Running your Java app on Heroku, you achieve these ends via the platform instead.

But unlike J2EE, Heroku is a polyglot platform. Techniques for deployment, logging, and scaling are applicable to all app deployments, regardless of language. A common deployment infrastructure reduces language choice to just a question of syntax and libraries. Reduced coupling between app and infrastructure enables picking the right language for each job.

A New Era for Software Delivery

Using Heroku’s platform to run Java apps finally solves the impedance mismatch between application containers designed for traditional software distribution, and the modern world of software-as-a-service.

In the classic software delivery process (development → packaging → distribution → install → deployment), code passes through many hands before it finally reaches the end user. Developers build, QA verifies, ops deploys, and only then can end users access it. In this environment, the feedback loop for information about how code behaves in production is slow and inefficient: it may take weeks or months for that information to make it back to developers, often in a highly filtered form.

Heroku is built for the new era of software-as-a-service. An app is built by a small, cross-functional, relatively independent team which builds and deploys everything itself, with few or no hand-offs to other teams. There is no packaging, distribution, or install element because the code never leaves the team/organization. This keeps developers who build the software in close touch with how it behaves in production. And it enables continuous delivery, for a tight feedback loop between customer needs and resulting software built for those needs.

Java teams are often still stuck with the classic process because it's built into the toolchain. Heroku for Java is optimized for compact applications that require robust yet agile deployment and rapid iteration. You can deploy any Java application to Heroku, including J2EE applications, but you aren't constrained by the J2EE deployment process.

Other JVM Languages

This announcement covers official support for Java the language, but developers familiar with the JVM will have noticed that it is already possible to deploy any other JVM-based language by bootstrapping with pom.xml. The JVM is an increasingly popular runtime for both new and existing languages, so Java support on Heroku makes it much easier to bootstrap any JVM language onto our platform.

For example, JRuby is one of the most frequently-requested languages on Heroku. Matthew Rodley has already put a Rails app onto JRuby on Heroku by adding JRuby to pom.xml. Scala, another common request, could be done the same way. We do look forward to being able to offer the same kind of first-class support for JRuby and Scala that we offer for Clojure; but in the meantime, bootstrapping via Java is a reasonable strategy.

Learning From Each Other

With the rise of polyglot programming, cross-talk between language communities has become much more common. Heroku’s polyglot platform further reinforces that trend.

Younger language communities have much they can learn from a mature community like Java. For example, Java began working on build automation and dependency management (via Ant, and later Maven) long before Ruby/Rails got Gem Bundler, Python got Pip, or Clojure got Leiningen. These tools owe much of their underlying theory to the learning (and battle scars) accumulated by Java build tools.

At the same time, Java has much it can learn from younger languages which are unencumbered by legacy baggage. Patterns, frameworks, and build systems in newer languages are already optimized for cloud application deployment with no left-over cruft from past eras. Java has already borrowed ideas in the framework space — see Play! or Grails for two examples. But sharing common deployment infrastructure between languages opens up the possibility for Java developers to get more exposure to deployment and scaling best practices from other communities.

The Future

Java is another milestone on the polyglot platform path, but there’s more to come. Future language packs will span the gamut from venerable (like Java) to cutting-edge (like Clojure and Node.js) to squarely in-between (like Ruby). Our desire is to be as inclusive as possible. Choice of language is up to the developer.

Heroku is driven by a simple first principle: do what’s best for developers. Supporting Java is what’s best for the large world of Java developers; it’s what’s best for developers who want to use other JVM languages; and it’s even good for users of other languages, who will benefit indirectly from the learning their community may gain from contact with Java. We’re pleased to welcome Java developers to Heroku.

Get Going

Ready to get started building Java apps on Heroku? Start with these articles in the Dev Center:

Clojure on Heroku

We’re very excited to announce official support for Clojure, going into public beta as of today. Clojure is the third official language supported by Heroku, and is available on the Cedar stack.

Clojure is a Lisp-like functional programming language which runs on the Java Virtual Machine (JVM). It offers powerful concurrency primitives based on immutable data structures, with emphasis on composability and correctness. The Clojure community is vibrant and growing quickly.

More about Clojure in a moment, but first:

Clojure on Heroku in 2 minutes

Create a project with three files:

project.clj

(defproject hello-world "0.0.1"
  :dependencies
    [[org.clojure/clojure "1.2.1"]
     [ring/ring-jetty-adapter "0.3.9"]])

src/demo/web.clj

(ns demo.web
  (:use ring.adapter.jetty))

(defn app [req]
  {:status 200
   :headers {"Content-Type" "text/plain"}
   :body "Hello from Clojure!\n"})

(defn -main []
  (let [port (Integer/parseInt (System/getenv "PORT"))]
    (run-jetty app {:port port})))

Procfile

web: lein run -m demo.web

Commit to Git:

$ git init
$ git add .
$ git commit -m init

Create an app on the Cedar stack and deploy. Your Clojure program and all its dependencies will be built at slug compile time:

$ heroku create --stack cedar
Creating young-earth-944... done, stack is cedar
http://young-earth-944.herokuapp.com/ | git@heroku.com:young-earth-944.git
Git remote heroku added

$ git push heroku master
Counting objects: 7, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (7/7), 714 bytes, done.
Total 7 (delta 0), reused 0 (delta 0)

-----> Heroku receiving push
-----> Clojure app detected
-----> Installing Leiningen
       Downloading: leiningen-1.5.2-standalone.jar
       Downloading: rlwrap-0.3.7
       Writing: lein script
-----> Installing dependencies with Leiningen
       Running: lein deps :skip-dev
       Downloading: org/clojure/clojure/1.2.1/clojure-1.2.1.pom from central
       ...
       Copying 11 files to /tmp/build_1cplwgglcalfm/lib
-----> Discovering process types
       Procfile declares types -> web
-----> Compiled slug size is 11.1MB
-----> Launching... done, v2
       http://young-earth-944.herokuapp.com deployed to Heroku

Launch a web process:

$ heroku scale web=1
Scaling web processes... done, now running 1

Then view your app on the web!

$ curl http://young-earth-944.herokuapp.com
Hello from Clojure!

Dev center: Getting Started With Clojure on Heroku/Cedar

Why Clojure?

Clojure combines the expressiveness of Lisp, the agility of a dynamic language, the performance of a compiled language, and the wide applicability of the JVM in a robust, production-ready package. Clojure is a practical language designed to support high-performance, concurrent applications which efficiently interoperate with other software in the JVM ecosystem. All of this combines to make it an ideal tool for the programmer to quickly build robust programs.

Simplicity and Composability

Clojure is known for its simplicity. Simple, in its original sense, means: single-purpose. Functions and language constructs have exactly one purpose, such as atoms for synchronous, independent state change, or protocols for polymorphism (but not any additional behaviour like encapsulation or inheritance). Single-purpose functions and confidence in small pieces of code working well on their own naturally lead to highly composable libraries. This form of simplicity is at the heart of idiomatic Clojure.

Emphasis on Correctness

The simplicity of Clojure makes it easier to "reason about correctness" – that is, look at a piece of code and be able to understand every possible effect it may have. Clojure emphasizes correctness and carefulness, reminiscent of a statically-typed language.

Like other functional programming languages, Clojure pushes you to be explicit about change and to minimize side effects. This careful methodology can be seen in Clojure's immutable data structures: if you want to change something, you wrap it in an atom, ref, agent, or other concurrency primitive. You must explicitly define what is changeable and how that change is safely managed.

Libraries and Tools

Clojure is a young language, which would normally mean a shortage of supporting libraries. But Clojure's community got a jumpstart by running on the JVM, which provides native access to the rich world of Java libraries.

In addition, the vibrant community of developers working on Clojure has taken things further by developing good tools right from the get-go. Ruby had to wait a long time before getting tools like Gem Bundler, Gemcutter, and RVM; even Rack is fairly recent. In Clojure, equivalent tools like Leiningen, Clojars, and Ring have been there from an early age, allowing Clojure to progress from a well-rounded foundation.

Clojure is very open to incorporating the best ideas from other languages. For example, the core.logic library borrows heavily from Prolog, while Incanter is based on R. The creator of Clojure, Rich Hickey, drew inspiration from a number of other languages, shown nicely on his Clojure bookshelf.

Further Reading: Rationale, from the Clojure official website

Why Clojure on Heroku?

There are three reasons why we chose Clojure as the next available language for the Heroku platform:

  1. New use cases
  2. Clojure’s still-evolving community
  3. The Heroku team loves Clojure

1. New Use Cases

Heroku believes in using the right tool for the job, and we extend this philosophy to programming languages. Software systems have become more complex and powerful, while special-purpose tools like Node.js and Clojure have become more accessible. Together, these two factors mean it increasingly makes sense to choose the programming language for a particular app based on the job at hand.

Ruby, Javascript, and Clojure are all general-purpose languages, but each excels at certain use cases. Ruby's highly dynamic nature and emphasis on beauty make it a natural fit for user-facing web apps. Node.js's evented concurrency makes it a great fit for the realtime web. Clojure covers a new use case on the Heroku platform: components that demand correctness, performance, and composability, with optional access to the Java ecosystem.

Open source / tool examples:

  • Pallet, cloud automation utilizing the JClouds library.
  • Cascalog, a Clojure-based query language for Hadoop.
  • Incanter, a platform for statistical analysis.
  • FleetDB, a NoSQL database.

Companies using Clojure:

  • Relevance is a development shop, Heroku partner, and home to Clojure/core. In addition to their heavy involvement in the Clojure community, they use the language for suitable projects; see, for example, one client's description of their use of Clojure for a rule-processing engine.
  • FlightCaster, which uses Clojure to interface to Hadoop for machine learning.
  • Pulse is an internal real-time metrics tool for the Heroku platform kernel. It’s a distributed, multi-process-type Clojure application that heavily uses several key aspects of Clojure: functional data processing, concurrency, JVM platform and library support. It runs as a Heroku app with no special privileges.
  • Many other companies, ranging from startups like BankSimple to established companies like Akamai, are using Clojure.

These examples show how Clojure support may lead to a greater variety of apps deployed on the Heroku platform. While it's entirely possible to write a statistical-analysis package in Ruby or a metrics-processing tool in Node.js, Clojure will often be a better fit for these cases and others.

2. Clojure’s Still-evolving Community

The Clojure community grew out of the shared goal of developing a modern and forward-looking yet practical and performant programming language. Developers are drawn to Clojure by its elegant design and robust practicality.

Though growing quickly, the Clojure community is still small enough to be approachable and accepting of new ideas. This is crucial for a platform like Heroku, which offers a deployment workflow that is a radical departure from server-based deployments. Language communities with heavy investment in traditional deployment methods will find it harder to adapt to the Heroku way. Like Node.js today, or Ruby in 2009, Clojure's small but fast-growing community means there's an opportunity for the Heroku platform and the Clojure community to work together on evolving best practices for Clojure deployment.

3. The Heroku Team Loves Clojure

Heroku has always focused on languages that we ourselves use and love. Ruby was the first, and Javascript/Node.js was the second. Clojure is a language that is rapidly growing in use and esteem on our engineering team.

Mark McGranaghan, lead engineer on Heroku’s platform infrastructure, first brought Clojure to Heroku. As the author of Ring (a Rack- or WSGI-like adapter for web apps), he’s an active member of the Clojure community and has been a strong voice for our support of Clojure since he joined our team.

We want to deliver a platform that offers an end-to-end developer experience that feels right. "Feels right" is an attribute that can only be judged by developers who use the language in question on a daily basis and belong to that language’s community. We use and love Clojure, and that means we can use our own first-hand judgement on the "feels right" attribute of Heroku’s Clojure support.

Get Going

Ready to start building Clojure apps on Heroku? Start with these articles in the Heroku Dev Center:

Further reading on Clojure in general:


Special thanks to James Reeves, Phil Hagelberg, and Chris Redinger for alpha-testing Clojure on Heroku and contributing to this post.

The New Heroku (Part 4 of 4): Erosion-resistance & Explicit Contracts

In 2006, I wrote Catapult: a Quicksilver-inspired command-line for the web. I deployed it to a VPS (Slicehost), then gave the URL out to a few friends. At some point I stopped using it, but some of my friends remained heavy users. Two years later, I got an email: the site was down.

Logging into the server with ssh, I discovered many small bits of breakage:

  • The app’s Mongrel process had crashed and not restarted.
  • Disk usage was at 100%, due to growth of logfiles and temporary session data.
  • The kernel, ssh, OpenSSL, and Apache needed critical security updates.

The Linux distro had just reached end-of-life, so the security fixes were not available via apt-get. I tried to migrate to a new VPS instance with an updated operating system, but this produced a great deal more breakage: missing Ruby gems, hardcoded filesystem paths in the app which had changed in the new OS, changes in some external tools (like ImageMagick). In short, the app had decayed to a broken state, despite my not having made any changes to the app’s code. What happened?

I had just experienced a powerful and subtle force known as software erosion.

Software Erosion is a Heavy Cost

Wikipedia says software erosion is "slow deterioration of software over time that will eventually lead to it becoming faulty [or] unusable" and, importantly, that "the software does not actually decay, but rather suffers from a lack of being updated with respect to the changing environment in which it resides." (Emphasis added.)

If you’re a developer, you’ve probably built hobby apps, or done small consulting projects, that resulted in apps like Catapult. And you’ve probably experienced the pain of minor upkeep costs over time, or eventual breakage when you stop paying those upkeep costs.

But why does it matter if hobby apps break?

Hobby apps are a microcosm which illustrate the erosion that affects all types of apps. The cost of fighting erosion is highest on production apps — much higher than most developers realize or admit. In startups, where developers tend to handle systems administration, anti-erosion work is a tax on their time that could be spent building features. On more mature projects, dedicated sysadmins spend a huge portion of their time fighting erosion: everything from failed hardware to patching kernels to updating entire OS/distro versions.

Reducing or eliminating the cost of fighting software erosion is of huge value, to small hobby and prototype apps and large production apps alike.

Heroku, the Erosion-resistant Platform

Heroku’s new runtime stack, Celadon Cedar, makes erosion-resistance a first-class concern.

This is not precisely a new feature. Rather, it is a culmination of what we’ve learned over the course of three years of being responsible for the ongoing upkeep of infrastructure supporting 150k apps. While all of our runtime stacks offer erosion-resistance to some degree, Cedar takes it to a new level.

The evidence that Heroku is erosion-resistant can be found in your own Heroku account. If you’re a longtime Heroku user, type heroku apps, find your oldest app, and try visiting it on the web. Even if you haven’t touched it in years, you’ll find that (after a brief warm-up time) it comes up looking exactly as it did the last time you accessed it. Unlike an app running on a VPS or other server-based deploy, the infrastructure on which your app is running has been updated with everything from kernel security updates to major overhauls in the routing and logging infrastructure. The underlying server instances have been destroyed many times over while your app’s processes have been seamlessly moved to new and better homes.

How Does Erosion-resistance Work?

Erosion-resistance is an outcome of strong separation between the app and the infrastructure on which it runs.

In traditional server-based deployments, the app's source code, config, processes, and logs are deeply entangled with the underlying server setup. The app touches the OS and network infrastructure in a hundred implicit places, from system library versions to hardcoded IP addresses and hostnames. This makes anti-erosion tasks, like moving the app to a new cluster of servers, a highly manual, time-consuming, and error-prone procedure.

On Heroku, the app and the platform it runs on are strongly separated. Unlike a Linux or BSD distribution, which gets major revisions every six, twelve, or eighteen months, Heroku’s infrastructure is improving continuously. We’re making things faster, more secure, more robust against failure. We make these changes on nearly a daily basis, and we can do so with the confidence that this will not disturb running apps. Developers on those apps need not know or care about the infrastructure changes happening beneath their feet.

How do we achieve strong separation of app and infrastructure? This leads us to the core principle that underlies erosion-resistance and much of the value of the platform deployment model: explicit contracts.

Explicit Contract Between the App and the Platform

Preventing breakage isn’t a matter of never changing anything, but of changing in ways that don’t break explicit contracts between the application and the platform. Explicit contracts are how we can achieve almost 100% orthogonality between the app (allowing developers to change their apps with complete freedom) and the platform (allowing Heroku to change the infrastructure with almost complete freedom). As long as both parties adhere to the contract, both have complete autonomy within their respective realms.

Here are some of the contracts between your app running on the Cedar stack and the Heroku platform infrastructure:

  • Dependency management – You declare the libraries your app depends on, completely and exactly, as part of your codebase. The platform can then install these libraries at build time. In Ruby, this is accomplished with Gem Bundler and Gemfile. In Node.js, this is accomplished with NPM and package.json.
  • Procfile – You declare how your app is run with Procfile, and run it locally with Foreman. The platform can then determine how to run your app and how to scale out when you request it.
  • Web process binds to $PORT – Your web process binds to the port supplied in the environment and waits for HTTP requests. The platform thus knows where to send HTTP requests bound for your app’s hostname.
  • stdout for logs – Your app prints log messages to standard output, rather than to framework-specific or app-specific log paths which would be difficult or impossible for the platform to guess reliably. The platform can then route those log streams to a central, aggregated location with Logplex.
  • Resource handles in the environment – Your app reads config for backing services such as the database, memcached, or the outgoing SMTP server from environment variables (e.g. DATABASE_URL), rather than hardcoded constants or config files. This allows the platform to easily connect add-on resources (when you run heroku addons:add) without needing to touch your code.
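
The environment-variable contract is easy to see in code. Here’s a minimal Ruby sketch of an app reading its backing-service config and port from the environment; the DATABASE_URL value below is made up for illustration, not a real credential:

```ruby
require "uri"

# Read config from the environment rather than hardcoded constants.
# (Values set here are illustrative stand-ins for what the platform
# would inject.)
ENV["DATABASE_URL"] = "postgres://user:secret@db.example.com:5432/myapp"
ENV["PORT"]         = "5000"

db   = URI.parse(ENV["DATABASE_URL"])
port = Integer(ENV["PORT"])

puts "database: #{db.host}:#{db.port}#{db.path} (user: #{db.user})"
puts "web process binds to port #{port}"
```

Because the app never hardcodes these values, the platform (or your local Foreman setup) can swap them freely without touching your code.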

These contracts are not only explicit, but designed in such a way that they shouldn’t have to change very often.

Furthermore, these contracts are based on language-specific standards (e.g. Bundler/NPM) or time-proven unix standards (e.g. port binding, environment variables) whenever possible. Well-written apps are likely already using these contracts or some minor variation on them.

An additional concern when designing contracts is avoiding designs that are Heroku-specific in any way, as that would result in vendor lock-in. We invest heavily in ensuring portability for your apps and data, as it’s one of our core principles.

Properly designed contracts offer not only strong separation between app and platform, but also easy portability between platforms, or even between a platform and a server-based deployment.

Conclusion

Erosion is a problem; erosion-resistance is the solution. Explicit contracts are the way to get there.

Heroku is committed to keeping apps deployed to our platform running, which means we’re fighting erosion on your behalf. This saves you and your development team from the substantial costs of the anti-erosion tax. Cedar is our most erosion-resistant stack yet, and we look forward to seeing it stand the test of time.


Other Posts From This Series

The New Heroku (Part 1 of 4): The Process Model & Procfile

In the beginning was the command line. The command line is a direct and immediate channel for communicating with and controlling a computer. GUIs and menus are like pointing and gesturing to communicate; whereas the command line is akin to having a written conversation, with all the nuance and expressiveness of language.

This is not lost on developers, for whom the command prompt and blinking cursor represent the potential to run anything, to do anything. Developers use the command line for everything from starting a new project (rails new) to managing revisions (git commit) to launching secondary, more specialized command lines (psql, mysql, irb, node).

With Celadon Cedar, Heroku’s new runtime stack, the power and expressiveness of the command line can be scaled out across a vast execution environment. Heroku provides access to this environment through an abstraction called the process model. The process model maps command-line commands to app code, creating a collection of processes which together form the running app.

But what does this mean for you, as an app developer? Let’s dive into the details of the process model, and see how it offers a new way of thinking about how to build and run applications.

The Foundation: Running a One-Off Process

The simplest manifestation of the process model is running a one-off process. On your local machine, you can cd into a directory with your app, then type a command to run a process. On Heroku’s Cedar stack, you can use heroku run to launch a process against your deployed app’s code on Heroku’s execution environment (known as the dyno manifold).

A few examples:

$ heroku run date
$ heroku run curl http://www.google.com/
$ heroku run rails console
$ heroku run rake -T
$ heroku run rails server

At first glance, heroku run may seem similar to ssh, but the resemblance ends with the fact that the command is run remotely. In contrast to ssh, each of these commands runs on its own fresh, stand-alone dyno, potentially in a different physical location each time. Each dyno is fully isolated, starts up with a pristine copy of the app’s compiled filesystem, and the entire dyno (including process, memory, and filesystem) is discarded when the process launched by the command exits or is terminated.

The command heroku run rails server launches a webserver process for your Rails app. Running a webserver in the foreground as a one-off process is not terribly useful: for general operation, you want a long-lived process that exists as a part of a fleet of such processes which do the app’s business. To achieve this, we’ll need another layer on top of the single-run process: process types, and a process formation.

Defining an App: Process Types via Procfile

A running app is a collection of processes. This is true whether you are running it on your local workstation, or as a production deploy spread out across many physical machines. Historically, there has been no single, language-agnostic, app-centric method for defining the processes that make up an app. To solve this, we introduce Procfile.

Procfile is a format for declaring the process types that describe how your app will run. A process type declares its name and a command-line command: this is a prototype which can be instantiated into one or more running processes.

Here’s a sample Procfile for a Node.js app with two process types: web (for HTTP requests), and worker (for background jobs).

Procfile

web:     node web.js
worker:  node worker.js
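
The format is simple enough that you could parse it yourself in a few lines. Here’s a minimal Ruby sketch (not Heroku’s or Foreman’s actual parser) that turns the Procfile above into a hash of process types:

```ruby
# Parse the "name: command" lines of a Procfile into a hash mapping
# process type to its command (minimal illustrative sketch).
procfile = <<~PROCFILE
  web:     node web.js
  worker:  node worker.js
PROCFILE

process_types = procfile.each_line.map { |line|
  name, command = line.split(":", 2)
  [name.strip, command.strip] if command
}.compact.to_h
```

Each entry is a prototype: a name plus a command, ready to be instantiated into one or more running processes.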

In a local development environment, you can run a small-scale version of the app by launching one process for each of the two process types with Foreman:

$ gem install foreman
$ foreman start
10:14:40 web.1     | started with pid 13998
10:14:40 worker.1  | started with pid 13999
10:14:41 web.1     | Listening on port 5000
10:14:41 worker.1  | Worker ready to do work

The Heroku Cedar stack has baked-in support for Procfile-backed apps:

$ heroku create --stack cedar
$ git push heroku master
...
-----> Heroku receiving push
-----> Node.js app detected
...
-----> Discovering process types
       Procfile declares types -> web, worker

This Procfile-backed app is deployed to Heroku. Now you’re ready to scale out.

Scaling Out: The Process Formation

Running locally with Foreman, you only need one process for each process type. In production, you want to scale out to much greater capacity. Thanks to a share-nothing architecture, each process type can be instantiated into any number of running processes. Processes of the same type share the same command and purpose, but run as separate, isolated processes in different physical locations.

Cedar provides the heroku scale command to make this happen:

$ heroku scale web=10 worker=50
Scaling web processes... done, now running 10
Scaling worker processes... done, now running 50

Like heroku run, heroku scale launches processes. But instead of asking for a single, one-shot process attached to the terminal, it launches a whole group of them, starting from the prototypes defined in your Procfile. The shape of this group of running processes is known as the process formation.

In the example above, the process formation is ten web processes and fifty worker processes. After scaling out, you can see the status of your new formation with the heroku ps command:

$ heroku ps
Process       State               Command
------------  ------------------  ------------------------------
web.1         up for 2s           node web.js
web.2         up for 1s           node web.js
...
worker.1      starting for 3s     node worker.js
worker.2      up for 1s           node worker.js
...

The dyno manifold will keep these processes up and running in this exact formation, until you request a change with another heroku scale command. Keeping your processes running indefinitely in the formation you’ve requested is part of Heroku’s erosion-resistance.
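
One way to picture what heroku scale does is as reconciling the formation you’ve requested with the processes currently running. The sketch below is hypothetical, not Heroku’s actual implementation, and the reconcile helper is invented for illustration:

```ruby
# Compute which numbered processes to start or stop to move from the
# current formation to the desired one (hypothetical sketch).
def reconcile(current, desired)
  actions = { start: [], stop: [] }
  (current.keys | desired.keys).each do |type|
    have = current.fetch(type, 0)
    want = desired.fetch(type, 0)
    if want > have
      ((have + 1)..want).each { |i| actions[:start] << "#{type}.#{i}" }
    elsif want < have
      ((want + 1)..have).each { |i| actions[:stop] << "#{type}.#{i}" }
    end
  end
  actions
end
```

Reconciling a formation of two web processes against a request for ten web and fifty worker processes would start web.3 through web.10 plus fifty workers; the dyno manifold then holds that formation until the next scale request.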

Run Anything

The process model, heroku run, and heroku scale open up a whole new world of possibilities for developers like you working on the Heroku platform.

A simple example: swap out the webserver and worker system used for your Rails app (Heroku defaults to Thin and Delayed Job), and use Unicorn and Resque instead:

Gemfile

gem 'unicorn'
gem 'resque'
gem 'resque-scheduler'

Procfile

web:     bundle exec unicorn -p $PORT -c ./config/unicorn.rb
worker:  bundle exec rake resque:work QUEUE=*
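
The Procfile above points at a config/unicorn.rb. Here’s a minimal sketch of what such a file might contain; the values are illustrative, and Unicorn supports many more directives:

```ruby
# config/unicorn.rb -- minimal illustrative Unicorn config.
worker_processes 3   # forked workers per dyno; tune to available memory
timeout 30           # restart workers stuck for longer than 30 seconds
preload_app true     # load app code once, before forking workers
```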

For background work, you can run different types of workers consuming from different queues. Add a clock process as a flexible replacement for cron using resque-scheduler:

Procfile

worker:    bundle exec rake resque:work QUEUE=*
urgworker: bundle exec rake resque:work QUEUE=urgent
clock:     bundle exec resque-scheduler

Goliath is an innovative new EventMachine-based evented webserver. Write a Goliath-based app and you’ll be able to run it from your Procfile:

Gemfile

gem 'goliath'

Procfile

web: bundle exec ruby hello_goliath.rb -sv -e prod -p $PORT

Or how about a Node.js push-based pubsub system like Juggernaut or Faye?

package.json

{
  "name": "myapp",
  "version": "0.0.1",
  "dependencies": {
    "juggernaut": "2.0.5"
  }
}

Procfile

web: node_modules/.bin/juggernaut

This is just a taste of what you can do with Procfile. The possibilities are nearly limitless.

Conclusion

The command line is a simple, powerful, and time-honored abstraction. Procfile is a layer on top of the command line for declaring how your app gets run. With Cedar, heroku scale becomes your distributed process manager, and heroku run becomes your distributed command line.

We’ve only just seen the beginning of what the process model can do: over the next year, Heroku will be adding new language runtimes, new routing capabilities, and new types of add-on resources. The sky’s the limit, and we can’t wait to see what inventive new kinds of apps developers like you will be building.


Other Posts From This Series