DGC IV: Confluence Upgrades

This blog post is part of the DevOps Guide to Confluence series. In this chapter of the guide, we’ll have a look at Confluence upgrades.

Confluence Release History and Track Record

I started using Confluence at around version 2.4.4 (released March 2007). A lot has changed since then, mostly for the better. In my early days, Atlassian was spitting out one release after another, typically 3 weeks or less apart, followed by a major release every 3 months. You can check out the full release history on their wiki.

This changed later on, and recently there have been fewer minor releases and bigger major releases delivered every 3.5-4 months. Depending on your point of view this is good or bad. It now takes longer to get awaited features and fixes, but on the other hand the releases are more solid and better tested.

For major releases, Atlassian now usually offers an Early Access Program, which gives you access to milestone builds so that you can see and mold the new stuff before it ships.

Contrary to the past, the minor versions have been very stable lately and have contained only bugfixes, so it is generally safe to upgrade without a lot of hesitation.

The same can’t be said about major releases. Even though the stability of x.y.0 releases has been dramatically improving lately, I still consider it risky for a big site to upgrade soon after a major release is announced. Wait for the first bugfix release (x.y.1), monitor the bug tracker, knowledge base and forums, and then consider the upgrade.
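The rule of thumb above (take bugfix releases, hold off on x.y.0) can be written down as a small check. This is only a sketch of my reading of that policy: the function names and the "skip" / "wait" / "consider" labels are made up for illustration, and the check still leaves the bug tracker and forum monitoring to a human.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Turn '3.3' or '3.3.1' into a 3-part tuple, padding with zeros."""
    if not re.fullmatch(r"\d+(\.\d+){0,2}", version):
        raise ValueError(f"not an x.y or x.y.z version: {version!r}")
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def is_initial_major_release(version: str) -> bool:
    """True for x.y / x.y.0, the first build of a major release."""
    return parse_version(version)[2] == 0


def upgrade_advice(current: str, candidate: str) -> str:
    """Return 'skip', 'wait' or 'consider' for a candidate release."""
    if parse_version(candidate) <= parse_version(current):
        return "skip"      # not newer than what is already running
    if is_initial_major_release(candidate):
        return "wait"      # hold off until the first bugfix release (x.y.1)
    return "consider"      # bugfix release: review the notes, then plan it


print(upgrade_advice("3.2.1", "3.3"))    # wait
print(upgrade_advice("3.2.1", "3.3.1"))  # consider
```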

Having gone through many upgrades myself, I think that it is a good practice to stay up to date with your Confluence site. We have usually been at most one major version behind and frequently on the latest version, while, as I mentioned, avoiding the x.y.0 releases. This has been working well for us.

Staying in Touch and Getting Support

In order to know what’s going on with Confluence releases, it is a good idea to subscribe to the Confluence Announcements mailing list. This is a very low traffic mailing list used for release and security announcements only.

Atlassian’s tech writers usually do a good job at creating informative release notes, upgrade notes and security advisories, so be sure to read those for each release (even if you are skipping some).

There are several other channels through which people working on Confluence (plugin) development can communicate and support each other, these include:

Despite Atlassian’s claims about their legendary support, I found the official support channel rarely useful. Being a DIY guy and having a reasonable knowledge of Confluence internals, I usually found myself in need of more qualified support than what the support channel was created for. For this reason my occasional support tickets usually ended up being escalated to the development team, instead of being handled by the support team.

On the other hand, the public issue tracker has been an invaluable source of information and a great communication tool. I wish that more of my bug reports had been addressed, but for the most part I have been receiving a reasonable amount of attention, even though sometimes I had to request escalation to have someone look at and fix issues that were critical for us.

The biggest hurdle I’ve been experiencing with bug fixes and support was that sites of our size are not the main focus for Atlassian and they are not hesitant to be open about it. I often shake my head when I see features of little value (for us that is – because they target small deployments and have little to do with core wiki functionality) being implemented and promoted, while major architectural issues, bugs and highly anticipated features go without attention for years. Just browse the issue tracker and you’ll get the idea.

Confluence Upgrades

The core of the upgrade procedure will depend on the build distribution type you use (standalone, war, building from source), but fundamentally in all cases, you need to shut down your Confluence, replace your app (standalone or war) with the new version and then start it again. An automated upgrade process will take care of updating the database schema, rebuilding the search index and other tasks required for a successful upgrade.

That was the good news. The bad news is that there is a lot more work to be done in order to upgrade a site successfully with as little downtime as possible.

Dev and Test Deployments and Testing

Before you upgrade the real thing, you should first get familiar with the release by upgrading your dev and test environments.

It’s often handy to invite your users to do a brief UAT (user acceptance testing) on your test instance as they might catch something that you or your automated tests haven’t.

Picking the Outage Window

Based on your users’ usage patterns (as easily identified by web analytics solutions like Google Analytics), you should pick a time when the usage is low. For our global site this has been early mornings at around 4:30 or 5am PT.

When it comes to picking a day, we usually stuck with Tuesdays, Wednesdays or Thursdays. Nobody wants to be dealing with an issue during a weekend when internal (infrastructure) or external (Atlassian) support is harder to get hold of.

You also want to communicate the planned outage to your users in advance, so that they are not caught by surprise by an outage on a day when they are releasing important documents on the wiki.

As far as outage duration goes, we usually plan for a 30min outage during a 1 hour window and most of the time have been able to bring the site back online within 30min or less.
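If you can export hourly request counts from your analytics tool, finding the quietest window is a few lines of code. The sketch below assumes a plain list of 24 numbers (one per hour of the day, in whatever time zone you report in); the function name and the sample numbers are invented for illustration and are not our real traffic.

```python
from typing import Sequence


def quietest_window(hourly_requests: Sequence[int], window_hours: int = 1) -> int:
    """Return the start hour (0-23) of the least busy window.

    The window may wrap around midnight, e.g. 23:00-01:00.
    """
    if len(hourly_requests) != 24:
        raise ValueError("expected exactly 24 hourly values")
    if not 1 <= window_hours <= 24:
        raise ValueError("window_hours must be between 1 and 24")

    best_start, best_total = 0, None
    for start in range(24):
        total = sum(hourly_requests[(start + i) % 24] for i in range(window_hours))
        if best_total is None or total < best_total:
            best_start, best_total = start, total
    return best_start


# Made-up sample: flat traffic with a dip in the early morning.
sample = [50] * 24
sample[4], sample[5] = 5, 10
print(quietest_window(sample, window_hours=1))  # 4
```

Ties go to the earliest hour, which is an arbitrary choice; you will still want to sanity-check the result against the weekday rule above.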

Ready, Set, Go!

The actual deployment consists of several steps, which in our case are:

  • disabling load balancing for both nodes (which automatically triggers redirection of all requests to a maintenance page hosted elsewhere)
  • shutting down both nodes
  • disabling MySQL replication between the master and slave db
  • taking ZFS snapshot of the Confluence Home directory
  • taking ZFS snapshot of the MySQL db filesystem on the master
  • deploying the new war file
  • starting one node (while the loadbalancer still ignores it)
  • watching container and Confluence logs for any signs of problems

At this point, we have one of our nodes up and running (hopefully :-)). We can log in with an admin account and check if everything works as expected. The next tasks include:

  • upgrading installed plugins
  • upgrading custom theme (if there is one)
  • running a bunch of automated or manual tests, just to verify that everything is ok

If things are looking good, we can allow the load balancer to start sending requests to our upgraded node. Continue watching logs and eventually deploy the war on the second node and re-enable the MySQL replication.

If any issues occur during the deployment, we can simply:

  • shut down the upgraded node
  • revert to the latest Confluence Home snapshot
  • revert to the latest MySQL db snapshot
  • redeploy the older version of war file
  • either retry the deployment or re-enable the load balancer and work on resolving the issues outside of the production environment
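The deploy-then-roll-back flow described in these lists can be sketched as a tiny runner in which every step carries its own undo action, and a failure undoes the completed steps in reverse order. This is an illustration only: the step names come from the lists above, but run_with_rollback and the placeholder actions are hypothetical and do not call any real ZFS, MySQL or load balancer commands.

```python
from typing import Callable, List, Tuple

# (step name, action, undo action)
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_rollback(steps: List[Step]) -> List[str]:
    """Run steps in order; on the first failure undo completed steps in reverse."""
    log: List[str] = []
    completed: List[Tuple[str, Callable[[], None]]] = []
    for name, action, undo in steps:
        try:
            action()
        except Exception as exc:
            log.append(f"FAILED {name}: {exc}")
            for done_name, done_undo in reversed(completed):
                done_undo()
                log.append(f"undone {done_name}")
            return log
        log.append(f"done {name}")
        completed.append((name, undo))
    return log


def placeholder() -> None:
    """Stand-in for a real command (hypothetical; does nothing)."""


steps: List[Step] = [
    ("disable load balancing", placeholder, placeholder),
    ("shut down both nodes", placeholder, placeholder),
    ("snapshot Confluence Home", placeholder, placeholder),
    ("snapshot MySQL filesystem", placeholder, placeholder),
    ("deploy new war", placeholder, placeholder),
]

for line in run_with_rollback(steps):
    print(line)
```

In a real script the undo for "deploy new war" would redeploy the old war and the undo for the snapshot steps would revert to them; whether you script this in Python, shell or Capistrano matters less than having the order written down and rehearsed.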

In my experience from all the dev, test and prod deployments, we’ve had to roll back and redo an upgrade from scratch only once or twice. It’s very unlikely that you’ll have to do it, but it’s better to be ready than sorry.

If you are building Confluence from patched sources and deploy your own builds frequently, then you might want to consider automating your deployments with tools like Capistrano. This will save you a lot of time and make the deployments more reliable and consistent.

Conclusion

If you do your homework, Confluence is quite easy to upgrade. It’s unfortunate that the entire cluster must be shut down for an upgrade even between minor releases, but if you plan your deployment well, you will be able to minimize the downtime to just a few minutes outside of peak hours.

In the next chapter of this guide, we’ll take a look at customizing and patching Confluence.

RPCFN: Cycle Tracks (#12)

Ruby Programming Challenge For Newbies

By David Griffiths

Today, we complete one year of Ruby Programming Challenge for Newbies. RubyLearning is grateful to all the Ruby experts and participants who have actively helped make these challenges interesting and popular.

About David Griffiths

In David’s own words: “I’m an agile developer, writer and trainer based in the UK. I used to write a monthly Java development column and I’ve used and taught agile methods to companies around the UK. But I’ve been writing code since I was 12 years old. I worked with Java from the alpha release in the 90s. A lot of things on the client side as well as a lot of enterprise stuff. But everything changed for me when I got an early copy of Bruce Tate’s Up and Running with Ruby on Rails. I found myself in Boston for 3 days with nothing else to do. And anyone who’s been to Boston knows that it’s famous for two things: coffee and book shops. So I got a copy of Tate’s book and a laptop and spent three days burying myself deep into Rails and consuming more caffeine than was probably wise. It was incredible. Here was a way of doing things with a few simple commands, that would have taken 19 classes, an enterprise container and 3-400 lines of XML in Java. An old professor once told me “The profound is always simple” – and Rails was the living embodiment of that. I was hooked. It really hasn’t been the same since.”

Prizes

  • The participant with the best Ruby solution (if there is a tie between answers, then the one who posted first will be the winner) will be awarded any one of PeepCode’s Ruby on Rails screencasts.
  • From the remaining working Ruby solutions, three participants would be selected randomly and each one would be awarded any one of Pragmatic’s The Ruby Object Model and Metaprogramming screencasts.

The four persons who win, can’t win again in the next immediate challenge but can still participate.

The Ruby Challenge

The Challenge

The entire challenge details are available here.

Ensure that you submit both the solutions – see pages 2 and 3.

How to Enter the Challenge

Read the Challenge Rules. By participating in this challenge, you agree to be bound by these Challenge Rules. It’s free and registration is optional. You can enter the challenge just by posting the following as a comment to this blog post:

  1. Your name:
  2. Country of Residence:
  3. GIST URL of your Solution (i.e. Ruby code) with explanation and / or test cases:
  4. Code works with Ruby 1.8 / 1.9 / Both:
  5. Email address (will not be published):
  6. Brief description of what you do (will not be published):

Note:

  • As soon as we receive your GIST URL, we will fork your submission. This means that your solution is frozen and accepted. Please be sure that is the solution you want, as it is now recorded in time and is the version that will be evaluated.
  • All solutions posted would be hidden to allow participants to come up with their own solutions.
  • You should post your entries before midnight of 29th Aug. 2010 (Indian Standard Time). No new solutions will be accepted from 30th Aug. onwards.
  • On 30th Aug. 2010 all the solutions will be thrown open for everyone to see and comment upon.
  • The winning entries will be announced on this blog before 5th Sept. 2010. The winners will be sent their prizes by email.

More details on the RPCFN?

Please refer to the RPCFN FAQ for answers to the following questions:

Donations

RPCFN is entirely financed by RubyLearning and sometimes sponsors, so if you enjoy solving Ruby problems and would like to give something back by helping with the running costs then any donations are gratefully received.

Acknowledgements

Special thanks to:

  • David Griffiths.
  • GitHub, for giving us access to a private repository on GitHub to store all the submitted solutions.
  • The RubyLearning team.

Questions?

Contact Satish Talim at satish [dot] talim [at] gmail.com OR if you have any doubts / questions about the challenge (the current problem statement), please post them as comments to this post and the author will reply asap.

The Participants

There are two categories of participants. Some are vying for the prizes and some are participating for the fun of it.

In the competition

  1. Alex Chateau, Latvia
  2. Kirill Shchepelin, Russia
  3. Sebastian Rabuini, Argentina
  4. Santosh Wadghule, India
  5. Juan Gomez, USA
  6. Julio C. Villasante, Cuba
  7. Paul McKibbin, UK
  8. Viktor Nemes, Hungary

Just for Fun

  1. Dmytrii Nagirniak, Australia
  2. Benoit Daloze, Belgium
  3. Cary Swoveland, Canada

The Winners

Congratulations to the winners of this Ruby Challenge. They are:

Previous Challenge

RPCFN: The Game of Life (#11) by Elise Huard.

Note: All the previous challenges, sponsors and winners can be seen on the Ruby Programming Challenge for Newbies page.

Update

  • This challenge is now closed.
  • The (#13) challenge by Bruce Scharlau, U.K. is scheduled for Sept. 2010.

Cloud Computing To Drive Server Hardware

Cloud computing will be a key force behind hardware sales over the next few years, IDC believes. In fact, the investments are significant enough to be considered a whole new era in IT infrastructures and not just replacements, which highlights the quickly increasing interest in cloud computing overall.

DGC III: Confluence Configuration and Tuning

This blog post is part of the DevOps Guide to Confluence series. In this chapter of the guide, we’ll have a look at Confluence configuration and tuning.

There are four ways to modify Confluence’s runtime behavior:

  • Config Files in Confluence Home directory
  • Config Files in WEB-INF/classes
  • JVM Options
  • Admin UI

Config Files in Confluence Home directory

The Confluence Home directory contains one or more config files that control the runtime behavior of Confluence. The most important file is confluence.cfg.xml, which must be present in order for Confluence to start. This file can be modified by hand while Confluence is shut down, but it also gets modified by Confluence occasionally (mostly during upgrades). Your changes will be preserved, as long as you made them while Confluence was offline.

Another relevant file is tangosol-coherence-override.xml which must unfortunately be used to override Confluence’s lame multicast configuration needed for cluster configuration (see below).

Lastly there is config/confluence-coherence-cache-config-clustered.xml which contains configuration of the Confluence cache. Generally you don’t want to modify this file by hand. I’ll come back to talk about cache configuration later in the Admin UI section of this chapter.

In general it is advisable to be very consistent about your environment, so that you can then just have a single version of these files that you can distribute on all servers when needed. This includes the directory layout, network interface names, and so on.

A combination of the first two files will allow you to configure the following:

Clustering

As I mentioned, this configuration is split between two config files. confluence.cfg.xml contains confluence.cluster.* properties, which allow you to set multicast IP, interface and TTL, but not the port. Only tangosol-coherence-override.xml can do that.

The cluster IP is by default derived from a “cluster name” specified via the Admin UI or installation wizard. For some reason Atlassian believes that in an enterprise environment one can just let software pick a random IP and port to run multicast on. I don’t know of any serious datacenter where things work this way. You’ll likely want to explicitly set the IP, port, interface name and TTL, and the only way to do that is by modifying these files by hand and ignoring the “cluster name” setting in the UI. Make sure that the settings are consistent in both files.

DB Connection Pool

Confluence comes with an embedded connection pool. I believe that you can use your own too (if it comes with your servlet container), but I’d suggest sticking with the embedded one since it is widely used and Atlassian runs their tests with it as well. The pool is configured via confluence.cfg.xml and its hibernate.c3p0.* properties. The most important property is the pool max_size, which prevents the pool from opening more than a defined number of connections at a time. You want this number to be higher than your typical peak concurrent request count (are you monitoring that?), but not higher than what your db can handle. We have ours set to 300, which is double our occasional peaks. Don’t forget that in order to take advantage of these connections, you’ll likely need to also increase the worker thread count in your servlet container.
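The sizing rule above boils down to simple arithmetic. Here is a rough sketch of it: the 2x headroom mirrors our 300-for-peaks-of-150 setting, while the function name, the per-node split of the database limit and the sample database limits are my assumptions for illustration, not something Confluence or c3p0 computes for you.

```python
import math


def suggest_pool_max_size(peak_concurrent_requests: int,
                          db_max_connections: int,
                          nodes: int = 1,
                          headroom: float = 2.0) -> int:
    """Suggest a per-node pool max_size.

    Aim for `headroom` times the observed peak, but never more than this
    node's share of what the database can handle (assumption: every node
    runs its own pool, so the db limit is split evenly between nodes).
    """
    if peak_concurrent_requests <= 0 or db_max_connections <= 0 or nodes <= 0:
        raise ValueError("all inputs must be positive")
    wanted = math.ceil(peak_concurrent_requests * headroom)
    per_node_cap = db_max_connections // nodes
    return min(wanted, per_node_cap)


print(suggest_pool_max_size(150, db_max_connections=1000, nodes=2))  # 300
print(suggest_pool_max_size(150, db_max_connections=400, nodes=2))   # 200
```

If the second case applies to you, the database is the bottleneck and raising the pool size further will not help.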

DB Connection

The connection is configured via hibernate.connection.* properties in confluence.cfg.xml. Depending on your db, you might need to specify several settings for the connection to work well and grok UTF-8. For our MySQL db, we need to set the connection url to something like

jdbc:mysql://server:3306/wikisdb?autoReconnect=true&useUnicode=true&characterEncoding=utf8

Note that if you are editing this file by hand, you must escape illegal xml characters. More info about db connection can be found in the Confluence documentation.
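To illustrate the escaping: the ampersands in the URL above have to become &amp; once the value sits inside an XML file. A quick way to get the escaped form is Python's standard library. The `property` element name below is only there to make the round-trip check parse; treat it as illustrative rather than as the exact layout of confluence.cfg.xml.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

JDBC_URL = ("jdbc:mysql://server:3306/wikisdb"
            "?autoReconnect=true&useUnicode=true&characterEncoding=utf8")

# escape() replaces &, < and > with their XML entities.
escaped_url = escape(JDBC_URL)
print(escaped_url)

# Round trip: an XML parser should hand back the original URL.
element = ET.fromstring(f"<property>{escaped_url}</property>")
round_tripped = element.text
print(round_tripped == JDBC_URL)
```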

Config Files in WEB-INF/classes

Just a side note: if you are building confluence from source then these files can be found at confluence/confluence-project/conf-webapp/src/main/resources/.

These files are the most cumbersome to work with because you need to apply your changes to them after each upgrade. I’ll describe how we use our automated patching machinery to do this in a future chapter of this guide. For now let’s just go over the available config files and what you can change here.

atlassian-user.xml – used to configure user provisioning, e.g. LDAP. For more info read the docs.

confluence-init.properties – this file allows you to specify the path to Confluence Home directory. There is a better way to set this; see the JVM Options section below.

log4j.properties – modifies logging preferences. This can also be done via the UI, but AFAIK the changes are not preserved after a restart or upgrade.

seraph-config.xml – controls authentication framework. You’ll likely need to modify this file if you have a custom authenticator and login page.

I should note that there are many other (usually xml) configuration files bundled with individual jars in WEB-INF/lib, but those rarely need to be modified.

JVM Options

Another way to configure certain settings is via JVM options. From the complete list of recognized options these are the ones we use:

-Dcom.atlassian.user.experimentalMapping=true – this is a critically important setting for us with 180k users. Without it, our cluster panics due to data overload (CONF-12319). Unfortunately, despite Atlassian’s claims that this experimental feature is production ready, it got broken soon after release, and then again recently, so you’ll have to patch the atlassian-user module to get it to work.

-Dconfluence.disable.peopledirectory.anonymous=true – for big public deployments the people directory is a privacy risk and generally useless for anonymous users, so we have it disabled for anonymous users.

-Dconfluence.disable.mailpolling=true – early on we decided that we don’t want people to build up mail archives on our site. While the feature is useful for small internal wikis, it’s too much of a risk with little reward to provide it on a public wiki. Unfortunately, this option only disables mail fetching. The UI for setting up mail archives will still be present in the wiki; you’ll have to patch Confluence to remove it.

I didn’t learn about -Dconfluence.home until recently. I would much prefer to use it than to mess with confluence-init.properties file in WEB-INF/classes.

Admin UI

Most of the Confluence settings can be configured via the Confluence admin interface. The downside is that the configuration is not versioned, and there is no easy way to see diffs or to roll back unless you want to hack the db and replace data from backups. With that in mind, let’s look at the most important settings.

General Configuration

Server Base Url – make sure this is set up correctly, otherwise Confluence and its plugins won’t work properly.

Users see Rich Text Editor by default – we have this set to off. In the past many RTE bugs were causing headaches to our writers especially those who did lots of editing. In Confluence 3.2 and 3.3 the editor has improved a lot and it might be the time for us to reconsider this decision.

CamelCase Links – this used to be one of THE wiki features in general a few years ago, but as wikis have matured and people started creating more and more content, the automatic linking started to cause more problems than help. We have it off.

Threaded Comments – very useful; make sure it’s on.

Remote API (XML-RPC & SOAP) – we have ours on, but I patched the remote api code to restrict access to it.

Compress HTTP Responses – OMG please turn this on if it isn’t already. It’s a major performance booster. Alternatively you might want to do the compression in your webserver as Tim pointed out in comments below.

JavaScript served in header – we have this on, but for better performance it should be off. Unfortunately that breaks many plugins and legacy code that uses obtrusive javascript. Since this option has been around for a while, it might be worth it to just set it to off and deal with the remaining broken things as they are identified.

User email visibility – we have this set to visible to admins only, but our power users found it to be a collaboration barrier, so I patched the code and made emails visible to our global employees group in addition to the admin group. It would be nice if Confluence allowed such a configuration out of the box.

Anonymous Access to Remote API – No sane person will leave this on. If I were in charge, I would go as far as removing it from Confluence product.

Anti XSS Mode – This is a very handy feature. Not 100% bulletproof, but it helped to significantly decrease the number of XSS exploits in Confluence since its introduction.

Attachment Maximum Size (B) – I mentioned this one already in the first chapter when discussing the db configuration. If you are running a cluster (or think that you will eventually run it), set this to some low value. Ours is 5MB.

Connection Timeouts – these options are pretty handy when you have lots of feed macros, gadgets and other plugins that pull content from remote sites. In order to prevent worker thread pileup in your servlet container, don’t go beyond the default 10sec (which is already pretty high).

Daily Backup Administration

As I previously mentioned, this backup feature is useless for anything but tiny sites. Disable it.

Manage Referrers

Collecting referrers is ok, but don’t display them publicly if you run a site on the Internet. Otherwise you run a risk of exposing some internal only URIs that might contain confidential information.

Languages

Most of our documentation and content is written in American English, but unfortunately Atlassian doesn’t provide such a language pack. I just patch the default Australian English pack to get a US English pack. It works great and is almost no hassle to maintain.

User macros

I discourage their use in enterprise environments. The lack of versioning, automated testing and documentation makes them a nightmare to maintain. Just create Confluence plugins for everything you need.

PDF Export Language Support

This is a tricky one. It took us quite a while to find the right single font that could be used to generate PDFs in almost all languages. Finally we found soui_zhs.ttf, which is distributed with OpenOffice. It’s a huge file, but it works like a charm for all kinds of non-Western languages.

Themes

For reasons I’ll discuss later, we disabled all the themes except for our custom one, which is the global and default space theme. To disable a theme you have to go to plugins view and disable the appropriate theme plugins.

Cache Statistics

The name of this section in the UI is misleading, because not only can you view cache statistics here, but more importantly you can fully control the cache size via the UI. And in this case, I’m really glad that there is a UI to manage the cache config xml file, which due to its size is really hard to work with by hand. The changes you make via the UI are persisted in the Confluence Home directory and propagated throughout the cluster.

Out of all the things you can tune via the admin UI, the cache tuning will have the biggest impact on your site’s performance. Confluence ships with cache settings optimized for smaller sites, so increasing the cache size is unavoidable for larger deployments.

Tuning the cache settings is a time-consuming process because you need to balance the memory consumption with performance improvements. Usually I revisit the cache stats once a month and look for caches that are performing badly because the number of objects allowed in that particular cache is too low. The Confluence caching system is composed of many caches that are controlled via this UI.

The best indicator of an overflowing cache is when the “Effectiveness” value is low (under 70-80%) AND “Percent Used” value is high (over 80%) AND usually the “Expired” value will be relatively high compared to “Hit” value in the same cell. This means that Confluence needs to go to the DB too often, even though it could cache the data in memory if the cache was bigger.
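The indicator above is easy to turn into a filter if you copy the numbers out of the Cache Statistics screen. The sketch below is only my reading of that rule of thumb: the CacheStats fields, the 75% cutoff (picked from the middle of the 70-80% range) and the expired-to-hit ratio of 0.5 are assumptions for illustration, and the sample rows are made up.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    name: str
    effectiveness_pct: float  # "Effectiveness" column
    percent_used: float       # "Percent Used" column
    hits: int                 # "Hit" value
    expired: int              # "Expired" value


def looks_undersized(stats: CacheStats,
                     max_effectiveness: float = 75.0,
                     min_used: float = 80.0,
                     min_expired_ratio: float = 0.5) -> bool:
    """Low effectiveness AND high usage AND many expirations relative to hits."""
    if stats.effectiveness_pct >= max_effectiveness:
        return False
    if stats.percent_used <= min_used:
        return False
    if stats.hits == 0:
        return stats.expired > 0
    return stats.expired / stats.hits >= min_expired_ratio


rows = [
    CacheStats("hypothetical cache A", 55.0, 95.0, hits=1000, expired=800),
    CacheStats("hypothetical cache B", 92.0, 95.0, hits=1000, expired=50),
]
for row in rows:
    print(row.name, looks_undersized(row))
```

Anything flagged this way is a candidate for a bigger size, not an order to increase it; the heap monitoring advice that follows still applies.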

If you don’t understand what all the cache names and numbers mean, don’t worry about that too much. As long as you don’t make any dramatic changes too quickly and you monitor your JVM heap usage, you can’t break anything.

As you increase the cache sizes, you’ll eventually start running out of heap space. That’s why you need to monitor the JVM and increase the -Xmx value as needed. If the number of concurrent users increases, you might also need to slightly increase the -Xmn value (see the JVM Tuning chapter for more info).

I wish Atlassian would provide better descriptions for all the available caches, because unless you know Confluence internals well, you won’t know what you are doing and that doesn’t feel good. Additionally, I’d like to see a way to limit memory usage, not the number of objects, because their size varies. Ideally, I’d really like to be able to just say “Use 3GB of memory for cache and distribute it in the most efficient way. Oh and let me know if you need more or less memory to work effectively”. It would be better if Atlassian moved away from an in-process cache which in my opinion is not a good fit for Confluence. Maybe we’ll get there one day.

Plugins

This section of the Admin UI is where you can install, uninstall, enable and disable plugins and their modules. There is also a Plugin Repository which additionally allows you to install plugins from Atlassian’s remote servers or user specified URIs. The recently released Atlassian Universal Plugin Manager will eventually replace the latter one (or both?). I’m glad to see that happening.

I suggest that you disable plugins that you don’t use or don’t want your users to use as soon as possible. We disabled all the bundled themes because we wanted to provide users with only one custom theme developed and maintained by us (I’ll explain the reasoning in a future chapter). For security reasons the html and html-include macros should in my opinion be disabled on all but family Confluence deployments. And for performance reasons the Confluence Usage Stats plugin is not suitable for any bigger deployment.

Plugin installation is very easy to do. That’s both good and bad. The plugin framework provided by Confluence is a very sophisticated piece of software which allows you to install and uninstall plugins on the fly without any need to restart the server. Need to quickly install a fixed version of a buggy plugin without disturbing hundreds or thousands of users that are currently using your site? Done. That’s how easy it is.

On the other hand, it is tempting to install plugins just because they have cool names or promise great features. You can do that in your dev or test environment, but in production you should only install plugins that you picked after some serious consideration.

This is what I look for when deciding whether to install a plugin or not:

  • was the functionality provided by the plugin requested by larger group of users or is the plugin needed for site administration purposes?
  • was the plugin developed and tested in-house? if not, is it supported by Atlassian? if not, can we or some respectable Atlassian partner support it should there be problems?
  • is the plugin compatible with our confluence version? does it have a track record of being compatible or was it made compatible with new Confluence versions as they were released?
  • are there no major unresolved bugs in the areas of performance, scalability, data integrity and security?
  • does the plugin have an automated test suite with good test coverage?

If you answer “yes” to all of these questions, then you may go ahead and do a trial before installing the plugin in production. Otherwise, you might provide your feedback to the plugin authors and wait to see if the pending issues get resolved before proceeding.
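If you review plugins often, it can help to record the answers to the questions above in a consistent form. A minimal sketch follows; the criterion keys and function names are made up for this example, and the answers are whatever you decide after actually checking.

```python
from typing import Dict, List

# Short keys for the five questions in the checklist above (keys are hypothetical).
CHECKLIST = [
    ("requested", "requested by a larger group of users or needed for administration"),
    ("supported", "developed in-house, or supported by Atlassian, us or a partner"),
    ("compatible", "compatible with our Confluence version, with a good track record"),
    ("no_major_bugs", "no major performance, scalability, integrity or security bugs"),
    ("tested", "has an automated test suite with good coverage"),
]


def unmet_criteria(answers: Dict[str, bool]) -> List[str]:
    """Return the keys answered 'no' (a missing answer counts as 'no')."""
    return [key for key, _description in CHECKLIST if not answers.get(key, False)]


def ready_for_trial(answers: Dict[str, bool]) -> bool:
    """Only a 'yes' to every question clears the plugin for a trial."""
    return not unmet_criteria(answers)


answers = {"requested": True, "supported": True, "compatible": True,
           "no_major_bugs": False, "tested": True}
print(ready_for_trial(answers))   # False
print(unmet_criteria(answers))    # ['no_major_bugs']
```

The list of unmet criteria doubles as the feedback you can send to the plugin authors.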

I don’t want to be harsh, but 2-3 years ago in particular, most of the plugins created for Confluence were crap. As the platform matures and Atlassian partners get more involved, the quality of available plugins has been slowly increasing. The main issue that I see is that the existing plugins are not developed and tested with large-scale deployments in mind. Hopefully things will change as more and more deployments grow beyond small and medium sites. It’s unfortunate that even some commercial plugins suffer from the very same issues that plague plugins created by a bunch of volunteers and enthusiasts. So pick your plugins carefully, do a trial, check for unresolved bugs and existing user complaints, and then decide.

I’ve been reasonably active in the Atlassian development community and from these interactions, I’d like to highlight the work done by Dan Hardiker (Adaptavist) and Roberto Dominguez (Comalatech). And though I haven’t worked with guys from CustomWare, they are also considered to be pretty sharp.

Be especially careful with plugins that provide new macros for the wiki content. Once you install such a plugin you won’t be able to uninstall it without breaking wiki pages until all the references to that macro are removed (with tens of thousands of pages and no ability to track the references this might be a big challenge).

In general, however, try to keep the number of plugins low. It’s better for performance, and you won’t get into trouble as often when you need to upgrade Confluence but some of the plugins you use are not compatible with the new Confluence version.

Conclusion

You should now have a good idea about how to configure Confluence and where this configuration is done. In the next chapters we’ll look at upgrading Confluence, patching and more.

July 30, 2010: I Always Thought It Was An Animal Native To The Rain Forest

Book Status

Beta 5 came out on Wednesday. Currently trying to figure out how to structure the Shoulda chapter in light of the direction that project has gone in since I wrote about it for the Lulu book.

Friday Links

One significant change in Rails 3 is that, because of the way Bundler works, the code for your gems is not part of the project. And if you are using RVM, each project might have a different gemset, and different directory to find those gems. Brian Cardarella has a simple script that will open a new tab in your terminal window and take it to the gem directory for the current project. OS X only, because it uses AppleScript. I will use this.

Mike Burns from Thoughtbot gives us a just-so story for the digital age, How grep got its name. I always thought it was the sound you make when you try and figure out how to use it, “Oh Grep!”

Derek Kastner of Brighter Planet has an interesting look at how to use more advanced features of Bundler to manage gem dependencies when building a gem and creating the gemspec. Definitely something I would not have figured out on my own.

Matt Aimonetti, after doing a little Ruby memory quiz/rant on Twitter last night has published a longer blog post about Ruby’s object allocation. This is interesting, and makes me wonder if it would be possible to build a Ruby runtime optimized for long-running processes. Still, make it work, make it right, make it clean, only then make it fast — it’s much easier to optimize clean code.

Louis Rose has a short snippet or three on using Timecop and Chronic to manage time-based Cucumber scenarios. Read through to the updates to avoid a couple of gotchas. Chronic, by the way, is one of my favorite gems to use in projects, because clients often like the demo of being able to type in “next Tuesday” in a date field.

Finally, Yehuda Katz has what is maybe the first “I switched to Vim” story that makes me actually think about switching to Vim. Seems like a useful approach and set of tips.

Filed under: Bundler, Cucumber, Ruby, Time, unix, Vim, Yehuda

Dreamworks Big Into Cloud Computing

Future animations from Dreamworks SKG will not be rendered on its own servers anymore, but outsourced to Cerelink, the studio announced. The company said that it will use Cerelink’s supercomputing infrastructure at the New Mexico Applications Centre (NMCAC). The advantage over its own supercomputers is the availability of supercomputers on demand, a feature that is commonly referred to as elastic cloud computing.

Interesting breakage in Rails 3 RC

My latest project, UploadJuicer, is running on Rails 3, and I’m loving it. Back in the Rails 3 beta 1 days, there were still a lot of rough edges, but beta 4 has been great so far. Until I wanted to upgrade UploadJuicer to Rails 3 RC (1). After the upgrade, I got this:

ActionController::RoutingError (uninitialized constant ApplicationController::AuthenticationHelpers)

AuthenticationHelpers happens to be the first module I include from ApplicationController, and it lives in lib/authentication_helpers.rb. After a bit of head-scratching as to why something that has worked for ages and ages in Rails suddenly stopped working (and in between a beta and an RC release, to boot!), I remembered this change in config/application.rb from running rake rails:update:


# Custom directories with classes and modules you want to be autoloadable.
# config.autoload_paths += %W(#{config.root}/extras)

That looked promising, so I uncommented it, changed extras to lib, and bam, problem solved. Life on the edge is an adventure. 🙂
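For readers who want the fix spelled out, here is a minimal sketch of what the uncommented line does. The AppConfig struct and the /path/to/app root are stand-ins invented so the snippet runs outside a Rails app; in a real project the line sits inside the application class in config/application.rb and config is Rails' own object.

```ruby
# Hypothetical stand-in for Rails' config object, only so this runs standalone.
AppConfig = Struct.new(:root, :autoload_paths)
config = AppConfig.new('/path/to/app', [])

# The generated line, uncommented and pointed at lib instead of extras.
# %W splits on whitespace and interpolates, so this appends one path.
config.autoload_paths += %W(#{config.root}/lib)

puts config.autoload_paths.inspect
```

With lib on the autoload path, a constant such as AuthenticationHelpers can again be resolved from lib/authentication_helpers.rb by file name.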

#126: Jason’s Helper

In this episode, Jason and Dan congratulate the Rails core team for their fantastic release timing (right before we air instead of after).

Clojure: A Chat with Andrew Boekhoff

In this brief interview, Satish Talim of RubyLearning talks to Andrew Boekhoff, author of CongoMongo, a toolkit for using MongoDB with Clojure.

Satish>> Welcome Andrew and thanks for taking out time to share your thoughts. What programming languages have you used seriously?

Andrew>> Seriously: Ruby and Clojure. Less Seriously: C, C++, Java and now: Haskell, Scheme.

Satish>> Why and when did you decide to start working on Clojure?

Andrew>> I’ve been using Clojure for a little over a year. I had read Paul Graham’s essays, so I wanted to try a lisp dialect. I also wanted to learn what functional programming was all about. Then I watched Rich Hickey’s presentations on Clojure and by that point I was pretty much sold.

Satish>> Could you name three features of Clojure that you like the most, as compared to other languages? Why?

Andrew>>

  1. Immutability: Using immutable locals and data structures as the default eliminates a huge class of potential errors. I’ve never written as much code that worked on the first try in any other language. Concurrency is often mentioned as a great benefit from pervasive immutability — and it certainly is — but for me, the net reduction in complexity is what I love most.
  2. It’s a Lisp: It has Macros: Whether it’s for shearing off boilerplate, or embedding a parser for an internal DSL, the ability to easily extend the syntax of the language is a uniquely expressive trait of the lisp family.
  3. The immense practicality of the JVM: By being hosted on the JVM, Clojure comes with batteries-included and can be deployed anywhere that Java can (almost anywhere).

Satish>> You have written a Clojure wrapper (congomongo) for the mongo-db java api. Can you tell us more about this wrapper? Also, why did you target MongoDB?

Andrew>> I really like working with MongoDB. The combination of schema-less document storage and ad-hoc queries is fantastic. The JSON format fits Clojure’s data structures well, and the mongo-java-driver is high quality and maintained. Congomongo is fairly light-weight — its main goal is to make interacting with the database from Clojure convenient and idiomatic.

Thank you Andrew. In case you have any queries and/or questions, kindly post your questions here (as comments to this blog post) and Andrew would be glad to answer.

Everyone Who Tried to Convince Me to use Vim was Wrong

A couple weeks ago, I took the plunge and switched to vim (MacVIM, to be precise). It wasn’t the first time I tried to make the switch, and I had pretty much written it off entirely.

Why? Because the past few times I tried switching to vim, I took the advice of a master vim user, and quickly sank into the quicksand of trying to learn a new tool. In every prior attempt, I gave vim a few days before I gave up. And every time, I managed to get virtually no work done the entire time, spending about 90 percent of my day fighting with my editor (a more charitable way to put it would be “learning my editor”).

Invariably, the master vim users that were helping me make the switch would encourage me to stick it out. “If you just give it a few weeks, you’ll never want to switch back.”

The trouble was, I had work to do. I could only switch editors if the new editor did not significantly impede my day-to-day work. I can already hear the responses: “That’s simply impossible. It’s a new editor designed for advanced users. You’ll just have to put up with the pain until you get used to it.”

Here’s the thing, though: I didn’t really have to put up with a huge amount of pain when switching to Textmate for the first time. In fact, it was downright pleasant.

The last few times someone tried to get me to switch to vim, I issued them a simple challenge: can you tell me a way to switch that will not significantly reduce my productivity for the first few weeks? It wasn’t a challenge that was intended to fully shut down discussion. When I really thought about it, Textmate wasn’t doing all that much for me. It was a glorified Notepad which had working syntax highlighting and understood where to put the cursor when I hit enter (most of the time).

I don’t actually use “snippets” all that often, or all that many “commands”. I don’t mind the extensibility of Textmate, but I’m not a hardcore Textmate hacker myself, meaning that I’m ok with any editor that has the same level of extensibility that Textmate has (namely, all of them).

Despite what I considered a relatively reasonable request, my challenge was met with disdain and even anger by most of the people I talked to. “If you feel that way, Vim probably isn’t for you.” “You’re learning a new EDITOR for God’s sakes. Of COURSE there’s going to be a learning curve.”

I had written off the entire sorry affair.

A few weeks ago, Carl told me that he was playing with Vim. His explanation was that he had seen a number of people be really productive with it, and he was curious. Carl is definitely willing to put up with more pain to learn something new than I am, so I issued the same challenge to him.

Perhaps because he wasn’t steeped in hardcore vim hacker lore, he didn’t angrily dismiss the entire premise of my question. Thinking about it a bit more, I realized that most of the people who had tried to get me into vim had suggested that I dive in head first. “First thing: turn off the arrow keys.” “Don’t use the mouse. Force yourself to use the keyboard.”

Carl convinced me to use vim for the first couple of days pretty much exactly as I use Textmate (with the exception of having to switch between normal and insert modes). I installed NERDTree on MacVIM, grabbed the most common vim “packages”, and was off to the races. (I should note that I installed topfunky’s PeepOpen, which definitely helped with a very common workflow that I find hard to live without).

For the first day, I clunked around by using my mouse’s scroll wheel, clicking and highlighting things, and spending most of my time in insert mode. It was slightly less productive than Textmate, but mostly in the range of what I’d expect switching to a new tool. In short, while I felt a bit out of sorts, I was able to get plenty of work done that first day.

As the days went on, I learned a few commands here and there. The first big one for me was ci as in ci " (it means: replace what’s inside the next set of " and go into insert mode). This singlehandedly made up for most of the productivity losses I was feeling from learning a new tool. Throw in o, O, A, :N and /search and I was already quite a bit more productive than I had been in Textmate.

Sure, I’m still plodding around in some cases, but only a handful of days later, using Textmate for anything feels clunky (most commonly, I try to use o or O to insert a new line above or below the one I’m currently on).

I was able to get here because I used my mouse wheel and button, arrow keys, apple-f to find text, apple-s to save files, and a whole slew of other common idioms, instead of grinding to a halt and trying to switch all of my practices at once.

To those who would say “that’s obvious; of course you learn vim incrementally”, I would simply say that having spoken to a number of vim users in the past, I never got that advice. Instead, I got a lot of advice about turning off my arrow keys, disallowing the use of the mouse, and learning the (MORE EFFICIENT!!!) vim ways to do everything, all at once. People just couldn’t stomach the idea of me continuing to use an outmoded practice (like apple-f) when vim had much better tools available just a (huge volume of) memorization away.

To those who are considering using vim, my recommendation is to use MacVIM, NERDTree, PeepOpen (or command-t), and use the mouse, arrow keys, and familiar OSX’isms all you want. Very quickly, it will become obvious that there’s a better way to do all kinds of things, and you can pile on the newly found efficiency once you’ve successfully made the switch without losing the ability to do work in the short-run.

Michael Hartl’s Rails 3 Tutorial Book

The Ruby on Rails Tutorial: Learn Rails by Example (a.k.a. railstutorial.org) by Michael Hartl has become a must read for developers learning how to build Rails apps. Michael has put together a great Rails 2.3 tutorial, releasing it all for free online chapter by chapter. Now, Michael’s going three steps further:

1 — A new, Rails 3.0 focused version. The free online book previously covered Rails 2.3 but Michael’s updated it to cover Rails 3.0 too. He’s also selling it as a DRM-free PDF for $39 (you get a PDF of the Rails 2.3 version too). As a gesture of goodwill to Ruby Inside’s readers, he’s made a coupon code that works till the end of August – it’s rubyinside01 and gets you 20% off (so a total of $31.20 in the end).

2 — Creative Commons licensing of the existing online text. Like all of us, Michael needs to make some money, but a side benefit is that he’s making the existing Rails 2.3 focused text Creative Commons licensed! This will allow you to distribute it, translate it, put snippets on your blog, and so forth. (Update: Michael notes that this is sort of in the air at the moment pending more resources. He has more preparation to do to make this work properly, but the spirit is there.)

3 — A print book, published by Addison-Wesley. A print edition, Ruby on Rails 3 Tutorial: Learn Rails by Example, is due out in the fall as part of the Professional Ruby Series (the same series as The Rails 3 Way by Obie Fernandez), and is currently available for pre-order at Amazon.

As a bit of a “geek aside”, the Ruby on Rails Tutorial book is written using PolyTeXnic, a pure-Ruby markup system built on top of the LaTeX typesetting system. PolyTeXnic converts a select subset of LaTeX to HTML, while also producing PDFs via the pdflatex command.

In addition to supporting code-heavy programming books such as Ruby on Rails Tutorial, PolyTeXnic can also produce mathematical documents; see, for example, Michael Hartl’s anti-pi propaganda piece called The Tau Manifesto. Michael hopes to release PolyTeXnic as an open-source project some time later this year. I keep nagging him about it.

July 28, 2010: Mathematical Navels

Book Status

Beta 5 still in progress. Probably today. No other news to report.

And In Other News, My Navel Is Still There

It’s been a little more than three months since I started doing these more-or-less daily blog posts here, which is far and away the longest I’ve ever sustained daily blogging.

The original idea of this was that it was going to be my daily standup for the Rails Test Prescriptions book, which would force me to do something on the book almost daily since I’d be reporting on it. Given that this was meant to be a stand-up, three months seems like a reasonable amount of time to have an agile retrospective, right? The secondary goal was as a place for potential book readers to go to learn about the book, get a sense of whether I have anything interesting to say, all that standard author stuff.

Retrospect Away

  • It’s been useful to me to have this place to mention book progress, and I do think it’s helped me keep momentum.
  • I started doing the link posts because I erroneously thought that a previous source of link posts had stopped. I like doing it, though I’ve recently been a bit more careful about only putting a link up if I have more than a couple of sentences worth of stuff to say about it. To some extent, this has led to fewer posts, which partially defeats the purpose.
  • I always wish I had more time to spend on these posts.
  • Traffic is still rather low, although it’s growing somewhat slowly. I’m a strikingly bad self-marketer, but there’s probably something I could do to improve traffic. (Better content!)

Okay, one link. Well, two

Kevin Kelly is keeping a list of nominations of the best magazine articles of all time. It’s striking how many David Foster Wallace articles there are — he’s one of my all-time favorite non-fiction writers (ironic, since he’s primarily known for fiction). You can always tell when I’ve been reading Wallace, because the number of footnotes and meta-commentary in my writing goes way up. If you’ve never read Wallace, math fans might like Everything and More, which is a (really) long essay on infinity. It’s digressive, filled with meta-writing about how he’s trying to explain stuff, and I think it’s pretty awesome. Though I note from the Amazon reviews that a lot of people like it less than I do.

While I’m in the neighborhood, I saw a link yesterday to a book called Street Fighting Mathematics, by Sanjoy Mahajan. Haven’t read it yet, but it’s based on an MIT course in quick and rough math problem solving, which sure sounds like it’d be useful.

Filed under: Math, Meta

Mailman – Like Sinatra for E-mail

http://blog.titanous.com/post/867488976/mailman-released (or on Ruby Inside)

Mailman is an incoming email processing microframework. You point it at a source of email, such as a POP3 account or a Maildir, and it will execute routes based on the messages that come in.

For instance if you had a ticketing system, and wanted to add replies via email to the database, this application would be a good start:

require 'mailman'

Mailman.config.maildir = '~/Maildir'

Mailman::Application.new do
  to 'ticket-%id%@vipsupport.com' do
    Ticket.find(params[:id]).add_reply(message)
  end
end

Jonathan Rudenberg

This is very nice. I love these microframeworks. The Sinatra style is always good to ape.

Exploiting Enterprise Software (like Hazelcast) with JRuby

http://cogitations.arbia.co.uk/post/863857348/distributed-ruby (or on Ruby Inside)

Due to the nigh insurmountable work of Charles Nutter, Thomas Enebo, Ola Bini and Nick Sieger along with their team we have direct access to Java libraries and thus to a plethora of usefulness. Sometimes I think we forget how lucky we are, the Ruby community, to have such awesome people simplifying our lives, anyway, that’s quite enough arse kissing. So, on with the show…

Anthony Buck

In Distributed Ruby – Exploiting Enterprise Software, Anthony riffs on using JRuby as a way for Rubyists to “exploit” robust, enterprise-grade libraries. This isn’t a new topic but his demonstration is particularly compelling (and worth scrolling down for).

Hazelcast is an “in memory data grid” that’s fail-safe (with regard to crashes) and that automatically scales across the number of nodes within the system. Anthony’s code works fine out of the box with JRuby 1.5.1 running under RVM (yep, I tested) and demonstrates what is a particularly powerful Java library I’d never heard of before. It’s excellent we have access to this from Ruby, and Anthony’s right – the JRuby team deserves credit for giving Rubyists opportunities to both “exploit” enterprise technologies and to deploy systems in enterprise environments without having to jump ship with Ruby.

Government Approves Cloud Apps

If there is one concern about cloud applications that we hear of these days, then it certainly is the security of cloud apps. Where is the data stored? How secure is the environment? Can it be as secure as if you were to store the data locally? Of course it can, and we invite you to contact us for details on how secure your data is. However, Google has some news that will change perceptions.

July 27, 2010: No Rails Release Shall Escape My Sight

Book Status

Beta 5 should be out today, with the legacy and the Rails 3 chapters. Next up are the Shoulda and RSpec chapters, starting with figuring out how to handle the changes in Shoulda since I last wrote the chapter.

Rails

I’m sure all of you within the interest circle of this blog already know that Rails 3.0 RC 1 was released yesterday. Part of me wants to say “finally”, but that really isn’t fair. Doesn’t look like there are dramatic changes from Beta 4, but check out the release notes.

And another thing

So Larry Doyle, author of the Go Mutants! book that I reviewed yesterday, re-tweeted me this morning, which I suppose means he read the review. That kind of thing always surprises me more than it should, given how often I Google my own name…

Links

Okay, nobody else is going to care, but here is Ryan Reynolds from ComicCon reciting the Green Lantern Oath for a small fan during the panel discussion of the upcoming movie. Green Lantern was, somehow, my favorite character when I was a kid, and it’s great to see Reynolds do the line without a hint of ironic winking.

37Signals posts some information on their production database setup. This kind of thing is incredibly useful, but for obvious reasons kind of hard to come by. So, thanks.

Thoughtbot announced the release of Flutie, which is a “not CSS framework” distributed as a Rails engine. It seems like a set of intelligent CSS defaults that can be used to make something look good by a non CSS-guru developer, but which still allows a CSS expert to use a site layout framework on top. Looks helpful.

This is a couple of months old from Takaaki Kato, but it’s a nice series of TextMate tips for Rails developers. I’d add that I’ve always had trouble with ProjectPlus — it’s always had performance problems for me. A very handy set of tips, though.

Filed under: 37signals, CSS, Geek Out, Rails 3

DGC II: The JVM Tuning

This blog post is part of the DevOps Guide to Confluence series. In this chapter of the guide, I’ll be focusing on JVM tuning with the aim to make our Confluence perform well and operate reliably.

JDK Version

First things first: use a recent JDK. Java 5 (1.5) was EOLed 1.5 years ago, so there is absolutely no reason for you to use it with Confluence. As George pointed out in his presentation, there are some significant performance gains to be made just by switching to Java 6, and you can get another performance boost if you upgrade from an older JDK 6 release to a recent one. JDK 6u21 is currently the latest release and that’s what I would pick if I were to set up a production Confluence server today.

If you are wondering about which Java VM to use, I suggest that you stick with Sun’s HotSpot (also known as Sun JDK). It’s the only VM supported by Atlassian and I really don’t see any point in using anything else at the moment.

Lastly, it goes without saying that you should use the -server JVM option to enable the server VM. This usually happens automatically on server grade hardware, but it’s safer to set it explicitly.

VM Observability

For me using JDK 6 is not just about performance, but also about observability of the VM. Java 6 contains many enhancements in the monitoring, debugging and probing arena that make JDK 5 and its VM look like an obsolete black box.

To mention just some of the enhancements: the amount of interesting VM telemetry data exposed via JMX is amazing. Just point VisualVM at a local Java VM to see for yourself (no restart or configuration needed). Be sure to install the VisualGC plugin for VisualVM. In order to allow remote connections you’ll need to start the JVM with these flags:

-Dcom.sun.management.jmxremote.port=some_port
-Dcom.sun.management.jmxremote.password.file=/path/to/jmx_pw_file
-Djavax.net.ssl.keyStore=/path/to/your/keystore
-Djavax.net.ssl.keyStorePassword=your_pw

Unless you make the port available only on some special admin-only network, you should password protect the JMX endpoint as well as use SSL. The JMX interface is very powerful and in the wrong hands could result in security issues or outages caused by inappropriate actions.

For more info about all the options available read this document.

In addition to JMX, on some platforms there is also good DTrace integration which helped me troubleshoot some Confluence issues in production without disrupting our users.

And lastly there is BTrace, which allowed me to troubleshoot a nasty Hibernate issue once. It’s a very handy tool that, as opposed to DTrace, works on all OSes.

I can’t stress enough how important continuous monitoring of your Confluence JVMs is. Only if you know how your JVMs and app are doing can you tell whether your tuning has any effect. George Barnett also has a set of automated performance tests which are handy to load test your test instance and compare results before and after you make some tweaks.

Heap and Garbage Collection Must Haves

After upgrading the JDK version, the next best thing you can do is to give Confluence lots of memory. In the infrastructure chapter of the guide, I mentioned that you should prepare your HW for this, so let’s put this memory to use.

Before we set the heap size, we should decide between a 32-bit and a 64-bit JVM. A 64-bit VM is theoretically a bit slower, but allows you to create huge heaps. A 32-bit JVM has its heap size limited by the available 32-bit address space and other factors. 32-bit OSes will allow you to create heaps of only up to 1.6-2.0 GB. 64-bit Solaris will allow you to create 32-bit JVMs with up to a 4 GB heap (more info). For anything bigger than that you have to go 64-bit. It’s not a big deal if your OS is 64-bit already. The option to start the VM in 64-bit mode is -d64. On almost all platforms the default is -d32.
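The rule of thumb above can be sketched as a small helper. The constant names and the method are made up for illustration, the limits are just the rough figures quoted in the paragraph (using the conservative 1.6 GB end of the range), and real limits depend on the OS and JVM build.

```ruby
# Rough 32-bit heap ceilings from the text, in megabytes (assumed values).
LIMIT_32BIT_MB            = 1638  # ~1.6 GB on a 32-bit OS
LIMIT_32BIT_SOLARIS64_MB  = 4096  # 32-bit JVM on 64-bit Solaris

# Returns the data model flag a heap of the given size would need.
def data_model_flag(heap_mb, solaris_64bit_os: false)
  limit = solaris_64bit_os ? LIMIT_32BIT_SOLARIS64_MB : LIMIT_32BIT_MB
  heap_mb <= limit ? '-d32' : '-d64'
end

puts data_model_flag(1024)                          # small heap fits in 32-bit
puts data_model_flag(3584, solaris_64bit_os: true)  # still 32-bit on Solaris
puts data_model_flag(6144)                          # 6 GB needs 64-bit
```

Treat the result as a starting point for testing rather than a guarantee that the JVM will actually be able to reserve that much address space.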

Before I go into any detail, I should explain what the main objectives of heap and garbage collection tuning for Confluence are:

  • heap size – we need to tell JVM how much memory to use
  • garbage collector latency – garbage collection often requires that the JVM stop your application; these GC pauses are often invisible, but with large heaps and under certain conditions they might become very significant (30-60+ seconds)

Additionally we should also know a thing or two about how Confluence uses the heap. The main points are:

  • Objects created by Confluence and stored on the heap generally fall into three categories:
    • short-lived objects – life-cycle of these is bound to a http request
    • medium-lived objects – usually represent cache entries with shorter TTL
    • long-lived objects – represent cache entries with big TTL, settings and infrastructure objects (plugin framework, rendering engine, etc), cache entries taking most of the space.
  • Confluence creates lots of short-lived objects per request
  • Half or more of the heap will be used by long-lived cache objects

By combining our objectives with our knowledge of Confluence’s heap profile, our tuning should focus on providing enough heap space for the application to have space for the cache, short-lived objects, as well as some extra buffer. Given that long-lived objects will (eventually) reside in the old generation of the heap, we want to avoid promoting short-lived objects there, because otherwise we’ll need to do massive garbage collections of the old generation unnecessarily. Instead we should try to limit promotion from the young generation to only those objects that will likely belong to the long-lived category.

We’ll also need to figure out how much heap you need to use. Unfortunately there isn’t an easy way to find this out, except for some educated guessing and trial & error. You can also read this HW Requirements document from Atlassian that can give you an idea about some starting points. I believe we started at 1GB, but over time went through 2GB, 3GB, 3.5GB, 4GB, 5GB all the way to 6GB.

The Confluence heap size depends on the number of concurrent users and the amount of content you have. This is mainly because Confluence uses a massive (well, in our case it is) in-process cache that is stored on the heap. We’ll get to Confluence and cache tuning in a later chapter of this guide.

So let’s set the max heap size. This is done via -Xmx JVM option:

-Xmx6144m 
-Xms6144m

The additional -Xms parameter says that the JVM should reserve all 6GB at startup — this is to avoid heap resizing which can be slow, especially when dealing with large heaps.

The rest of the heap settings in this post are based on a 6GB heap size, so you might need to adjust them for your total heap size.

The next JVM option is -Xmn, which specifies how much of the heap should be dedicated to the young generation (you should read up on generational gc if you don’t know what I’m talking about). The default is something like 25% or 33%; I set the young generation to ~45% of the entire heap:

-Xmn2818m
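To adapt the split to a different heap size, the arithmetic can be worked through as below. This is only a sketch: the helper names are invented for illustration, and the 4096 MB heap is an arbitrary example rather than a recommendation.

```ruby
HEAP_MB  = 6144  # from -Xmx6144m / -Xms6144m
YOUNG_MB = 2818  # from -Xmn2818m

# Share of the whole expressed as a percentage, rounded to one decimal.
def percent_of(part, whole)
  (part.to_f / whole * 100).round(1)
end

# Builds an -Xmn flag that gives the young generation the requested share.
def xmn_for(heap_mb, young_share_percent)
  "-Xmn#{(heap_mb * young_share_percent / 100.0).round}m"
end

young_share = percent_of(YOUNG_MB, HEAP_MB)  # => 45.9
old_mb      = HEAP_MB - YOUNG_MB             # => 3326 MB left for the old generation

puts "young generation: #{young_share}% of heap, old generation: #{old_mb} MB"
puts xmn_for(4096, young_share)              # same split on a 4 GB heap
```

Keeping the share as a percentage makes it easy to carry the same young/old balance over when the total heap grows.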

Increasing the permanent generation size is also usually required given the number of classes that Confluence loads. This is done via -XX:MaxPermSize option:

-XX:MaxPermSize=512m

Given that determining the right heap size for your environment is a non-trivial task for larger instances, especially if occasional memory leaks start consuming the precious memory, you always want to have as much data as possible to debug memory exhaustion issues. Aside from good monitoring (which I mentioned in the previous chapter) you should also configure your JVM to dump the heap when an OutOfMemoryError occurs. You can then analyze this heap dump for potential memory leaks.

Since we are dealing with relatively big heaps, make sure you have enough space on the disk (heap dumps for a 6GB heap usually take 2-4GB). I’ve had a very good experience using Eclipse Memory Analyzer to analyze these large heaps (VisualVM or jhat are not up to analyzing heaps of this size). The relevant JVM options are:

-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/some/relative/or/absolute/dir/path

Minimizing gc latency, so that users don’t have to wait several seconds for a stop-the-world (STW) gc to finish before their pages render, is a commendable thing to do, but the main reason why you want to do this is to avoid Confluence cluster panics.

Confluence has this “wonderful” cluster safety mechanism that is sensitive to any latency bigger than a few tens of seconds. In case a major STW gc occurs, the cluster safety code might announce cluster panic and shut down all the nodes (that’s right, all the nodes, not just the one that is misbehaving).

In order to be informed of any latencies caused by gc, you need to turn on gc logging. This is the magic combination of switches that works well for me:

-Xloggc:/some/relative/or/absolute/path/wikis-gc.log 
-XX:+PrintGCDetails 
-XX:+PrintGCTimeStamps 
-XX:+PrintGCDateStamps 
-XX:+PrintTenuringDistribution

Unfortunately, the file specified via -Xloggc gets overwritten when the JVM restarts, so make sure you preserve it, either manually before a restart or automatically via a restart script. Reading the gc log is also a tough job that requires some practice, and since the format varies a lot depending on your JDK version and garbage collector, I’m not going to describe it here.
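One way to automate the preservation step is to move the old log aside with a timestamp suffix right before the JVM starts. The function name preserve_gc_log, the GC_LOG variable and its default path are made up for this sketch; the path has to match whatever you pass to -Xloggc.

```shell
#!/bin/sh
# Hypothetical helper: rename an existing gc log so that the next JVM
# start does not overwrite it. Does nothing if the file is missing.
preserve_gc_log() {
    log=$1
    if [ -f "$log" ]; then
        mv "$log" "$log.$(date +%Y%m%d-%H%M%S)"
    fi
}

# Assumed path; use the same path you gave to -Xloggc.
GC_LOG=${GC_LOG:-/tmp/wikis-gc.log}
preserve_gc_log "$GC_LOG"
```

Calling this from the restart script just before the JVM is launched keeps one gc log per JVM run.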

Performance tweaks

The first performance-boosting JVM option I’d like to mention is -XX:+AggressiveOpts, which turns on performance enhancements that are expected to be on by default in future JVM versions.

If you are using a 64-bit JVM, then -XX:+UseCompressedOops will make a big difference and will virtually eliminate the performance penalty you pay for switching from a 32-bit to a 64-bit JVM.

And lastly there is -XX:+DoEscapeAnalysis, which will boost performance by another few percent.

Optional Heap and GC tweaks

To slow down object promotion into the old generation, you might want to tune the size of the survivor spaces (areas within the young generation). To achieve this, we want the survivor spaces to be slightly bigger than the default. Additionally, I want to keep the promotion rate down (objects that survive a specific number of collections in the survivor space are promoted to the old generation), so I use these options:

-XX:SurvivorRatio=6
-XX:TargetSurvivorRatio=90
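To make the effect of these two options a bit more concrete, here is a back-of-the-envelope calculation. It assumes the 2818 MB young generation set via -Xmn and the usual HotSpot layout of one eden plus two survivor spaces, where SurvivorRatio is the eden-to-survivor ratio; the function names are made up for this sketch and the results are approximations.

```shell
#!/bin/sh
# young = eden + 2 * survivor and eden = ratio * survivor,
# so each survivor space is young / (ratio + 2). Integer MB, rounded down.
survivor_mb() {
    young_mb=$1
    survivor_ratio=$2
    echo $(( young_mb / (survivor_ratio + 2) ))
}

# Part of one survivor space the JVM aims to keep occupied after a young
# collection; TargetSurvivorRatio is a percentage.
target_survivor_mb() {
    echo $(( $1 * $2 / 100 ))
}

S=$(survivor_mb 2818 6)
echo "each survivor space: ${S}m"
echo "eden: $(( 2818 - 2 * S ))m"
echo "target survivor occupancy: $(target_survivor_mb "$S" 90)m"
```

Under these assumptions each survivor space comes out at about 352 MB, of which roughly 316 MB may be filled before objects get promoted early.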

I also found that by using parallel gc for the young generation and concurrent mark and sweep gc for the old generation, I can practically eliminate any significant STW gc pauses. Your mileage might vary on this one, so do some testing before you use it in production. These are the settings I use:

-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=68
-XX:MaxTenuringThreshold=31
-XX:+CMSParallelRemarkEnabled
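For completeness, this is roughly how the optional heap and gc flags could be collected in a Tomcat-style setenv.sh. The file layout and the GC_OPTS variable name are assumptions, not something prescribed by Confluence, and the flags themselves target the older HotSpot JVMs this guide was written for, so check them against your JVM version before copying anything.

```shell
#!/bin/sh
# Sketch of a setenv.sh-style fragment (assumed layout).
# Collect the optional heap and gc flags from this section in one place.
GC_OPTS="-XX:SurvivorRatio=6"
GC_OPTS="$GC_OPTS -XX:TargetSurvivorRatio=90"
GC_OPTS="$GC_OPTS -XX:+UseConcMarkSweepGC"
GC_OPTS="$GC_OPTS -XX:+UseParNewGC"
GC_OPTS="$GC_OPTS -XX:CMSInitiatingOccupancyFraction=68"
GC_OPTS="$GC_OPTS -XX:MaxTenuringThreshold=31"
GC_OPTS="$GC_OPTS -XX:+CMSParallelRemarkEnabled"

# Append to JAVA_OPTS, keeping anything that was already set.
JAVA_OPTS="${JAVA_OPTS:+$JAVA_OPTS }$GC_OPTS"
export JAVA_OPTS
echo "$JAVA_OPTS"
```

Keeping the gc flags in their own variable makes it easy to switch them off as a group while you compare gc logs with and without them.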

Resources

The information above was gathered from years of experience as well as various sources, including the following:

Running Multiple Web Apps in one VM

Don’t do that. Really. Don’t. Bad things will happen if you do (OOME, classloading issues, etc.).

Conclusion

Your JVM should now be in good shape to host Confluence and serve your clients. In the next chapter of this guide I’ll write about Confluence configuration, tuning, upgrades and more.