AMD versus CJS. What’s the best format?
by John Hann
at 2011-10-01 12:46:23
original http://unscriptable.com/2011/09/30/amd-versus-cjs-whats-the-best-format/
JavaScript Harmony will introduce native modules into JavaScript. Finally. And when IE9 finally dies a slow death, we’ll be able to use native modules in the browser. Yay corporate support contracts. In the meantime, there are two competing standards for modules: AMD and CJS Modules 1.1.1. Which one is better? Which one should you be using?
I’ve got an opinion and it may surprise you.
What’s the difference?
If you’re not aware of either format, I suggest you start by reading about Asynchronous Module Definition and CommonJS Modules 1.1.
In a nutshell, AMD is designed for browser environments and CommonJS is targeted at server-side environments. This fundamental difference has deep repercussions on the formats. Here’s a brief (and probably incomplete) summary of the differences between these two environments:
- Scoping
- Distance
- Asynchrony
Scoping
JavaScript that runs in the browser has no way to isolate code completely. Sure, you can put your code in a function so your `var` statements don’t leak variables into the global scope, but it’s easy to make mistakes. Who hasn’t done this?
var a = b = 0;
AMD can’t efficiently solve this problem, so it doesn’t. Instead, it encourages local scoping by bringing external resources into a local function scope.
CJS arose from server-side environments, such as node, that started with a clean slate. In these environments, each file gets its own scope, so module code never runs in the global scope.
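If that `var a = b = 0;` line has never bitten you, here's a small sketch of what it does. It assumes sloppy (non-strict) mode, which is why it builds the function with the `Function` constructor; the names `leaky`, `aLeaked`, and `bLeaked` are made up for illustration.

```javascript
// Function-constructor bodies are always sloppy-mode code, so this behaves
// the same whether or not the surrounding file is strict.
var leaky = new Function('var a = b = 0; return a;');
leaky();

// `a` was declared with var, so it stayed inside the function.
var aLeaked = typeof globalThis.a !== 'undefined';
// `b` was never declared, so the assignment created a global.
var bLeaked = typeof globalThis.b !== 'undefined';

console.log('a leaked:', aLeaked, '| b leaked:', bLeaked);
```

Wrapping code in a function protects only the variables you remember to declare, which is why a format that pushes everything through function parameters helps.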
Distance
In the browser, most resources are remote, on a server somewhere. If you’re lucky, your resources are cached. Regardless, it can take time to fetch a resource. If your code depends on a function in another file, you can’t know for sure if it’s already loaded or if it may take a few seconds to arrive. You can’t write “normal” code if you don’t know if the function is defined yet!
CJS gets around this problem by assuming that all resources are local and can be fetched in milliseconds or, more likely, microseconds.
Asynchrony
A direct result of the distance issue is the need to run tasks asynchronously. Browsers run JavaScript in a single thread. If that thread were to wait for a remote resource to be fetched, nothing else could execute and the browser would appear to be unresponsive to the user.
We get around this by initiating the fetch and requesting a callback when the resource is ready. The callback is typically in the form of a function, and is typically called … wait for it .. a callback function. (JS devs are so freakin smart I tell you.)
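Here's a minimal sketch of that pattern. `fetchResource` is a made-up stand-in that fakes the network with `setTimeout`; it isn't a real browser or loader API.

```javascript
var log = [];

// Hypothetical stand-in for a network fetch: it returns immediately and
// invokes the callback later, once the "resource" has arrived.
function fetchResource(url, callback) {
  setTimeout(function () {
    callback('contents of ' + url);
  }, 10);
}

log.push('fetch requested');
fetchResource('pkgA/modA.js', function (source) {
  // This runs later, after the rest of the script has finished.
  log.push('callback received: ' + source);
  console.log(log.join('\n'));
});
log.push('thread kept running');
```

The last `log.push` runs before the callback does, which is the whole point: the thread never sits around waiting for the fetch.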
Node encourages asynchrony by emphasizing an event-driven programming model. However, the node creators punted when designing the API to fetch modules and provided the same stale File I/O APIs for fetching other resources. Personally, I think it would have strengthened the event-driven philosophy if they had extended it to modules and file-based resources.
In a nutshell, CJS fetches module resources synchronously. I guess they thought it makes sense to not introduce the complexity of asynchrony if resources are only microseconds away.
A Closer look at the formats
So how do the rival formats handle their diverse target environments? Let’s look at AMD:
// define() is a necessary wrapper.
define(
    // dependencies are specified in advance.
    ['pkgA/modA', 'pkgA/modB', 'pkgZ/modC'],
    // the module is declared within a definition function.
    // dependencies are mapped into function parameters.
    function (modA, modB, modC) {
        // inside here is the module's code.
        // the module is exported to the outside world via
        // the definition function's returned value.
        return modC ? modA : modB;
    }
);
Contrast this with CJS:
// there is no explicit wrapper, we just write the module's code.
// this is not global scope! the scope is limited to this file only.
// "free" variables (`require`, `module`, and `exports`) are
// declared to help us define our module and fetch resources.

// dependencies are specified as needed
var modC = require('pkgZ/modC');

// the module is exported by decorating the `exports` free variable.
exports.foo = modC ? require('pkgA/modA') : require('pkgA/modB');
AMD handles the asynchrony by having us declare our module inside of the definition function. The definition function is essentially a callback function. It’s called when all of our dependencies have been resolved. Some of these may require fetching. The execution of our module’s code is delayed until all of our resources are available. (Note: you can execute code earlier if you simply place it outside of the `define()`.)
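To make the "definition function is a callback" idea concrete, here's a toy registry. It is a sketch, not a real AMD loader: `toyDefine`, `resolveWaiting`, `defined`, and `waiting` are invented names, modules are named explicitly, and nothing is fetched. A dependency "arrives" when its own `toyDefine` call runs.

```javascript
var defined = Object.create(null); // module id -> exported value
var waiting = [];                  // modules still missing a dependency

// Run the definition function of any waiting module whose dependencies are
// all defined, and repeat until no further module can make progress.
function resolveWaiting() {
  var progressed = true;
  while (progressed) {
    progressed = false;
    for (var i = 0; i < waiting.length; i++) {
      var mod = waiting[i];
      var ready = mod.deps.every(function (dep) { return dep in defined; });
      if (ready) {
        var args = mod.deps.map(function (dep) { return defined[dep]; });
        defined[mod.id] = mod.factory.apply(null, args);
        waiting.splice(i, 1);
        progressed = true;
        break;
      }
    }
  }
}

function toyDefine(id, deps, factory) {
  waiting.push({ id: id, deps: deps, factory: factory });
  resolveWaiting();
}

var order = [];
toyDefine('app', ['pkgA/modA'], function (modA) {
  order.push('app');
  return 'app uses ' + modA;
});
// At this point app's definition function has not run: modA is missing.
toyDefine('pkgA/modA', [], function () {
  order.push('pkgA/modA');
  return 'modA';
});

console.log(order, defined.app);
```

Even though `app` was declared first, its definition function runs second, after its dependency shows up.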
CJS on the other hand prefers to execute the module code as soon as possible, but load/execute the module dependencies only when needed. In the sample code above, only one of ‘pkgA/modA’ or ‘pkgA/modB’ will be loaded and executed. There’s an added bonus here! Our code could be significantly more efficient if the environment doesn’t have to load and execute all of the modules.
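A quick way to see the "only when needed" behavior is a toy synchronous loader. Everything here (`toyRequire`, `factories`, `executed`, the `preferA` flag) is hypothetical and much simpler than what node actually does; it only shows that a module's code runs the first time something asks for it.

```javascript
// Each "file" is a function that decorates its own exports object.
var factories = {
  'pkgZ/modC': function (exports) { exports.preferA = true; },
  'pkgA/modA': function (exports) { exports.name = 'modA'; },
  'pkgA/modB': function (exports) { exports.name = 'modB'; }
};
var cache = {};
var executed = []; // records which modules actually ran

function toyRequire(id) {
  if (!cache[id]) {
    cache[id] = {};
    executed.push(id);
    factories[id](cache[id]); // synchronous: returns once the module has run
  }
  return cache[id];
}

var modC = toyRequire('pkgZ/modC');
var chosen = modC.preferA ? toyRequire('pkgA/modA') : toyRequire('pkgA/modB');

console.log(chosen.name, executed);
```

`pkgA/modB` never shows up in `executed`, because nothing ever asked for it.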
This is a clear win for CJS! Or is it? In the browser, there’s no way to have it both ways with this syntax. We’d either have to fetch the dependency in advance or fetch it synchronously.
(Ha ha! AMD has a solution to this problem that works for many, but not all, use cases. It’s called has.js and it has a sister plugin, called “has!”. (The exclamation mark is part of the plugin name; I didn’t append my emotion to it. Well, it is my emotion, but that’s not what I meant to convey. What I mean to say is… nm. whatevs.))
How about code size? From the example above, CJS is the winner here, for sure. … Or is it? If you were to execute that code in a browser, you’d be declaring globals left and right while clobbering variables in every other module. Clearly, we would need to at least wrap the module in an IIFE to prevent the module from running in the global scope.
Also, in the example above, I showed the advantage of delayed execution which allowed me to declare fewer variables. If you were to map all of the `require()`-ed modules to variables (and wrap the module in an IIFE), you’d end up with nearly the same number of bytes.
And the winner is….
So, if you asked me which one is the winner, I’d have to say neither. To be honest, I plan to use both formats.
AMD has some clear benefits in the browser world. For a more complete list, check out Miller Medeiros’s recent blog article on why AMD is better. I couldn’t say it any better than he did.
On the other hand, CJS advocates will argue that it’s critical to author modules in a format that doesn’t cater to the transport (more on this later). I understand and almost agree with their argument, but this article would be way too tl;dr if I were to elaborate on that argument.
I’m guessing that both formats will be around for a long, long time. So here’s what I plan to do:
- write modules that could execute in a server environment in CJS format
- write modules that could benefit from AMD’s browser-friendly features in AMD format
Duh.
One more thing…
Did I mention that AMD supports wrapped CJS modules? If you read the wikis I linked at the top of this article, I wouldn’t have to tell you that because you’d already know. Here’s what one looks like:
define(
    ['pkgA/modA', 'pkgA/modB', 'pkgZ/modC', 'require', 'exports', 'module'],
    function (modA, modB, modC, require, exports, module) {
        var modC = require('pkgZ/modC');
        exports.foo = modC ? require('pkgA/modA') : require('pkgA/modB');
    }
);
In this hybrid module, we’re prefetching modules like AMD, but using `require()` like an R-Value as in CJS. Funky! The `require` and `exports` “free” variables are brought into the modules using reserved dependency names. In this hybrid model (using CommonJS-ish terminology) AMD is the transport format and CommonJS is the authoring format. (Thanks to @tobie for reminding me of this. Repeatedly. The SproutCore guys have also given me a tongue lashing or two.)
RequireJS’s companion script, r.js, will generate these wrappers for your CJS modules.
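To give a feel for what generating a wrapper involves, here's a rough sketch. `wrapCjsSource` is a hypothetical helper, not r.js's actual code, and its regex is naive: it would also pick up `require()` calls sitting inside comments or strings. One assumption to note: it lists the reserved names first so the definition function only needs three parameters, which differs from the ordering in the hand-written example above.

```javascript
// Hypothetical helper: scan CJS source text for require('...') calls and
// emit an AMD wrapper that declares them as dependencies up front.
function wrapCjsSource(source) {
  var deps = [];
  var requireCall = /require\s*\(\s*['"]([^'"]+)['"]\s*\)/g;
  var match;
  while ((match = requireCall.exec(source)) !== null) {
    if (deps.indexOf(match[1]) === -1) {
      deps.push(match[1]);
    }
  }
  // Reserved names come first so they line up with the three parameters;
  // the scanned ids follow so the loader can prefetch them.
  var allDeps = ['require', 'exports', 'module'].concat(deps);
  return 'define(' + JSON.stringify(allDeps) +
    ', function (require, exports, module) {\n' + source + '\n});';
}

var cjsSource = [
  "var modC = require('pkgZ/modC');",
  "exports.foo = modC ? require('pkgA/modA') : require('pkgA/modB');"
].join('\n');

var wrapped = wrapCjsSource(cjsSource);
console.log(wrapped);
```

The module body is left untouched, so the file you author stays plain CJS and only the transport changes.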
I’ve been experimenting with a slight variation on this. It compacts better and executes a bit faster, but munges the transport and authoring formats in the process:
define(
    ['pkgA/modA', 'pkgA/modB', 'pkgZ/modC', 'require', 'exports', 'module'],
    function (modA, modB, modC, require, exports, module) {
        // ok. this looks funny, but it's much easier and more efficient
        // to simply replace require() with a variable name than to figure out
        // which variable declarations can be removed and how to do that safely.
        // I may experiment some more. : )
        var modC = modC;
        exports.foo = modC ? modA : modB;
    }
);
winning.
[Update 2011-10-01]
I’m trying not to wreck a nice ending on this blog post, but I just realized a big mistake on my part. I often get confused between CJS and the latest flavor of node. CJS is missing a very important feature, imho.
With CJS, you cannot have a module that returns a single function or constructor. If we are writing modules as focused and fine-grained as possible, then we should be able to do this!
Node allows us to assign our module to the `module.exports` property. A module that only exports a constructor could then look like this:
function MyConstructor () {}
module.exports = MyConstructor;
CJS needs to add this for me to be 100% on board. Hm. I guess I’m not really authoring CJS modules now. They’re node modules.