Node.js: Callbacks are polluting your code

I have been hacking on a project in Node.js/Express.js for some time now. I am really impressed by how fast it is to code and prototype in Node; not much gets in your way. However, there is one thing I am still struggling with: the callback model of Node.js applications. It is not that it is conceptually hard to understand or use, but I feel that it keeps me from writing clean code.

Let’s imagine that we are writing a small nonsense program. The program receives a POST and creates some variable x; if some condition is true, we call an async method to get some result and assign it to x.result. In the end we want to save x (also async). This would probably be my first attempt in Node:
 

app.post('/someurl', function(req, res, next) {
  var x = {..}

  if (some condition) {
    someOperation(function(err, result) {
      if (err) return next(err)
      x.result = result
      x.save(function(err, result) {
        if (err) return next(err)
        res.redirect('/')
      })
    })
  } else {
    x.save(function(err, result) {
      if (err) return next(err)
      res.redirect('/')
    })
  }
})

 

(Edit: To clarify, both someOperation and save are doing some kind of I/O.)

So is this clean code? In my opinion it is not. One problem is that we have the save code in two places. Another is that the code is not very easy to read and understand, even though this is a very simple program and the code should be just as simple. Sure, we could move the save to a separate function, or refactor it in some other way, and that would help some. But if we are still saving in two places, or if the code is still not crystal clear, we are not really addressing the problem.

Let’s instead pretend that we are using synchronous methods (which of course would be a very bad thing in Node.js, so don’t do it), so that someOperation and save are methods that would block:
 

app.post('/someurl', function(req) {
  var x = {..}

  if (some condition) {
    x.result = someOperation() // synchronous
  }

  x.save() // synchronous
  return Redirect('/')
})

 
To me this is much cleaner and easier to read: saving is only done in one place and the intent of the code is very clear. But why is it so much easier to read? Because it is synchronous? I wouldn’t say so. The reason is that we do not have to deal with verbose callbacks. In my opinion callbacks are polluting the code. So now you are probably thinking, how can he be programming in Node if he doesn’t like using callbacks? I would say that, at least for the small applications I have built on Node so far, the benefits of simplicity and development speed have totally justified using Node. Even with all the horrible callbacks 🙂 However, in a larger application I am not so sure; it would be very interesting to try out.

So is there any way we can get rid of the verbose callback syntax? Yes, if some modifications were made to JavaScript. But JavaScript as a language is not really evolving that fast, so in the meantime it could probably be implemented in a language that compiles to JavaScript, like CoffeeScript. What we want is asynchronicity, but without callbacks. In C# 5 there is a great solution to this problem: the async/await keywords. By using async/await, the application could be implemented something like this (pseudo code warning again):
 

private async Task Post(req)
{
  var x = new ...;

  if (some condition) {
    x.result = await someAsyncOperation();
  }

  await x.save();

  return new Redirect('/');
}

 
In short, async tells the compiler that the above method is asynchronous and await will suspend execution of the method until someAsyncOperation is done.

So the above example behaves in exactly the same way as the first example, but with the benefits of the second. In the background a lot of magic is used to construct this behavior, but this is something that the compiler will take care of for us and we don’t have to pollute our code with.
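
To make the "magic" a bit more concrete, here is a rough hand-written sketch, in JavaScript, of the kind of state machine a compiler could generate for a method like the one above. It is an illustration only, not the actual output of the C# compiler. The names someAsyncOperation, makeX and post are hypothetical stand-ins, and setImmediate is used to simulate I/O.

```javascript
// Hypothetical stand-ins for the async operations in the post;
// setImmediate simulates I/O that completes later.
function someAsyncOperation(callback) {
  setImmediate(function () { callback(null, 42) })
}

function makeX() {
  return {
    saved: false,
    save: function (callback) {
      var self = this
      setImmediate(function () {
        self.saved = true
        callback(null)
      })
    }
  }
}

// Hand-written state machine: each `await` in the C#-style method
// becomes a state, and the callback re-enters step() at the next one.
function post(condition, done) {
  var x = makeX()
  var state = 0

  function step(err, value) {
    if (err) return done(err)
    switch (state) {
      case 0: // start of the method
        if (condition) {
          state = 1
          return someAsyncOperation(step) // "await someAsyncOperation()"
        }
        state = 2
        return step(null)
      case 1: // resumed with the awaited result
        x.result = value
        state = 2
        return step(null)
      case 2:
        state = 3
        return x.save(step) // "await x.save()"
      case 3: // resumed after the save
        return done(null, { redirect: '/', x: x })
    }
  }

  step(null)
}

// Usage: the caller still passes a callback, but the method body itself
// was generated from straight-line code.
post(true, function (err, response) {
  if (err) throw err
  console.log(response.redirect, response.x.result, response.x.saved)
})
```

Nothing blocks while an operation is pending: step() simply returns, and the callback re-enters it later. The appeal of async/await is that nobody has to write or read this bookkeeping by hand.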

If something like this were available for JavaScript, I think it would really make life a lot easier for a lot of Node.js developers. Maybe there is a good solution to this that I am not aware of (I am far from a JavaScript or Node.js expert); if so, please enlighten me!

How do you handle flow control in your Node.js applications? How would you have written the example? Any Node.js magic I am missing?

Peter

32 thoughts on “Node.js: Callbacks are polluting your code”

  1. This issue goes out of control sometimes, leading to dozens of nested functions and stuff.

    CoffeeScript’s syntax makes it a little bit easier to deal with, but it’s still no solution. I think it’s too late for JavaScript itself, but maybe CoffeeScript or some other dialect might provide C# 5 style async handling.

  2. I did check out step.js a while back. It is better than using nested callbacks, but it still seems kind of verbose.

  3. Shankar:
    After reading a little about async.js, the waterfall method seems to be the best pick. An implementation could maybe look something like:

    app.post('/someurl', function(req, res, next) {

        var x = {..}

        async.waterfall([
            function(callback) {
                if (some condition) {
                    someOperation(function(err, result) {
                        if (err) return next(err)
                        x.result = result
                        callback(null, x)
                    })
                }
                else
                {
                    callback(null, x)
                }
            },
            function(x, callback) {
                x.save(function(err, result) {
                    if (err) return next(err)
                    res.redirect('/')
                })
            }
        ])
    })

    But this still looks like a mess to me… How would you have implemented it?

  4. Hi Peter,

    this is how I would write it:

    app.post('/someurl', function(req, res, next) {
        var x = {..}

        var stack = [];

        stack.push(function(callback){
            if(some condition){
                someOperation(function(err, result) {
                    callback(err, result);
                });
            }else{
                callback(null, null);
            }
        });

        stack.push(function(result, callback){
            if(result){
                x.result = result
            }

            x.save(function(err, result) {
                callback(err);
            });
        });

        async.waterfall(stack, function(err){
            if (err) return next(err)
            res.redirect('/');
        });
    })
    

    Cheers

  5. someOperation is just supposed to represent an async method. It could be a database operation, a cache read, or pretty much anything.

  6. Regarding middleware, what if we are not using express and run into the same “problem”, that we need to implement this very simple sequential flow? And what if the complexity of the flow increases? The methods above will grow into full monsters.

  7. Okay, if someOperation is going to be called for more than one route, I would suggest moving it to route-specific middleware. It would make the code cleaner, and you don’t have to change it in a million places if you decide to change the function.

  8. If you are not using Express, you can use async to do flow control. “And what if the complexity of the flow increases?” is a hypothetical question; it is hard to answer without specifics. Generally you need to rethink the flow.

  9. Yes, like I wrote in the post, this is a nonsense program, so the question is very much hypothetical.

    My point is that even very simple flows can get tricky to implement in a good way in Node. This is not Node’s fault, this is because JavaScript lacks supporting language constructs for async operations.

  10. From my point of view, I don’t agree that JavaScript lacks supporting language constructs for async operations. JavaScript is great at handling async events in the browser. I agree that very simple flows can get ‘tricky’ to implement for programmers who are coming from a sequential coding background (including me). We are used to looking at the code sequentially, and async coding in JavaScript totally contradicts the way we code. You kind of experience a brain melt the first few times you code asynchronously in JavaScript, but you will get the hang of it if you keep trying, and as time goes by you will learn to write cleaner code.

  11. “Let’s instead pretend that we are using synchronous methods (which of course would be a very bad thing in Node.js, so don’t do it)”

    No it wouldn’t. Use synchronous methods if what you’re doing doesn’t have a chance of blocking. It’s only when there’s a chance of blocking or prolonged waiting (such as during I/O operations) that async is useful. Otherwise don’t waste your silly little brain with it.

  12. Tor: Of course I was referring to potentially blocking methods, such as I/O. In this example both someOperation and save can be considered to perform I/O.

  13. I’m still just getting into managing this kind of control flow with Node.js myself. If you have complicated sequences of callbacks which can branch and merge, async.auto makes them much easier to follow, but I’ve encountered a flow similar to the one above while doing form validation.

    One thing you can do is to create a function which contains the repetitive bit of work to be done, be it saving an object, or redisplaying a page, and call the function where necessary.

    I would probably also invert the logic to keep the function flat:

    app.post('/someurl', function(req, res, next) {
      var x = {..}

      function save() {
        x.save(function(err, result) {
          if (err) return next(err)
          res.redirect('/')
        })
      }

      if (!(some condition)) return save()

      someOperation(function(err, result) {
        if (err) return next(err)
        x.result = result
        save()
      })
    })
    
  14. “This is not Node’s fault, this is because JavaScript lacks supporting language constructs for async operations.”

    Absolutely wrong. You can write asynchronous or synchronous code in JS very easily. The reason you’re upset is because the libraries that you’re using are all written to be done using callbacks asynchronously. If you want synchronous db handlers, use them. Why are you writing code in a primarily event-driven language and complaining that it’s event-driven?

  15. I am not complaining that it is event driven, I am just saying that event driven programming in JavaScript could be made much easier.

    Synchronous db handlers are not really an option when using Node.

  16. Yeah, let’s go ahead and blame node.js because obviously it’s making you a bad coder and there’s nothing you can do about it.

  17. At this point a control-flow library will only introduce complexity. Just remove redundancy:

    app.post('/someurl', function(req, res, next) {

        var x = {}

        var afterSave = function(err, result) {
            if (err) return next(err)
            res.redirect('/')
        }

        if (!condition) return x.save(afterSave)

        someOperation(function(err, result){
            if (err) return next(err)
            x.result = result
            x.save(afterSave)
        })

    })

  18. Even at the risk of making a complete fool out of myself by now:

    [...]  
    
    if (some condition) {
      x.result = await someAsyncOperation();
    }

    await x.save();
    [...]
    

    If what happens here obviously is synchronous (as, using “await”, things don’t seem to move any further until x.save() or someAsyncOperation() has finished), why would I want to make it asynchronous, after all?

  19. gin5eng: I don’t blame Node anywhere in the post; if you think so, you have clearly misunderstood the post. If anyone were to blame it would be JavaScript, but I would rather see this as a potential improvement of JavaScript.

  20. KR: Sure the code does not move forward, but the executing thread does not block. When calling await it will continue processing other requests, and when someAsyncOperation is done it will resume execution of this method on the statement after await.

  21. Peter, I absolutely agree with your post. In my opinion, asynchronous method calls should be supported by the language internally, just as C# does this with async/await
    (It seems that tamejs does exactly that, using a somewhat quirky syntax).

    As far as I understand it, the need for callbacks arises from the wish to better use computing resources: The code is divided into small blocks of non-blocking code, that optionally end with a call to a function that is blocking.

    When structuring the code like this, the runtime will only need to create a thread or two for each core the computer has, and these will get these small chunks of code from a queue, process them, and give the blocking pieces back to the queue manager, which will add the callback to the queue when the blocking call ends.

    When you tell the language which functions are blocking, you can write your code in a synchronous/sequential style, while the compiler does the work of making it work efficiently.

  22. Great post, my sentiments exactly. Given Microsoft’s involvement with node.js recently I wouldn’t be surprised if Glenn Block & co are busy working on an asynchronous version of .NET.

  23. There are two primary styles of dealing with events: imperative (synchronous) code, and callback (asynchronous) code.

    Both are legitimate styles for some situations, and neither is universally better than the other.

    If you are dealing with event sequences that are predicated in some way (only go in one direction, delay until another event has finished, HTTP, etc.) then synchronous code is a superior choice for clarity. It expresses the core idea of the algorithm very cleanly.

    If you are dealing with receiving many events at the same time (like keyboard/mouse input etc.) and the events are not predicated (or not much) callback styles are often a better choice.

    Which style you use should not be dictated by your language/runtime environment and technical issues, but by your use case.

    An issue with imperative/synchronous semantics arises if the only way to service this semantic is blocking system calls (like recv). You can use threads to mitigate this problem, but that just introduces locks, race conditions and blocking semaphores in its stead.

    A better way to provide the ability to handle events in an imperative/synchronous fashion is to use non-blocking system calls (connect_ex, setting blocking to false, etc.) and use co-routines to schedule micro-threads depending on event state. There have been a variety of blog posts and papers about this, which consistently show that for I/O bound tasks, this is a far superior model because it offers cleaner code than both callback/asynchronous events and threaded programming, while also enjoying speed benefits forgone by threads (due to locking).

    There are three primary ways that co-routines can be implemented:
    1) As a native-code level instruction pointer switch, an example of which would be Python’s greenlet implementation.
    2) As a property of the virtual machine running the interpreted language, an example of which would be Ruby’s early thread model (which didn’t use actual threads).
    3) As a preprocessor to the virtual machine input that translates code lines marked as synchronous into asynchronous code.

    These options are not entirely equivalent and some have advantages over others.
    1) Implementing micro-threads/greenlets/co-routines on a native code level is quite fast. It also automatically solves the case that a virtual machine level callback is performed in a native-code part which was not designed to support micro threads. However there might be messups of the VM state.

    2) Implementing them as a property of the VM is a very safe and relatively easy thing to do; however, in between native-code parts and callbacks to virtual machine code, the micro-threads could get broken by inconsiderate native code.

    3) Implementing them as a preprocessor is a very inferior choice, because it requires code path inference. In other words, if A calls B and B calls C and C is defined as a co-routine, then B must know that C is a co-routine, which means that A must know that B is a co-routine. Since this is not possible in JS, every call would have to be a potential co-routine, which would be extremely slow and would produce extremely convoluted code (compared to the relatively straightforward code produced by CoffeeScript).

    However the problem is solved in the end, sticking your head in the sand is not the solution.

  24. This is Alex MacCaw’s answer to the problem you describe: https://github.com/maccman/ace. It is quite elegant.

    It uses this: https://github.com/laverdet/node-fibers
    which is a convenient way to get around callback spaghetti and allows your programs to selectively ‘exit’ blocking code.

    I also happen to prefer the callback method of programming. It makes more sense to me. To some people it doesn’t. It probably makes sense to Ryan Dahl and TJ Holowaychuk.

    I suggest you write an alternate framework and use it. Maybe other people will use it as well.
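
The co-routine approach described in comment 23 can be sketched in plain JavaScript with generator functions (added to the language in ES2015, so newer than this discussion). This is a minimal sketch under stated assumptions, not how any particular library does it: run, handlePost, someOperation and makeX are hypothetical names, and setImmediate stands in for real I/O.

```javascript
// Hypothetical stand-ins for the two I/O operations in the post.
function someOperation(callback) {
  setImmediate(function () { callback(null, 'some result') })
}

function makeX() {
  return {
    saved: false,
    save: function (callback) {
      var self = this
      setImmediate(function () {
        self.saved = true
        callback(null)
      })
    }
  }
}

// Minimal co-routine runner (a hypothetical helper, not a library API).
// The generator yields functions of the form function (callback) { ... };
// the runner calls each one and resumes the generator with its result.
function run(generatorFn, done) {
  var gen = generatorFn()

  function resume(err, value) {
    var step
    try {
      step = err ? gen.throw(err) : gen.next(value)
    } catch (e) {
      return done(e) // an error the generator did not handle
    }
    if (step.done) return done(null, step.value)
    step.value(resume) // start the operation; nothing blocks while it runs
  }

  resume(null)
}

// The nonsense program from the post, written sequentially.
function handlePost(condition, done) {
  run(function* () {
    var x = makeX()
    if (condition) {
      x.result = yield someOperation // reads like "await someOperation()"
    }
    yield function (callback) { x.save(callback) } // saving in one place
    return { redirect: '/', x: x }
  }, done)
}

handlePost(true, function (err, response) {
  if (err) throw err
  console.log(response.redirect, response.x.result, response.x.saved)
})
```

The handler body has the same shape as the synchronous version in the post, while each yield hands control back to the event loop. Later versions of JavaScript (ES2017) build this pattern into the language as async/await.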
