Jacob Deichert


How to Block the Event Loop in Node.js

Early last year I had a difficult time tracking down a gRPC-related request timeout in one of our Node services. Our Node services all talked to each other over gRPC, and some requests seemed to fail at random. This had been happening for months but had recently become more frequent. The error itself was very generic, and searching for it turned up nothing useful: others had triggered it in a number of ways unrelated to our setup.

At some point we began to suspect that some of our heavier requests were blocking the Node event loop. However, this was nearly impossible to trigger locally because of the small amount of data we had available, and we still weren't sure which part of the heavier requests' logic was responsible. That's when we decided to block the event loop manually.

// This will block the event loop for the specified number of seconds
const blockEventLoop = (durationSeconds) => {
    console.log('START BLOCKING', Date.now());

    const endTime = Date.now() + (durationSeconds * 1000);

    while (Date.now() < endTime) {
        Math.random();
    }

    console.log('STOP BLOCKING', Date.now());
};

Usage

The blocking handler below stalls the event loop for 60 seconds. Invoke it from a request handler right after another handler containing a lot of async code has been invoked, so that the async work is still in flight when the loop stalls.

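// Invoke this handler first: it kicks off the heavy async work
// (doLotsOfAsyncCalls stands in for your own async logic)
const requestHandlerWithAsyncLogic = () => {
    return doLotsOfAsyncCalls();
};

// Invoke this handler second: it stalls the loop while that work is pending
const requestHandlerThatBlocks = () => {
    blockEventLoop(60);
};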
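
To see the effect in isolation, here is a minimal, self-contained sketch (it is not taken from our services). It schedules a 10 ms timer and then busy-waits on the main thread. The names `blockFor` and `measureTimerDelay` are hypothetical, and `blockFor` is just a millisecond-based variant of `blockEventLoop` above.

```javascript
// Hypothetical helper: busy-waits for the given number of milliseconds.
const blockFor = (durationMs) => {
    const endTime = Date.now() + durationMs;

    while (Date.now() < endTime) {
        // Spin: nothing else can run on the main thread in the meantime.
    }
};

// Hypothetical helper: resolves with how long a 10 ms timer actually took
// to fire when the event loop is blocked right after scheduling it.
const measureTimerDelay = (blockMs) => new Promise((resolve) => {
    const scheduledAt = Date.now();

    setTimeout(() => resolve(Date.now() - scheduledAt), 10);
    blockFor(blockMs);
});

measureTimerDelay(200).then((elapsedMs) => {
    console.log(`10 ms timer fired after ${elapsedMs} ms`);
});
```

The timer callback cannot run until the busy-wait returns, so the reported delay should be at least 200 ms instead of roughly 10 ms. Pending gRPC responses are held up in the same way, which is presumably how a long enough stall turns into a request timeout.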

Finding the Real Blockers

After blocking the event loop, we were able to reproduce locally the exact timeout scenario that was happening in production. That was enough to confirm we understood the problem; we just had to find what was causing it within our heavy async chain.

By placing timers around the loops we suspected, we narrowed down the culprits and fixed them. A combination of breaking up the loops into chunks and moving certain chains into jobs that run outside the request lifetime was all we needed to do!
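
Our actual fixes aren't shown here, but the two techniques can be sketched roughly as follows. `timeSync` and `processInChunks` are hypothetical names, the chunk size of 1000 is an arbitrary assumption, and yielding with `setImmediate` is one common way to give the event loop a turn between chunks, not necessarily what every codebase should use.

```javascript
// Hypothetical helper: runs a synchronous function and logs how long it took.
const timeSync = (label, fn) => {
    const start = process.hrtime.bigint();
    const result = fn();
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;

    console.log(`${label} took ${elapsedMs.toFixed(1)} ms`);
    return result;
};

// Hypothetical helper: processes items in chunks, yielding to the event loop
// after each chunk so pending I/O callbacks and timers get a chance to run.
const processInChunks = async (items, handleItem, chunkSize = 1000) => {
    const results = [];

    for (let i = 0; i < items.length; i += chunkSize) {
        for (const item of items.slice(i, i + chunkSize)) {
            results.push(handleItem(item));
        }

        // setImmediate schedules the continuation for a later loop iteration.
        await new Promise((resolve) => setImmediate(resolve));
    }

    return results;
};

// Example usage with made-up data.
const sampleItems = Array.from({ length: 5000 }, (_, i) => i);

timeSync('single synchronous pass', () => sampleItems.map((n) => n * 2));

processInChunks(sampleItems, (n) => n * 2).then((doubled) => {
    console.log(`processed ${doubled.length} items in chunks`);
});
```

Wrapping a suspect loop in something like `timeSync` shows how long it holds the main thread in one go. The chunked version does the same total work, but no single stretch of synchronous code runs for longer than one chunk.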