autoscale: true build-lists: true
- I'm Lewis, I like JavaScript (and other things)
- Made Bee-Queue, Redis-backed job queue for Node.js
- Like Celery, Resque, Kue, Bull
- Worked on raven-node, Sentry's Node error reporting SDK
- It captures and reports about 100 million errors per week
fs.readFile('myFile.txt', { encoding: 'utf8' }, function (err, data) {
console.log(data);
});^ if you've ever written node code, you've written node code like this
http.get(myUrl, function (res) {
doSomethingWith(res);
});try {
data = JSON.parse(userInput);
} catch (e) {
// this should never happen
}
doSomethingWith(data);fs.readFile('myFile.txt', { encoding: 'utf8' }, function (err, data) {
if (err) return console.error(err);
doSomethingWith(data);
});- Background and how Node is special/different
- Handle what you can, avoid what you can't
- The robust game plan to follow
- Exceptions, Callbacks, Promises, EventEmitters, and more
- Catching, reporting, shutting down, restarting gracefully
- Python:
try/exceptandraise - Ruby:
begin/rescueandraise - PHP:
try/catchandthrow - Lua:
pcall()anderror()
try {
throw new Error('boom!');
} catch (e) {
console.log('Aha! I caught the error!');
}try {
throw new Error('boom!');
} catch (e) {
console.log('Aha! I caught the error!');
}$ node try-catch.js
try {
throw new Error('boom!');
} catch (e) {
console.log('Aha! I caught the error!');
}$ node try-catch.js
> Aha! I caught the error!
try {
setTimeout(function () {
throw new Error('boom!');
}, 0);
} catch (e) {
console.log('Aha! I caught the error!');
}try {
setTimeout(function () {
throw new Error('boom!');
}, 0);
} catch (e) {
console.log('Aha! I caught the error!');
}$ node try-settimeout.js
try {
setTimeout(function () {
throw new Error('boom!');
}, 0);
} catch (e) {
console.log('Aha! I caught the error!');
}$ node try-settimeout.js
/Users/lewis/dev/node-error-talk/try-settimeout.js:3
throw new Error('boom!');
^
Error: boom!
at Timeout._onTimeout (/Users/lewis/dev/node-error-talk/try-settimeout.js:3:11)
at ontimeout (timers.js:365:14)
at tryOnTimeout (timers.js:237:5)
at Timer.listOnTimeout (timers.js:207:5)
- Try-catch is synchronous, setTimeout is asynchronous
- Callback is queued and will throw the error later
- Catch block is no longer waiting to catch the exception
- Run-to-completion semantics & event loop are behind this
- Other languages:
- Each process handles one request at a time
- Everything is synchronous, try/catch works fine
- Easy to keep one request from blowing everything up
- Node: single process, cooperative concurrency, asynchronous I/O
- Handles multiple requests at the same time in the same thread
- Try/catch doesn't work well in asynchronous world
- Any one request blowing up could mess up the others
- We need other mechanisms to handle asynchronous errors
- We can't
try/catchand throw exceptions for everything - An error in one request can take down the entire server
- We have to be extra careful to keep things online
- Our program can end up in unknown states
- Only correct thing to do might be shut down!
^ point out use of errors vs exceptions
Error is just a special class in JavaScript
- You can pass an
Errorobject around like any other value - Runtime errors
throwanErrorobject - Has a
messageand astackproperty - Also
RangeError,ReferenceError,SyntaxError, others
var myError = new Error('my error message');^ Mention engines have special treatment: V8 etc will populate stack trace based on line where Error is instantiated
Exception: what happens when you throw something
- Usually you throw an
Errorobject - Call stack unwinds looking for a
catchblock - You can throw anything, not just
Errorobjects...but don't - If an exception is unhandled, Node will shut down
throw new Error('something bad happened!');
throw 'something bad happened' // avoid this^ Mention more on unhandled later?
function a() {
// call stack here is [a]
b();
}
function b(x) {
// call stack here is [b, a]
try {
c()
} catch (e) {
console.log(e.stack);
}
}
function c() {
// call stack here is [c, b, a]
throw new Error('boom');
}
a();Error: boom
at c (/Users/lewis/dev/node-error-talk/try-catch.js:14:9)
at b (/Users/lewis/dev/node-error-talk/try-catch.js:7:5)
at a (/Users/lewis/dev/node-error-talk/try-catch.js:2:3)
at Object.<anonymous> (/Users/lewis/dev/node-error-talk/try-catch.js:17:1)
...
<more node core module frames>
- We're generally not going to throw Exceptions
- We're going to pass
Errorobjects around a lot
function (err, result) {
if (err) { /* drat, ignore */ }
}
try { ... } catch (e) {
// this should never happen
}
Promise.catch(function (reason) {
// surely this won't happen
});
req.on('error', function (err) {
// oh well, not gonna do anything
});- Operational error: recoverable - expect and handle these
- Typically an
Errorobject being passed around
- Typically an
- Programming error: nonrecoverable - try to avoid these
- Typically an Exception thrown
^ There's no amount of additional code you can write to recover from a typo
- Network timeouts
- Database is down
- Disk got full
- 3rd party API returning errors: S3 goes down
- Unexpected or missing user inputs (
JSON.parse)
^ Tubes get jammed ^ S3 goes down ^ Users do things you don't expect
- Silently ignoring/swallowing errors instead of handling them
- Classic JavaScript errors
TypeError: undefined is not a functionTypeError: Cannot read property 'x' of undefinedReferenceError: x is not defined
- Invoking a callback twice
- Using the wrong error mechanism in the wrong place
^ Linters can literally forbid you from ignoring errors
^ Any time you use a . you should feel uncomfortable: is it defined?
^ Don't throw when you should callback
-
Operational errors
- Known; handle manually wherever they may occur
- Recoverable if handled correctly: S3 being down doesn't kill us
- Avoid assuming anything is reliable outside your own process
-
Programming errors
- Unknown; catch with global error handler
- Nonrecoverable: we're gonna have to abandon ship
- No amount of additional code can fix a typo
- Use a linter to help avoid many common problems
^ Operational errors that we don't handle become programming errors ^ Highlight "unknown" bit about programming errors ^ No amount of additional code can fix a typo
- State shared across multiple requests: less isolation
- Unexpected error in one request can pollute state of others
- Polluted state can lead to undefined behavior
- Memory leaks, infinite loops, security issues
- Only way to get back to a known good state: bail out, restart
^ Understanding these guidelines based on what's unique about node leads us to...
- Follow the guiding principles
- Know and use different mechanisms for effective handling
- Have a global catch-all for the errors you couldn't handle
- Use a process manager so shutting down is no big deal
- Accept when it's time to pack up shop, clean up, shut down
^ In accordance with our three guiding principles... ^ - Always know when errors happen: don't ignore! ^ - Handle what you can, avoid what you can't ^ - Don't keep running in an unknown state
- Try/Catch -
throwandtry/catch - Callbacks -
errfirst argument andif (err) - Promises -
reject(err)and.catch() - Async/Await - Sugar for promises +
try/catch - EventEmitters -
errorevents and.on('error') - Express -
next(err)and error-handling middleware
^ Touched on first two, probably seen them before
function readAndParse(file, callback) {
fs.readFile(file, { encoding: 'utf8' }, function (err, data) {
if (err) return callback(err);
var parsed;
try {
parsed = JSON.parse(data);
} catch (e) {
return callback(e);
}
callback(null, parsed);
});
}var p = new Promise(function (resolve, reject) {
fs.readFile('data.txt', function (err, data) {
if (err) return reject(err);
resolve(data);
});
});
p.then(parseJson)
.then(doSomethingElse)
.catch(function (reason) {
// if readFile or JSON parsing or something else failed,
// we can handle it here
});^ reason should be an Error object but promise terminology is "reason" not "err"
try {
await somePromiseThatRejects()
} catch (e) {
// e is the rejection reason!
}- Servers, sockets, requests, streams
- Long-lived objects with asynchronous stuff going on
- They can emit
errorevents: listen for them! - If an
errorevent is emitted without anerrorlistener...- The
Errorobject will be thrown instead! - How operational errors become programming errors
- The
var req = http.get(url, function (res) {
doSomething(res);
});
req.on('error', function (err) {
// we caught the request error, let's recover
}app.post('/login', function (req, res, next) {
db.query('SELECT ...', function (err, user) {
if (err) return next(err);
});
});
app.use(function (req, res, next, err) {
// spit out your own error page, log the error, etc
next();
});^ Make sure to know how your libraries and dependencies will communicate errors so you can handle operational errors!
process.on('uncaughtException', function (err) {
console.log('Uncaught exception! Oh no!');
console.error(err);
process.exit(1);
});^ Remember: we have to know the error happened! ^ Don't try to recover at this point: it's too late
- Run multiple server processes
- One of them dying won't take us offline
- Node cluster module
- Process managers: systemd, pm2, forever, naught
- Will automatically restart processes when they die
- Some provide further Node-specific functionality
var server = http.createServer(...);
process.on('uncaughtException', function (err) {
console.log('Uncaught exception! Oh no!');
console.error(err);
// tell naught to stop sending us connections & start up a replacement
process.send('offline');
process.exit(1);
});
server.listen(80, function () {
// tell naught we're ready for traffic
if (process.send) process.send('online');
});^ more later on things shutDownGracefully might do
- Quit accepting new connections
- Start reporting whatever we're gonna report
- Tell proc manager we're gonna die so it starts replacement
- Wait for any existing requests, sockets, etc to be dealt with
- Close any open resources, connections, etc
- Shut down
function reportError(err, cb) {
// send the stack trace somewhere, then call cb()
}
process.on('uncaughtException', function (err) {
process.send('offline');
reportError(err, function (sendErr) {
// once error has been reported, let's shut down
process.exit(1);
});
});var myPhone = "..."
function reportError(err, cb) {
console.error(err);
twilio.sendTextMessage(myPhone, err.message, cb);
}db.query('SELECT ...', function (err, results) {
if (err) {
return reportError(err);
}
doSomething(results);
});The raven npm package is Sentry's Node error reporting SDK:
var Raven = require('raven');
Raven.config('<my-sentry-key>');
function reportError(err, cb) {
console.error(err);
Raven.captureException(err, cb);
}server = http.createServer(...);
function shutDownGracefully(err, cb) {
// quit accepting connections, clean up any other resources
server.close(function () {
// could also wait for all connections: server._connections
reportError(err, cb)
});
}
process.on('uncaughtException', function (err) {
process.send('offline');
shutDownGracefully(function () {
process.exit(1);
});
});- Wait for any existing requests, sockets, etc to be dealt with
- Close any open resources, connections, etc
server = http.createServer(...);
function reportError(err, cb) {
console.error(err);
Raven.captureException(err, cb);
}
function shutDownGracefully(err, cb) {
// quit accepting connections, clean up any other resources
server.close(function () {
// could also wait for all connections: server._connections
reportError(err, cb)
});
}
process.on('uncaughtException', function (err) {
process.send('offline');
shutDownGracefully(function () {
process.exit(1);
});
});
server.listen(80, function () {
if (process.send) process.send('online');
});- Just log the error and carry on
- Keep the process running indefinitely
- Try to recover in any way: it's too late!
- Try to centralize handling of operational errors into one place
process.on('uncaughtException')process.on('unhandledRejection')- Currently non-fatal, warning starting in Node 7
- Future: fatal, will cause process exit
- Domains: application code shouldn't need them
- Follow the guiding principles
- Know and use different mechanisms for effective handling
- Have a global catch-all for the errors you couldn't handle
- Use a process manager so shutting down is no big deal
- Accept when it's time to pack up shop, clean up, shut down
- Run-to-completion semantics & the event loop
- V8 stacktrace API
- The "callback contract"
- Asynchronous stacktraces
- Cluster module & process managers
- Domains
- async_hooks (!!!!!)
^ Each of these bullets is at least a lightning talk worth of stuff
- https://sentry.io/for/node/
- https://www.joyent.com/node-js/production/design/errors
- https://nodejs.org/api/errors.html
- Slides available at GitHub.com/LewisJEllis/node-error-talk
- I'm Lewis J Ellis: @lewisjellis on Twitter and GitHub