Hi folks, these are my public notes on an awesome workshop deep dive into the internals of Node.js with Will Sentance.
I hope you find something helpful here.
JavaScript, Node, and the Computer
To understand Node.js, we first have to better understand JavaScript.
It's a language that does 3 things, and 1 of them involves a lot of help from C++:
Saves data and functionality (code)
Uses that data by running functionality (code) on it
Has a ton of built-in labels that trigger Node features, built in C++, to use our computer's internals
Let's see JavaScript's other talent - built-in labels that trigger Node features.
We can set up, with a JavaScript label, a Node.js feature and computer internals to wait for requests for html/css/js from our users.
But how?
The most powerful built-in Node feature of all is http - and its associated built-in label in JS is, conveniently, also http.
- Using the http feature of Node to set up an open socket:
const server = http.createServer();
server.listen(80);
Inbound web request → run code to send back message
If inbound message → send back data
But at what moment?
Using Node APIs
Node auto-runs the code (function) for us when a request arrives from a user
function doOnIncoming(incomingData, functionsToSetOutGoingData) {
  functionsToSetOutGoingData.end("welcome to our server!")
}
const server = http.createServer(doOnIncoming);
server.listen(80);
We don't know when the inbound request will come - we have to rely on Node to trigger our JS code to run.
People often end up using req and res as names for the parameters.
JavaScript is single-threaded & synchronous. All slow work (e.g. speaking to the DB) is done by Node in the background.
2 parts of calling a function - executing its code and inserting input (arguments).
Node will not only run our function at the right moment, it will also automatically insert the relevant data as an argument (input).
Sometimes it will even insert a set of functions in an object (as an argument), which gives us direct access to the message in Node being sent back to the user, and allows us to add data to that message.
Messages are sent in HTTP format - The “protocol” for browser-server interaction
- HTTP Message: Request line (url, method), Headers (metadata about the request), Body (optional)
Our return message is also in HTTP format. We can use the body to send the data and the headers to send important metadata.
In the headers we can include info on the format of the data being sent back - for example that it's html, so the browser loads it as a webpage.
Events and Error Handling
In server-side development, do we get errors? Of course - we're interacting with other people's computers over the internet, so a lot of issues can arise.
How can we handle them?
We need to understand our background Node http server feature better.
Node will automatically emit the appropriate event depending on what it gets back from the computer internals (an http message or an error).
function doOnIncoming(incomingData, functionsToSetOutGoingData) {
  functionsToSetOutGoingData.end("welcome to our server!")
}
function doOnError(errorInfo) {
  console.error(errorInfo);
}
const server = http.createServer();
server.listen(80);
server.on('request', doOnIncoming);
server.on('clientError', doOnError);
Reading from the File System with fs
Importing tweets with fs
function cleanTweets(tweetsToClean) {
  // code that removes bad tweets
}
function useImportedTweets(errorData, data) {
  const cleanedTweetsJson = cleanTweets(data);
  const tweetsObj = JSON.parse(cleanedTweetsJson);
  console.log(tweetsObj.tweet2);
}
fs.readFile('./tweets.json', useImportedTweets);
Every file has a path (a link - like a domestic url).
JSON is a JS-ready data format.
Here we want to use the JavaScript labels for Node features that are written in C++ and do have access to our file system - the computer's internals at the operating-system level.
fs.readFile('./tweets.json', useImportedTweets);
The auto-run function useImportedTweets will be executed when tweets.json has finished being read.
Streams in Node
- Streams, in Node and in computer science in general, are chunks of data handled piece by piece.
What if Node used the event (message-broadcasting) pattern to send out a message (event) each time a sufficient batch of the JSON data had been loaded in?
Then at each point we could take that data and start cleaning it - in batches.
let cleanedTweets = "";
function cleanTweets(tweetsToClean) {
  // algorithm to remove bad tweets from `tweetsToClean`
}
function doOnNewBatch(data) {
  cleanedTweets += cleanTweets(data);
}
const accessTweetsArchive = fs.createReadStream('./tweetsArchive.json');
accessTweetsArchive.on('data', doOnNewBatch);
We can break down any inbound flow of data into chunks, and run a function on each chunk as that batch arrives.
The call stack, event loop and callback queue in Node
- All the work we rely on Node to auto-run at some later moment - code that comes back into JavaScript - only gets to run AFTER all the regular JS code has finished.
function useImportedTweets(errorData, data) {
  const tweetsObj = JSON.parse(data);
  console.log(tweetsObj.tweet1);
}
function immediately() { console.log("Run me last") }
function printHello() { console.log("Hello") }
function blockFor500ms() {
  // Block the JS thread DIRECTLY for 500ms
  // e.g. with a very long-running for loop
}
setTimeout(printHello, 0);
fs.readFile("./tweet.json", useImportedTweets);
blockFor500ms();
console.log("ME FIRST");
setImmediate(immediately);
First:
setTimeout(printHello, 0);
At 0ms the timer is finished, but printHello will not be pushed on top of the call stack and executed immediately.
Instead, it is registered in the Timer Queue to run later.
Next:
fs.readFile("./tweet.json", useImportedTweets);
With the help of Node's C++ features and libuv, we can access the file system at the operating-system level.
But readFile also runs non-blocking in the background, so it will not resolve immediately.
Let's move on:
blockFor500ms();
Here we have a regular JavaScript function that takes a long time to finish - it blocks the JS main thread of execution for a full 500ms.
When it is executed, its execution context is pushed on top of the call stack.
During this time, our fs.readFile finishes - we get the data back from the file system.
But even now, the auto-run callback function attached to readFile is NOT executed immediately; it is registered and waits in the I/O Queue.
Maybe 95% of the functions you'll have auto-run in Node end up in this queue - for instance, anything involving data from the file system, a network socket, etc.
AFTER 500ms,
blockFor500ms finishes and its execution context is popped off the call stack.
NOW we move to the next line:
console.log("ME FIRST");
which logs ME FIRST.
LAST LINE:
setImmediate(immediately);
From the name, you might think this line would be executed IMMEDIATELY?
NOPE!
This function is the ABSOLUTE OPPOSITE of immediate - it may be the WORST-named function in history.
It certainly is NOT going to run immediately; its queue is the LAST one to be checked.
The immediately function is registered in the Check Queue.
Now we have nothing left in the global code and nothing on the call stack.
HERE IS where the Event Loop comes into play!
FIRST, check the Timer Queue: push printHello on top of the call stack and execute it, logging Hello to the console.
SECOND, check the I/O Callback Queue: useImportedTweets is executed with the arguments auto-inserted by Node (specifically errorData = null and data), logging the content of tweet1 to the console.
LAST, check the Check Queue: put immediately on top of the call stack and execute it, logging Run me last.
Rules for the automatic execution of the JS code by Node
Hold each deferred function in ONE of the task queues when the Node background API "completes"
Add the function to the Call Stack (i.e. execute the function) ONLY when the call stack is totally EMPTY (the event loop checks this condition)
Prioritize tasks in the Microtask Queue OVER the Timer Queue, over the I/O Queue, over the Check (setImmediate) Queue, and over the Close Queue
Any close event, with its associated functions, goes into the Close Queue
In the Microtask Queue we have 2 smaller queues:
a) one for process.nextTick (which isn't really used anymore)
b) one for any function delayed using Promises - they end up in this one
In between checking each of the other queues, the event loop always goes back and drains the Microtask Queue before moving on to the next queue.
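A minimal sketch of that priority: a Promise callback (microtask) runs before a zero-delay timer, even though the timer was registered first:

```javascript
const order = [];

setTimeout(() => order.push('timer'), 0);              // → Timer Queue
Promise.resolve().then(() => order.push('microtask')); // → Microtask Queue

order.push('sync'); // regular JS code always runs first

// Once the call stack is empty, the event loop drains the Microtask
// Queue before touching the Timer Queue
setTimeout(() => console.log(order), 10); // ['sync', 'microtask', 'timer']
```

Note that setImmediate is deliberately left out here: the relative order of a 0ms timer and setImmediate in top-level code is not guaranteed, whereas the microtask-before-timer ordering always holds.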