Scaling to serve millions of API requests at a time typically has little to do with the Node.js runtime, or with any programming language.
Many factors affect request throughput. Efficient program design is one of them, but at the scale of “millions at a time”, network infrastructure is a far larger one.
Consider the following (deliberately simplistic) path a request takes:
API requests are commonly made to a friendly domain name, like api.domain.com. That name must first be resolved to an IP address, which is the responsibility of DNS servers. Scaling DNS is a separate matter, and one that typically has nothing to do with Node.js.
Once DNS is resolved, the request knows where it needs to go, because it now has an IP address. In scalable environments, that address usually belongs to a load balancing server.
Load balancers determine which server is best suited to respond quickly. In larger environments, they often forward to proxy servers, which may sit in different physical places, such as a data center in Europe versus one in America. Each of those data centers may run multiple copies of the same Node.js API. The proxy server determines which copy is most capable of handling the request and forwards it accordingly.
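The rotation at the heart of a basic load balancer can be sketched in a few lines. This shows only the selection logic, with made-up backend addresses; real balancers also weigh health checks, latency, and connection counts:

```javascript
// A minimal round-robin balancer: each incoming request is handed to the
// next server in the list, wrapping around at the end.
function createRoundRobin(servers) {
  let next = 0;
  return function pick() {
    const server = servers[next];
    next = (next + 1) % servers.length;
    return server;
  };
}

// Hypothetical backend addresses -- in practice these would be the
// Node.js API instances sitting behind the balancer.
const pick = createRoundRobin(['10.0.0.1', '10.0.0.2', '10.0.0.3']);
console.log(pick()); // 10.0.0.1
console.log(pick()); // 10.0.0.2
```

Notice that nothing here touches the API's business logic; the balancer only decides *where* a request lands.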
This is an overly simplified example of how high-volume traffic flows through a scalable network. If it seems complex, focus on two takeaway points:
- The load is spread across multiple servers.
- Scaling is a different discipline from writing API logic.
Spreading load across servers is typically a reaction to hardware capacity constraints. In other words, when a physical server can’t keep up, add more. The process of “adding another server” is very different from writing API logic.
Bottom line: Scaling to millions of requests at a time is a network operations activity, regardless of language/runtime.