Scaling to serve millions of API requests at a time typically has little to do with the Node.js runtime, or with any programming language.
Many factors affect request throughput. Efficient program design is one of them, but at the scale of “millions at a time”, network infrastructure is a far larger one.
Consider the following (deliberately simplistic) path a request takes:
API requests are commonly made to a friendly domain name, like api.domain.com. That name must first be resolved to an IP address, which is the responsibility of DNS servers. Scaling DNS is a separate matter, and one that typically has nothing to do with Node.js.
Once DNS is resolved, the request knows where it needs to go, because it now has an IP address. In scalable environments, that address usually belongs to a load balancing server.
Load balancers determine which server is best suited to respond quickly. In larger environments, they often forward to proxy servers, which may sit in different physical places, such as a data center in Europe versus one in America. Each of those data centers may run multiple copies of the same Node.js API. The proxy server determines which copy is most capable of handling the request and forwards it accordingly.
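The rotation at the heart of a basic load balancer can be sketched in a few lines. This shows only the selection logic, with made-up backend addresses; real balancers also weigh health checks, latency, and connection counts:

```javascript
// A minimal round-robin balancer: each incoming request is handed to the
// next server in the list, wrapping around at the end.
function createRoundRobin(servers) {
  let next = 0;
  return function pick() {
    const server = servers[next];
    next = (next + 1) % servers.length;
    return server;
  };
}

// Hypothetical backend addresses -- in practice these would be the
// Node.js API instances sitting behind the balancer.
const pick = createRoundRobin(['10.0.0.1', '10.0.0.2', '10.0.0.3']);
console.log(pick()); // 10.0.0.1
console.log(pick()); // 10.0.0.2
```

Notice that nothing here touches the API's business logic; the balancer only decides *where* a request lands.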
This is an overly simplified example of how high-volume traffic flows through a scalable network. If it seems complex, focus on two takeaway points:
- The load is spread across multiple servers.
- Scaling is a different discipline from writing API logic.
Spreading load across servers is typically a reaction to hardware capacity constraints. In other words, when a physical server can’t keep up, add more. The process of “adding another server” is very different from writing API logic.
Bottom line: Scaling to millions of requests at a time is a network operations activity, regardless of language/runtime.