How Netflix Streams Videos Without Buffering (The 8000 Server Secret)
You press play. 60 million others press play with you. Nothing breaks. The trick isn't AWS, it's 8,000 red boxes sitting inside your local ISP's office.
You hit play on Stranger Things at 9:47pm. So do 60 million other people. Nothing breaks.
Think about that for a second. A single click on your TV travels through your wifi, jumps across the public internet, hits some Netflix system somewhere, and within roughly 100 milliseconds you're watching a 4K stream that picks up exactly where you left off last week.
How does that even work?
I went down a rabbit hole this week trying to figure out what actually happens between the click and the picture. The answer is way weirder than I expected, and most of it has nothing to do with the cloud.
The problem nobody talks about
Here's the thing people miss when they imagine how Netflix works. They picture some massive AWS data center somewhere in Virginia pumping video out to the world.
That's not what's happening. Not even close.
If Netflix tried to serve video that way, the math would fall apart almost instantly. By some industry estimates, Netflix accounts for roughly 15% of all downstream internet traffic globally, and during peak hours in some countries it climbs higher. Trying to push that volume out of a few central data centers would saturate the backbone of the internet itself.
So Netflix did something most companies never seriously consider. They stopped hauling the actual video across the public internet.
The 8,000 boxes you've never seen
Netflix has built and shipped over 8,000 custom servers called Open Connect Appliances to internet providers around the world. They're sitting inside Comcast offices, inside your local ISP's data center, sometimes within a few miles of your house.
Each one of these boxes holds up to 350 terabytes of video on its drives. Picture a heavy red rack server with no screen, no keyboard, just storage and network ports. That's it. Built for one job: shoving video down a wire as fast as physics allows.
Netflix gives them to ISPs for free. They've spent over a billion dollars on this hardware. Why would any company do that?
Because the alternative is worse for everyone. ISPs save bandwidth costs (one industry analysis put the savings for ISPs at $1.25 billion in 2021 alone). Netflix gets faster, cheaper delivery. You get a video that starts in 100 milliseconds.
When you press play, you're almost never talking to "Netflix." You're talking to a box owned by Netflix that lives inside the building your internet provider operates from.
The midnight trick
Okay, so the boxes are everywhere. But how do they know which shows to have ready?
This part is what blew my mind a little. A normal CDN works reactively. Something gets popular, copies spread out, traffic catches up. Netflix runs the opposite playbook.
Every night, when traffic on the internet drops to a fraction of peak, Netflix's systems analyze viewing patterns and push content to local servers before anyone asks for it. They call this proactive caching, and it runs on machine learning models trained on years of regional viewing data.
If Netflix predicts that 200,000 people in Hyderabad are likely to start a new Telugu show on Friday, copies of that show get pre-loaded onto the appliances near Hyderabad on Wednesday night. By the time Friday comes, the data is already physically sitting two miles from the viewer.
This is the part that feels almost like cheating. The video isn't being delivered when you click play. It was delivered hours ago. You're just being told which warehouse to walk into.
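To make the nightly pre-loading idea concrete, here's a toy sketch of how a fill plan could be chosen. This is not Netflix's actual algorithm (the real thing reportedly uses machine learning forecasts). The `Title` class, the `plan_nightly_fill` function, the sizes, and the view counts are all made up for illustration. It simply ranks titles by predicted views per terabyte and packs them greedily into whatever free space a local appliance has.

```python
from dataclasses import dataclass


@dataclass
class Title:
    name: str
    size_tb: float        # hypothetical storage footprint
    predicted_views: int  # hypothetical regional forecast


def plan_nightly_fill(titles, free_capacity_tb):
    """Greedy sketch: preload the titles with the most predicted views per TB."""
    ranked = sorted(
        titles,
        key=lambda t: t.predicted_views / t.size_tb,
        reverse=True,
    )
    plan = []
    remaining = free_capacity_tb
    for title in ranked:
        if title.size_tb <= remaining:  # skip anything that no longer fits
            plan.append(title.name)
            remaining -= title.size_tb
    return plan


# Invented catalog for one region; only the 200,000 figure echoes the article.
catalog = [
    Title("new_telugu_show", size_tb=2.0, predicted_views=200_000),
    Title("old_documentary", size_tb=1.0, predicted_views=5_000),
    Title("popular_sitcom", size_tb=4.0, predicted_views=120_000),
]

fill_plan = plan_nightly_fill(catalog, free_capacity_tb=6.0)
print(fill_plan)
```

With 6 TB free, the new show and the sitcom make the cut and the documentary stays at the origin. The interesting design choice is that the decision is made on a forecast, at night, rather than on live requests.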
What actually happens when you press play
Walk through the timeline. It's faster than blinking.
Your device sends a tiny request to Netflix's control plane. Not the video, just a question: "Hey, where should I get this?"
The control plane looks at where you are, which appliances near you have the show, how loaded each one is, and what device you're using. It picks the best one.
It sends your device a manifest, basically a recipe that lists every quality version of the video and where to grab the chunks from.
Your device then starts pulling chunks straight from that nearby appliance. Not from "Netflix." From a box that might sit in the same building as the first hop past your home router.
All of this happens in roughly 100 milliseconds. The video starts. You don't think about any of it. That's the point.
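The steps above can be sketched in a few lines. Treat this as a rough model, not the real control plane: the `Appliance` fields, the scoring formula (distance weighted by load), the 0.9 load cutoff, and the `isp.example` hostnames are all assumptions of mine. It also ignores device type, which the real system takes into account.

```python
from dataclasses import dataclass


@dataclass
class Appliance:
    host: str           # hypothetical hostname
    distance_km: float
    load: float         # 0.0 = idle, 1.0 = saturated
    titles: set


def pick_appliance(appliances, title, max_load=0.9):
    """Pick the closest appliance that has the title and is not overloaded."""
    candidates = [
        a for a in appliances
        if title in a.titles and a.load < max_load
    ]
    if not candidates:
        return None  # a real system would fall back to a farther tier
    # Assumed score: distance, penalized by how busy the box is.
    return min(candidates, key=lambda a: a.distance_km * (1 + a.load))


def build_manifest(appliance, title, renditions):
    """The 'recipe': one chunk base URL per quality version."""
    return {
        "title": title,
        "renditions": {
            name: f"https://{appliance.host}/{title}/{name}/"
            for name in renditions
        },
    }


fleet = [
    Appliance("oca-a.isp.example", 5, 0.50, {"stranger_things"}),
    Appliance("oca-b.isp.example", 3, 0.95, {"stranger_things"}),   # too busy
    Appliance("oca-c.isp.example", 40, 0.10, {"stranger_things"}),  # too far
    Appliance("oca-d.isp.example", 2, 0.10, {"other_show"}),        # no copy
]

best = pick_appliance(fleet, "stranger_things")
manifest = build_manifest(best, "stranger_things", ["4K", "1080p", "720p"])
print(best.host)
print(manifest["renditions"]["1080p"])
```

Notice that the closest box doesn't automatically win. One is skipped because it's nearly saturated and another because it doesn't hold the show, which is exactly the kind of trade-off the control plane is making for you.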
The buffering problem (and why it mostly disappeared)
Real internet isn't smooth. Your wifi drops. Someone in your house starts a video call. The neighbor's microwave does whatever it does to the 2.4 GHz band.
Netflix solved this with adaptive bitrate streaming. Instead of one perfect copy of a movie, Netflix encodes around 6 to 10 different versions: 4K, 1080p, 720p, 540p, all the way down to a tiny version that works on dial-up speeds.
Your player isn't dumb. It's measuring your connection speed every couple of seconds. The moment your bandwidth dips, it quietly switches to a lower quality stream. When things recover, it climbs back up. You usually don't notice the swap.
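Here's a minimal sketch of that switching logic. The bitrate ladder and the 0.8 safety margin are numbers I picked for illustration, not Netflix's real encoding ladder, and real players also look at buffer level, not just throughput. The idea is simply: pick the best version that fits comfortably inside what the connection just measured.

```python
# Hypothetical bitrate ladder in kilobits per second.
LADDER_KBPS = {
    "4K": 16000,
    "1080p": 5000,
    "720p": 3000,
    "540p": 1500,
    "240p": 300,
}


def pick_rendition(measured_kbps, ladder=LADDER_KBPS, safety=0.8):
    """Return the highest-bitrate rendition that fits within a safety margin."""
    budget = measured_kbps * safety
    affordable = [(rate, name) for name, rate in ladder.items() if rate <= budget]
    if not affordable:
        # Nothing fits: fall back to the smallest version rather than stall.
        return min(ladder, key=ladder.get)
    return max(affordable)[1]


# Simulated throughput samples, one every couple of seconds.
trace_kbps = [25000, 6000, 2000, 200, 8000]
choices = [pick_rendition(sample) for sample in trace_kbps]
print(choices)
```

The trace drops from 4K to 720p, then 540p, bottoms out at the smallest version when the connection nearly dies, and climbs back to 1080p once it recovers.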
There's something deeper happening here too. Netflix has been pushing a newer video codec called AV1, which powers around 30% of all their streaming as of 2026. AV1 delivers the same picture quality using about a third less bandwidth than the older codecs, and Netflix's data shows 45% fewer buffering interruptions on AV1 streams compared to the old AVC codec. They're already talking about AV2 next.
So your player is doing two things at once. Picking the right quality version, and increasingly using a codec that needs way less bandwidth to look great in the first place.
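To get a feel for what "a third less bandwidth" means, here's a quick back-of-the-envelope calculation. The 5 Mbps starting bitrate is an assumption I chose as a plausible 1080p AVC stream, not a Netflix figure. Only the one-third saving comes from the numbers above.

```python
def gigabytes_per_hour(bitrate_mbps):
    """Data moved in one hour at a steady bitrate, in decimal gigabytes."""
    megabits = bitrate_mbps * 3600  # seconds in an hour
    return megabits / 8 / 1000      # megabits -> megabytes -> gigabytes


AVC_MBPS = 5.0      # hypothetical 1080p AVC bitrate
AV1_SAVING = 1 / 3  # "about a third less bandwidth"

avc_gb = gigabytes_per_hour(AVC_MBPS)
av1_gb = gigabytes_per_hour(AVC_MBPS * (1 - AV1_SAVING))
print(f"AVC: {avc_gb:.2f} GB/hour, AV1: {av1_gb:.2f} GB/hour")
```

Under that assumption, an hour of video goes from 2.25 GB to about 1.5 GB. Multiply that by tens of millions of viewers and it's easy to see why the codec matters as much as the servers.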
Why this matters even if you don't care about Netflix
Here's the part that's more interesting than Netflix itself.
The same ideas (edge servers close to users, predictive caching, adaptive streaming) are quietly running most of the internet you touch every day.
YouTube works this way. Instagram Reels work this way. Spotify works this way. Cloudflare and Fastly built businesses renting this exact pattern to anyone who wants it.
When you build software on the web today and you wonder why your app feels slow compared to Netflix or YouTube, this is usually why. They aren't shipping bytes from a single origin server. You probably are.
The lesson hidden in Netflix's architecture is pretty simple. Distance is latency. Network engineers can't beat the speed of light. The only way to make something feel instant is to put it physically close to the person asking for it, ideally before they even ask.
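You can put a number on "distance is latency." Light in glass fiber travels at roughly 200,000 km per second, about two thirds of its speed in a vacuum. The sketch below computes the best possible round trip for a given distance, ignoring routers, queues, and everything else that only makes it slower. The two example distances are ones I made up.

```python
FIBER_KM_PER_S = 200_000  # light in glass fiber, roughly two thirds of c


def min_round_trip_ms(distance_km):
    """Physical lower bound on round-trip time over fiber, in milliseconds."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000


# Hypothetical distances: a nearby appliance vs. a far-away origin.
for label, km in [("appliance a few km away", 5), ("origin across an ocean", 10_000)]:
    print(f"{label}: at least {min_round_trip_ms(km):.2f} ms per round trip")
```

A box 5 km away costs at least 0.05 ms per round trip. A server 10,000 km away costs at least 100 ms, which is the entire start-up budget gone on a single round trip before any video has moved.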
That's the whole trick. Put the data close, predict what people want, and have a fallback ready when the network gets weird.
The fact that one click delivers 4K video in 100 milliseconds isn't magic. It's eight thousand red boxes sitting in dark rooms a few miles from you, quietly doing exactly what was rehearsed last night.
Written by Curious Adithya for Art of Code.