In my previous article, WebRTC - What the heck?, we met two shopkeeper farmers, John and Finch, who trade goods across a river using detailed letters (SDPs). If you haven't read it, I'd strongly recommend you read that first.
I promised we'd get into the technicals. Here it is. But don't worry, we're sticking with John and Finch to understand what really happens under the hood.
The postal service - Signaling server
In the first article, Finch sent his letter via a pigeon. But John and Finch can't just throw letters across the river and hope they land. Someone has to carry each letter. That's the postal service. It doesn't read the letter. It doesn't care what's inside. It just delivers.
In WebRTC, this is the Signaling Server. It carries the SDP offer from one client to the other and brings back the SDP answer. It could be a WebSocket server, an HTTP server, or anything that can pass a message between two endpoints. WebRTC doesn't define how signaling should work. It just says you need one.
The key thing? The postal service helped John and Finch start trading, but the actual goods moved over the bridge directly between them. The postal service never carried the goods. Same with the signaling server: it never touches the actual audio or video data.
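To make the postal service concrete, here is a minimal in-memory sketch of a signaling relay. It is only an illustration: the names PostalService, Letter and Mailbox are made up for this article, and a real signaling server would sit behind a WebSocket or HTTP endpoint rather than a Map in memory.

```typescript
// A toy, in-memory "postal service". Every name here is hypothetical;
// WebRTC itself does not define any of these types.
type Letter = { from: string; to: string; kind: "offer" | "answer"; sdp: string };
type Mailbox = (letter: Letter) => void;

class PostalService {
  private mailboxes = new Map<string, Mailbox>();

  // A client registers a mailbox so letters addressed to it can be delivered.
  register(name: string, mailbox: Mailbox): void {
    this.mailboxes.set(name, mailbox);
  }

  // Hand the letter over unopened: the sdp field is never read or changed.
  // Returns false when the recipient has no mailbox.
  deliver(letter: Letter): boolean {
    const mailbox = this.mailboxes.get(letter.to);
    if (mailbox === undefined) return false;
    mailbox(letter);
    return true;
  }
}

const post = new PostalService();
const johnInbox: Letter[] = [];
const finchInbox: Letter[] = [];
post.register("john", (letter) => { johnInbox.push(letter); });
post.register("finch", (letter) => { finchInbox.push(letter); });

// Finch's offer goes out, John's answer comes back. The sdp strings are placeholders.
post.deliver({ from: "finch", to: "john", kind: "offer", sdp: "(Finch's letter)" });
post.deliver({ from: "john", to: "finch", kind: "answer", sdp: "(John's letter)" });
```

Notice that deliver only looks at the address on the envelope. Nothing in the relay knows what an SDP is, and no audio or video ever passes through it.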
Finding the best route - ICE candidates
What if the main bridge is blocked because of heavy snow? Does trading stop? Not necessarily. There could be a smaller bridge downstream, a longer route through a neighbouring town, or even a ferry service.
In WebRTC, these possible routes are called ICE candidates (Interactive Connectivity Establishment). Each client gathers all possible routes:
- A direct local path (the main bridge on a sunny day)
- A path discovered through STUN that reveals your public address (like asking a traveller, "how do I get to the other side?")
- A relay path through TURN that forwards your goods when no direct bridge exists (hiring a middleman boat service)
Both sides share their route options alongside their letters, try them out, pick the best one that works and start moving goods. Collecting the routes is called ICE gathering, and trying them out is done through connectivity checks.
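How does a client decide which route counts as "best"? The ICE specification (RFC 8445) recommends a priority formula that favours direct routes over relayed ones. Below is a small sketch of that formula using the recommended type preferences (host 126, server reflexive 100, relay 0). The Route type and the route labels are invented for this article, and real ICE goes further: it ranks pairs of candidates and only uses a pair that passes its connectivity check.

```typescript
// Candidate types from the list above: host = direct local path,
// srflx = public address learned through STUN, relay = TURN.
type CandidateType = "host" | "srflx" | "relay";

// Type preferences recommended by RFC 8445 (higher is preferred).
const TYPE_PREFERENCE: Record<CandidateType, number> = { host: 126, srflx: 100, relay: 0 };

// priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
function candidatePriority(
  type: CandidateType,
  localPreference: number = 65535,
  componentId: number = 1,
): number {
  return 2 ** 24 * TYPE_PREFERENCE[type] + 2 ** 8 * localPreference + (256 - componentId);
}

// Hypothetical routes, named after the story.
interface Route { label: string; type: CandidateType }
const routes: Route[] = [
  { label: "ferry (TURN relay)", type: "relay" },
  { label: "main bridge (local path)", type: "host" },
  { label: "smaller bridge (public address via STUN)", type: "srflx" },
];

// Highest priority first: main bridge, then smaller bridge, then ferry.
const ranked = routes
  .slice()
  .sort((a, b) => candidatePriority(b.type) - candidatePriority(a.type));
```

The relay ends up last, which matches the story: the ferry is the fallback when no bridge is open, not the first choice.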
The offer and answer - What really happens
Here's the step-by-step when Finch decides to start trading:
- Finch creates an offer - writes a detailed letter with his address, what he can send, and all routes he knows. He keeps a copy for himself. In WebRTC: creating an offer and setting the local description.
- Postal service delivers - carries the letter to John. In WebRTC: the signaling server delivers the SDP offer.
- John reads and remembers - notes down Finch's details. In WebRTC: setting the remote description.
- John creates an answer - writes his own letter in the same format with his details and routes. Keeps a copy. In WebRTC: creating an answer and setting the local description.
- Postal service delivers again - John's answer reaches Finch.
- Finch reads and remembers - notes down John's details. In WebRTC: setting the remote description.
Now both sides know their own details (local description), the other person's details (remote description), and all possible routes (ICE candidates). They pick the best route and trade directly. The postal service's job is done.
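The six steps above can be traced with a toy model. This is not the browser API: ToyPeer and Description are stand-ins written for this article, and the real RTCPeerConnection methods are asynchronous and produce actual SDP. The method names and the signaling states ("stable", "have-local-offer", "have-remote-offer") do mirror the real ones, so the sketch shows how each side's state moves as letters are written and read.

```typescript
type SignalingState = "stable" | "have-local-offer" | "have-remote-offer";
interface Description { type: "offer" | "answer"; sdp: string }

// Hypothetical stand-in for a peer connection; it only tracks descriptions and state.
class ToyPeer {
  readonly name: string;
  signalingState: SignalingState = "stable";
  localDescription: Description | null = null;
  remoteDescription: Description | null = null;

  constructor(name: string) {
    this.name = name;
  }

  createOffer(): Description {
    return { type: "offer", sdp: `offer from ${this.name}` };
  }

  createAnswer(): Description {
    if (this.signalingState !== "have-remote-offer") {
      throw new Error("there is no offer to answer");
    }
    return { type: "answer", sdp: `answer from ${this.name}` };
  }

  // Keeping a copy of your own letter.
  setLocalDescription(desc: Description): void {
    if (desc.type === "offer" && this.signalingState === "stable") {
      this.signalingState = "have-local-offer";
    } else if (desc.type === "answer" && this.signalingState === "have-remote-offer") {
      this.signalingState = "stable";
    } else {
      throw new Error(`cannot set local ${desc.type} right now`);
    }
    this.localDescription = desc;
  }

  // Reading and remembering the other side's letter.
  setRemoteDescription(desc: Description): void {
    if (desc.type === "offer" && this.signalingState === "stable") {
      this.signalingState = "have-remote-offer";
    } else if (desc.type === "answer" && this.signalingState === "have-local-offer") {
      this.signalingState = "stable";
    } else {
      throw new Error(`cannot set remote ${desc.type} right now`);
    }
    this.remoteDescription = desc;
  }
}

const finch = new ToyPeer("Finch");
const john = new ToyPeer("John");
const timeline: string[] = [];

const offer = finch.createOffer();        // 1. Finch writes the offer...
finch.setLocalDescription(offer);         //    ...and keeps a copy
timeline.push(finch.signalingState);
john.setRemoteDescription(offer);         // 2-3. delivered, John reads and remembers
timeline.push(john.signalingState);
const answer = john.createAnswer();       // 4. John writes the answer and keeps a copy
john.setLocalDescription(answer);
timeline.push(john.signalingState);
finch.setRemoteDescription(answer);       // 5-6. delivered, Finch reads and remembers
timeline.push(finch.signalingState);
```

Both sides finish in the "stable" state, each holding a local and a remote description. In this sketch the letters are passed as plain variables; in a real app that hand-off is the signaling server's job.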
Adding goods to trade - Media tracks
Before writing the offer, Finch has to take stock of what he actually has available on his farm. He might have fruits and grains ready but his spices might not be in season yet. He can only offer what he has. And here's the important part: John might look at the offer and say, "I'll take the grains but I don't need the fruits." The other side can decline items.
In WebRTC, these goods are media tracks. Before creating an offer, the client checks what's available (camera, microphone, screen) and adds each one as a track. But the user might deny camera access, or the other client might reject a video stream. Not every track offered gets accepted.
Now say midway through trading, Finch starts producing silk and wants to add it to the deal. He writes a fresh offer letter including silk alongside the existing goods. John reviews it and sends back an updated answer. In WebRTC, this is exactly what happens when you start a screen share during an ongoing call. A new track gets added and an SDP renegotiation (new offer/answer exchange) happens to accommodate it.
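Here is a rough sketch of the accept/decline idea and of renegotiation, again as a toy model rather than real SDP. Goods, AnswerLine and writeAnswer are invented for this article. One detail is borrowed from how real offers and answers work: the answer keeps one entry per offered item, in the same order, and marks the declined ones instead of leaving them out.

```typescript
// Hypothetical types: one entry per item on offer.
interface Goods { label: string; kind: "audio" | "video" }
interface AnswerLine { label: string; accepted: boolean }

// The answer mirrors the offer line by line and marks what is declined.
function writeAnswer(offer: Goods[], wanted: (item: Goods) => boolean): AnswerLine[] {
  return offer.map((item) => ({ label: item.label, accepted: wanted(item) }));
}

// First exchange: Finch offers microphone and camera, John takes audio only.
const firstOffer: Goods[] = [
  { label: "microphone", kind: "audio" },
  { label: "camera", kind: "video" },
];
const firstAnswer = writeAnswer(firstOffer, (item) => item.kind === "audio");

// Midway through the call Finch starts a screen share. The fresh offer lists
// the existing goods plus the new one, and John sends back an updated answer.
const secondOffer: Goods[] = firstOffer.concat([{ label: "screen share", kind: "video" }]);
const secondAnswer = writeAnswer(secondOffer, (item) => item.label !== "camera");
```

In this run John still declines the camera but accepts the screen share, so the second answer has three lines with the middle one marked as not accepted.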
The connection lifecycle
Quick Summary
- Signaling server = postal service. Delivers letters, never carries goods.
- ICE candidates = all possible routes. Both sides share and pick the best.
- STUN = asking for directions. TURN = hiring a middleman.
- Local description = your own letter. Remote description = the other person's letter.
- Media tracks = goods you take stock of before offering. The other side can decline.
All of this usually takes somewhere between a fraction of a second and a few seconds. The goods (audio, video, data) then flow directly between the two sides, or through a TURN relay when no direct route exists, without the signaling server ever touching them. Next time someone mentions RTCPeerConnection, you'll know it's just John and Finch figuring out the best bridge to trade over.
