OS & Networking
The OS and networking fundamentals every SDE interview pokes at, plus the canonical 'what happens when you type a URL' walk-through.
The OS and networking questions in an SDE interview cluster around a few ideas: how processes and threads share (or isolate) memory, how virtual memory and the cache hierarchy work, and how a request travels DNS → TCP → TLS → HTTP. The sections below cover each, then tie them together in the canonical "what happens when you type google.com and press Enter?" walk-through.
OS: Process & Thread
Predict the pattern
Two execution units run inside the same program. Which one shares the heap with its siblings and can directly read their variables?
| | Process | Thread |
|---|---|---|
| Address space | own | shared with siblings in same process |
| Memory | private | shared heap; own stack per thread |
| File descriptors | own table | shared |
| Communication | IPC (pipe, socket, shared mem) | shared memory + sync |
| Crash | doesn't take down siblings | takes down whole process |
| Creation cost | high (fork + copy-on-write) | low |
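The first two rows of the table can be observed directly. The following is a minimal sketch, assuming a Unix-like system (os.fork is not available on Windows); the names counter, thread_writes, and child_process_writes are illustrative only.

```python
import os
import threading

counter = {"value": 0}  # lives on the heap of this process


def thread_writes() -> int:
    """A thread shares the heap, so its write is visible to the caller."""
    t = threading.Thread(
        target=lambda: counter.update(value=counter["value"] + 1)
    )
    t.start()
    t.join()
    return counter["value"]


def child_process_writes() -> int:
    """A forked child writes to its own copy-on-write copy of the page."""
    before = counter["value"]
    pid = os.fork()
    if pid == 0:
        # Child: this write triggers a private copy; the parent never sees it.
        counter["value"] = before + 100
        os._exit(0)  # leave immediately without running the parent's cleanup
    os.waitpid(pid, 0)  # parent reaps the child
    return counter["value"]
```

Calling thread_writes() changes the value the caller sees; calling child_process_writes() afterwards leaves it unchanged, because the child modified only its own address space.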
Process lifecycle
fork() → child gets COW copy of parent's pages → exec() replaces program → exit() → parent wait()s to reap.
Zombie: child exited, parent hasn't wait()ed yet, still has PID entry.
Orphan: parent died first, adopted by init (PID 1).
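The lifecycle above can be sketched in a few lines, assuming a Unix-like system. The helper name run_child is hypothetical; a real program would usually call one of the os.exec* functions in the child instead of exiting right away.

```python
import os


def run_child(exit_code: int) -> int:
    """fork() a child, let it exit, then reap it with waitpid()."""
    pid = os.fork()
    if pid == 0:
        # Child: this is where exec() would replace the program image.
        os._exit(exit_code)
    # Parent: between the child's exit and this call, the child is a zombie.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

If the parent never called waitpid(), the exited child would stay in the process table as a zombie until the parent itself exits.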
Virtual Memory
- Each process sees its own flat virtual address space (64-bit pointers; typically 48 bits are usable on x86-64); the OS maps it to physical frames via page tables.
- Page = fixed chunk (usually 4 KB).
- TLB = CPU cache for recent page-table lookups; a TLB miss → page walk.
- Page fault: access to a page with no valid mapping in the page table.
  - Minor: page in memory, not yet mapped → just update PT.
  - Major: page on disk (swap or mmap'd file) → expensive disk I/O.
- Copy-on-write: after fork, parent + child share pages until one writes.
mmap
Map a file into memory; loads and stores hit the page cache directly, with no read()/write() syscall per access. Great for random access into big files.
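As a small illustration of random access through a mapping, the sketch below patches one byte of a temporary file. The function names patch_byte and demo are made up for this example.

```python
import mmap
import os
import tempfile


def patch_byte(path: str, offset: int, value: int) -> bytes:
    """Overwrite one byte via a memory mapping; return the first 5 bytes."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:  # length 0 maps the whole file
            m[offset] = value  # a plain memory store, no write() call
            m.flush()          # ask the kernel to write dirty pages back
            return m[:5]


def demo() -> bytes:
    """Create a scratch file, patch it through mmap, read it back normally."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello world")
    try:
        patch_byte(tmp.name, 0, ord("J"))
        with open(tmp.name, "rb") as f:
            return f.read()
    finally:
        os.unlink(tmp.name)
```

The regular read() in demo() sees the patched byte because both paths go through the same page cache.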
Context Switching
Cost: ~1-10 µs typical; spills registers, loads new state, and on a process switch flushes the TLB (CPUs with tagged TLBs, e.g. PCID on x86, avoid a full flush).
Triggers:
- Preemption by scheduler (timer interrupt)
- Blocking syscall (e.g. waiting on I/O)
- Voluntary yield
Implication: 10K threads switching constantly = a lot of overhead. Async I/O avoids this by keeping a small pool of threads driving many connections.
Scheduling
- CFS (Linux default until kernel 6.6, which replaced it with EEVDF): each task tracks its "virtual runtime"; pick the smallest.
- RT scheduler for real-time tasks.
- Nice values (-20 to +19) influence priority.
File Descriptors & I/O
- Every open file/socket/pipe = a small integer per process.
- read()/write() are blocking by default; can set O_NONBLOCK.
- I/O multiplexing: select, poll, epoll (Linux), kqueue (BSD/macOS) → "tell me which of these N fds is ready."
- epoll is level-triggered by default (edge-triggered with EPOLLET) & scales to 100K+ fds → backbone of async servers.
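The "which of these fds is ready" question can be sketched with Python's selectors module, which picks epoll on Linux and kqueue on BSD/macOS. The function name ready_sockets and the labels "a" and "b" are illustrative.

```python
import selectors
import socket


def ready_sockets() -> list:
    """Register two socket pairs, write to one, ask which are readable."""
    sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS
    a_send, a_recv = socket.socketpair()
    b_send, b_recv = socket.socketpair()
    try:
        for sock, name in ((a_recv, "a"), (b_recv, "b")):
            sock.setblocking(False)
            sel.register(sock, selectors.EVENT_READ, data=name)
        b_send.sendall(b"ping")  # only "b" now has data waiting
        events = sel.select(timeout=1.0)
        return [key.data for key, _mask in events]
    finally:
        sel.close()
        for s in (a_send, a_recv, b_send, b_recv):
            s.close()
```

An async server runs this select call in a loop on one thread and handles whichever connections come back ready, instead of parking one blocked thread per connection.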
ulimit -n
Max open file descriptors per process. Often defaults to 1024, raise to 65535+ for servers.
Memory Hierarchy & Caches
| Layer | Size | Latency | Notes |
|---|---|---|---|
| Registers | bytes | <1 ns | per core |
| L1 | ~32 KB | 1 ns | per core, split inst/data |
| L2 | ~256 KB | 4 ns | per core |
| L3 | ~MBs | 10 ns | shared across cores |
| RAM | GBs | 100 ns | |
| SSD | TBs | 100 µs | |
| HDD | TBs | 10 ms | |
| Network (DC) | - | 500 µs RTT | |
| Network (cross-continent) | - | 150 ms RTT | |
Cache lines
Memory loaded in 64-byte lines. Two threads writing to different vars in the same line = false sharing → cache line bounces between cores.
Networking: The Stack
From memory: name each layer and one example protocol for each
| Layer | Examples |
|---|---|
| Application | HTTP, WebSocket, gRPC, DNS, SMTP |
| Transport | TCP, UDP, QUIC |
| Network | IP (v4, v6), ICMP |
| Link | Ethernet, Wi-Fi, ARP |
| Physical | cables, radio |
TCP
Predict the pattern
TCP guarantees delivery in order and retransmits lost segments. Which single word best captures this core property?
- Reliable, ordered, byte-stream, connection-oriented.
- 3-way handshake: SYN → SYN-ACK → ACK.
- 4-way close: FIN → ACK → FIN → ACK. Lots of half-closed states.
- Sliding window for flow control.
- Slow start + congestion avoidance (cwnd) for congestion control.
Common TCP states (netstat -an)
- ESTABLISHED: open connection.
- TIME_WAIT: recently closed, holding port for 2× MSL (~60s) to absorb stray packets. Many in TIME_WAIT after benchmarks is normal.
- CLOSE_WAIT: peer closed, you haven't. If lots persist, your code isn't closing sockets.
Nagle's algorithm
Coalesces small writes to reduce packet count. TCP_NODELAY disables it for low-latency apps (gaming, finance).
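Disabling Nagle's algorithm is a single socket option. A minimal sketch follows; the helper name connect_nodelay and the 2-second timeout are assumptions for this example.

```python
import socket


def connect_nodelay(host: str, port: int) -> socket.socket:
    """Open a TCP connection with Nagle's algorithm disabled."""
    sock = socket.create_connection((host, port), timeout=2.0)
    # With TCP_NODELAY set, small writes are sent immediately instead of
    # being held back to coalesce with later data.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

The trade-off is more, smaller packets on the wire, so it suits request/response traffic with tiny messages rather than bulk transfer.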
UDP
Predict the pattern
A live video stream must minimize delay — a lost frame is better dropped than retransmitted a second later. Which transport protocol fits?
- Unreliable, unordered, datagram.
- No handshake, no retransmit, no congestion control.
- Use cases: DNS, VoIP, video, gaming, QUIC base.
- "Reliability when you want it" → built on top (QUIC, custom).
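For contrast with TCP, here is a sketch of a single datagram sent over loopback: no connect, no handshake, no acknowledgement. The function name udp_roundtrip is illustrative.

```python
import socket


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over loopback and read it back."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a port
        receiver.settimeout(1.0)         # UDP itself never reports a loss
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```

On a real network the recvfrom() call may simply time out; anything like retries or ordering has to be added by the application.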
QUIC / HTTP/3
- Built on UDP.
- Combines TLS + transport in one handshake (1-RTT for new connections, 0-RTT on resumption).
- No head-of-line blocking across streams (unlike HTTP/2 over TCP).
- Each stream has its own loss recovery.
- Used by YouTube, Google, Cloudflare.
DNS
- Hierarchical lookup: root → TLD (.com) → authoritative → record.
- Resolver does the work; OS caches; browser caches.
- Records:
  - A / AAAA → IPv4 / IPv6
  - CNAME → alias to another name
  - MX → mail server
  - TXT → arbitrary (SPF, DKIM, verification)
  - NS → name server
- TTL controls how long resolvers cache.
- DoH / DoT: DNS over HTTPS / TLS, encrypted resolution.
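From application code, resolution usually goes through the system resolver rather than talking to DNS servers directly. A minimal sketch, with a hypothetical helper named resolve:

```python
import socket


def resolve(host: str, port: int = 443) -> list:
    """Return the unique IP addresses the system resolver gives for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})
```

The result can contain both A (IPv4) and AAAA (IPv6) answers, and it also reflects local sources such as /etc/hosts, which is why it may differ from what dig reports.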
Why slow DNS hurts
First request to a new host: DNS lookup + TCP handshake + TLS handshake. Sometimes DNS dominates (300+ ms).
TLS (formerly SSL)
Goals: confidentiality, integrity, authenticity.
TLS 1.3 handshake (1-RTT)
- Client sends ClientHello (supported ciphers, key share).
- Server sends ServerHello (chosen cipher, key share), then its certificate and Finished.
- Client validates cert chain, derives keys, sends Finished.
- Encrypted data flows.
Certs
- X.509 cert: identity + public key, signed by CA.
- Chain: leaf → intermediate(s) → root (in client's trust store).
- Common errors: expired, untrusted issuer, hostname mismatch (name not in CN/SAN).
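On the client side, the handshake and chain validation are normally delegated to a TLS library. The sketch below uses Python's ssl module; the function names strict_client_context and negotiated_version are made up here, and negotiated_version needs network access, so it is defined but not called.

```python
import socket
import ssl
from typing import Optional


def strict_client_context() -> ssl.SSLContext:
    """Client context that verifies the chain and hostname, TLS 1.3 only."""
    ctx = ssl.create_default_context()  # loads the system trust store
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3  # refuse older versions
    return ctx


def negotiated_version(host: str, port: int = 443) -> Optional[str]:
    """Connect, complete the handshake, and report the protocol version."""
    ctx = strict_client_context()
    with socket.create_connection((host, port), timeout=5.0) as raw:
        # server_hostname sets SNI and is the name the cert is checked against.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

An expired certificate, an untrusted issuer, or a hostname mismatch surfaces as ssl.SSLCertVerificationError raised from wrap_socket().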
Mutual TLS (mTLS)
Client also presents a cert. Used for service-to-service auth.
HTTP
HTTP/1.1
- Text protocol over TCP. Pipelining poorly supported in practice.
- One request at a time per connection → browsers open up to ~6 connections per host.
- Headers verbose; no compression by default.
HTTP/2
- Binary, multiplexed streams over one TCP connection.
- Header compression (HPACK).
- Server push (mostly unused).
- Head-of-line blocking: if one TCP segment is lost, all streams stall.
HTTP/3 (QUIC)
- Solves HOL blocking by moving streams into the transport layer (QUIC over UDP): a lost packet stalls only the streams it carried data for.
Methods, status codes, idempotency
See api-design.md.
Common headers
| Header | Purpose |
|---|---|
| Authorization: Bearer ... | auth |
| Content-Type: application/json | body MIME |
| Accept: ... | what client can receive |
| Cache-Control: max-age=N | caching |
| ETag: "..." + If-None-Match: "..." | conditional GET |
| Set-Cookie: | session |
| Cookie: | client → server |
| Host: api.example.com | which vhost |
| User-Agent: | client identity |
| X-Forwarded-For: | original client IP through proxies |
| Content-Encoding: gzip | compressed body |
Cookies
- Attributes: HttpOnly (no JS access), Secure (HTTPS only), SameSite (CSRF mitigation), Domain, Path, Expires/Max-Age.
- Session vs persistent.
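To see how those attributes end up in a Set-Cookie header, here is a sketch using the standard-library http.cookies module (SameSite support needs Python 3.8+). The cookie name "session" and the helper name session_cookie are illustrative.

```python
from http.cookies import SimpleCookie


def session_cookie(value: str, max_age: int = 3600) -> str:
    """Build the value of a Set-Cookie header for a hardened session cookie."""
    jar = SimpleCookie()
    jar["session"] = value
    morsel = jar["session"]
    morsel["httponly"] = True    # hidden from document.cookie
    morsel["secure"] = True      # sent over HTTPS only
    morsel["samesite"] = "Lax"   # withheld on most cross-site requests
    morsel["path"] = "/"
    morsel["max-age"] = max_age  # persistent; omit for a session cookie
    return morsel.OutputString()
```

Leaving out Max-Age (and Expires) gives a session cookie that the browser drops when it closes.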
WebSocket
- Upgrade from HTTP, full-duplex, long-lived.
- Frames over TCP.
- Use cases: chat, live dashboards, multiplayer.
- Watch for: scaling (sticky sessions or pub-sub broker like Redis), proxy timeouts (need keepalive pings), backpressure.
TCP vs HTTP keep-alive
- TCP keepalive: kernel-level "are you still there?" probes; off by default per socket, and on Linux the first probe waits 2 hours (tcp_keepalive_time = 7200 s) unless tuned.
- HTTP keep-alive: reuse the same TCP conn for multiple HTTP requests; default in 1.1.
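TCP keepalive is opt-in per socket. A minimal sketch follows; the helper name enable_keepalive and the 60/10/3 values are assumptions for illustration, and the TCP_KEEP* tuning options are guarded because not every platform exposes them.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 60,
                     interval: int = 10, probes: int = 3) -> None:
    """Turn on kernel keepalive probes and shorten the default timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):  # present on Linux
        # Seconds of idleness before the first probe.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
        # Seconds between probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
        # Unanswered probes before the connection is declared dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
```

Many applications send their own protocol-level pings instead, since those also prove the peer's application is responsive and not just its kernel.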
NAT
- Router rewrites private IPs to public IP+port.
- Many devices share one public IP.
- Connection-tracking table (~5min idle timeout) → why long-idle connections die behind NATs → why apps send pings.
Load Balancing (network view)
- L4 (TCP/UDP): forward by 5-tuple (src ip/port, dst ip/port, proto). Fast, opaque.
- L7 (HTTP): peek at URL/header to route. Slower, smarter (path-based routing, header rewrites).
- Algorithms: round-robin, least-conn, IP-hash, consistent hash.
- Health checks: poll backends; pull unhealthy ones out.
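Of the algorithms listed, consistent hashing is the one interviewers most often ask to see sketched. The version below is a simplified illustration, not any particular load balancer's implementation: the class name HashRing, the choice of SHA-256, and the 100 virtual nodes per backend are all assumptions.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (first 8 bytes of SHA-256)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, backends, vnodes: int = 100):
        # Each backend owns many points so load spreads evenly.
        self._ring = sorted(
            (_hash(f"{b}#{i}"), b) for b in backends for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def pick(self, key: str) -> str:
        """Walk clockwise from the key's hash to the next backend point."""
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

The property that matters: when a backend is removed, only the keys that mapped to it move; every other key keeps its backend, which plain hash-mod-N does not guarantee.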
Latency budget
Rough numbers in a modern web stack:
- DNS resolution: 0-100 ms
- TCP+TLS handshake: 1-3 RTTs
- Server processing: target < 100 ms
- DB query: target < 10 ms (cached) / 50 ms (uncached)
- Network DC RTT: ~0.5 ms
- Cross-continent RTT: ~150 ms
If you have a 100 ms latency budget and each cross-DC call costs roughly 20 ms, about 5 sequential calls use it all. Plan accordingly (parallelize, cache, prefetch).
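The arithmetic behind "parallelize" is worth having at hand. In this sketch the 20 ms per-call cost is an assumed figure chosen so that five calls fill a 100 ms budget; it is not a measurement.

```python
BUDGET_MS = 100
CALLS_MS = [20, 20, 20, 20, 20]  # assumed cost per cross-DC call


def sequential_ms(call_costs_ms) -> int:
    """Calls issued one after another: latencies add up."""
    return sum(call_costs_ms)


def parallel_ms(call_costs_ms) -> int:
    """Independent calls issued together: the slowest one sets the total."""
    return max(call_costs_ms)
```

Under these assumptions the sequential version spends the whole budget, while the parallel version spends one call's worth, provided the calls do not depend on each other's results.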
"What happens when you type google.com and press Enter?"
The interview classic. Walk through:
- Browser parses URL → scheme (https), host, port, path.
- Check HSTS: is host on the always-HTTPS list?
- Check browser cache for the response.
- DNS lookup:
  - Browser cache → OS resolver cache → router → ISP resolver → root → TLD → authoritative → IP.
- TCP handshake to IP:443 (SYN, SYN-ACK, ACK).
- TLS handshake (1-RTT in 1.3): cipher negotiation, cert validation, key derivation.
- HTTP request: GET / HTTP/2, headers (Host, Cookie, Accept, etc.).
- Server side:
  - LB picks backend.
  - Backend hits cache / DB / other services.
  - Renders response (HTML).
- HTTP response: status, headers, body. Possibly gzip-compressed.
- Browser parses HTML, finds <link>, <script>, <img> → kicks off more requests (often parallel, often to same host using same TCP conn under HTTP/2).
- Render: DOM → CSSOM → render tree → layout → paint → composite.
- JS execution, hydration, more fetches, etc.
Show you know layering. Interviewers probe wherever you're shallow.
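The first step of the walk-through, splitting the URL, can be sketched with urllib.parse. The helper name parse_url and the DEFAULT_PORTS table are illustrative, and the sketch assumes the browser has already prepended a scheme to the bare "google.com" the user typed.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def parse_url(url: str) -> dict:
    """Split a URL into the pieces the later steps need."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,  # what DNS will resolve
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",  # what goes in the request line
    }
```

The host feeds the DNS lookup, the port feeds the TCP handshake, the scheme decides whether TLS runs, and the path ends up in the HTTP request.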
Useful CLI debugging
```shell
# DNS
dig google.com
nslookup google.com
host google.com

# TCP connectivity
nc -vz host 443
telnet host 443

# Trace network path
traceroute google.com
mtr google.com

# HTTP request
curl -v https://example.com/path
curl -I https://example.com                      # HEAD only
curl --resolve example.com:443:1.2.3.4 ...       # bypass DNS

# TLS inspection
openssl s_client -connect host:443 -servername host

# Sockets / listening ports
ss -tlnp           # who is listening?
ss -antp           # all TCP, with PID
lsof -iTCP:8080    # what's on port 8080?
lsof -p <pid>      # everything that process has open

# Packet capture
sudo tcpdump -i any -nn 'tcp port 8080'
sudo tcpdump -A -i any 'tcp port 80'   # ASCII bodies
```
Quick Q&A
Q: TCP or UDP for video calling? Mostly UDP (often via WebRTC/SRTP): losing a frame is fine, and retransmitting it an RTT later is worse than dropping it. Signalling channel may be TCP.
Q: Why is the first request to a new site slow? DNS + TCP + TLS handshakes; possibly cold cache at every layer. Subsequent requests reuse the connection.
Q: How does HTTPS keep my data secret? TLS handshake derives symmetric session keys via asymmetric crypto. Server proves identity via cert chain. Symmetric encryption + MAC on every record. Replay protection via sequence numbers.
Q: What's the difference between 127.0.0.1 and 0.0.0.0?
- 127.0.0.1 (loopback): only this machine can connect.
- 0.0.0.0 (all interfaces): bind to every network interface; anyone routable can connect.
Q: What's a SYN flood? DoS where attacker sends many SYNs, never completes handshake; server fills its SYN queue. Mitigation: SYN cookies, rate limits.
Q: Why does my server show many TIME_WAIT after load test?
That's normal. The closing side holds TIME_WAIT for 2× MSL. If the count becomes a problem (for example, ephemeral port exhaustion), use connection reuse, SO_REUSEADDR, or a reverse proxy.
Q: How do you debug "service unreachable"?
- DNS resolves? (dig)
- TCP connects? (nc -vz)
- TLS works? (openssl s_client)
- HTTP responds? (curl -v)
- Firewall / SG? (Cloud console / iptables)
- Service running? (systemctl, ps)
- Listening on right interface/port? (ss -tlnp)
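When nc is not available on a box, the "TCP connects?" step is easy to reproduce in a few lines. This is a rough equivalent of nc -vz, with a hypothetical helper name tcp_connects:

```python
import socket


def tcp_connects(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP handshake to host:port completes in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False
```

A quick "connection refused" usually means nothing is listening on that port; a timeout more often points at a firewall or security group silently dropping packets.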
Q: What does EADDRINUSE mean?
Port already bound (often a previous instance still running, or its old connections sitting in TIME_WAIT). Set SO_REUSEADDR before bind(), or kill the previous process.
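The error is easy to reproduce and the usual fix is one line. In this sketch the function names bind_twice and listening_socket are illustrative, and the second bind is expected to fail because the first socket is still listening on the same address and port.

```python
import errno
import socket


def bind_twice() -> int:
    """Bind two sockets to the same address; return the errno of the second."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        first.close()
        second.close()


def listening_socket(port: int) -> socket.socket:
    """Listener that can rebind while old connections sit in TIME_WAIT."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind(); it has no effect afterwards.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen()
    return sock
```

SO_REUSEADDR helps with leftover TIME_WAIT connections after a restart; it does not let two live listeners share a port, which is why bind_twice() still fails.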
Q: 32 vs 64-bit address space implications? 32-bit: ~4 GB virtual per process (less usable). 64-bit: 256 TB+ virtual; effectively unlimited for app-level use.
Memorize these defaults
| Thing | Default |
|---|---|
| HTTP port | 80 |
| HTTPS port | 443 |
| SSH | 22 |
| DNS | 53 (UDP, TCP for large) |
| SMTP | 25 / 587 (submission) / 465 (implicit TLS) |
| PostgreSQL | 5432 |
| MySQL | 3306 |
| Redis | 6379 |
| MongoDB | 27017 |
| Kafka | 9092 |
| RabbitMQ | 5672 |
| MTU (Ethernet) | 1500 bytes |
| TCP MSS | ~1460 (MTU − 40) |
| TIME_WAIT (TCP) | ~60s on Linux (2× MSL) |