By PhantomCode Team·Published April 22, 2026·Last reviewed April 29, 2026·12 min read
TL;DR

Operating systems interviews at FAANG, HFT, and infrastructure companies probe processes vs threads, context switch cost, CPU scheduling (CFS, round-robin), virtual memory and paging, heap-stack-mmap layout, file systems (inodes, dentries, journaling), syscalls and the user-kernel boundary, signals, and zombie or orphan processes. Senior answers focus on real costs (TLB flushes on context switches, page fault latency) and runnable examples on Linux, not textbook definitions. Knowing why something is slow matters more than naming layers.

Operating Systems Interview Questions for Software Engineers

Operating systems questions separate candidates who have only used their laptop from candidates who know what actually happens when their program runs. At phantomcode.co we consistently see OS fundamentals come up in FAANG, HFT, and infrastructure interviews, because any non-trivial backend eventually collides with scheduling, memory, or filesystem behavior. This guide covers the topics interviewers actually probe, with the mental models and code you need to answer confidently.

This is not a textbook summary. Each section mirrors the way a senior interviewer actually digs: a direct question, a precise answer, a follow-up that trips most candidates, and a snippet you can run locally on a Linux box to validate your understanding.

Table of Contents

  1. Processes vs Threads
  2. Context Switches and Their Real Cost
  3. CPU Scheduling: Round-Robin, CFS, and Beyond
  4. Virtual Memory and Paging
  5. Memory Management: Heap, Stack, and mmap
  6. File Systems: Inodes, Dentries, and Journaling
  7. Syscalls and the User/Kernel Boundary
  8. Signals and Signal Handling Pitfalls
  9. Zombie and Orphan Processes
  10. Common Mistakes Candidates Make
  11. FAQ
  12. Conclusion

1. Processes vs Threads

Sample question: "Explain the difference between a process and a thread, and when you would prefer one over the other."

A process is an isolated address space with its own page tables, file descriptor table, and kernel accounting structure. A thread is a schedulable execution context that shares the address space of its parent process. On Linux there is no separate thread abstraction in the kernel: both are represented by a task_struct, and threads are simply tasks created with CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | CLONE_THREAD.

Prefer processes when you need fault isolation (a crash should not poison sibling work), strong security boundaries (seccomp, namespaces, per-process credentials), or when you have CPU-bound work on a GIL-constrained runtime like CPython. Prefer threads when you need low-latency shared state, fast communication via shared memory, and when the cost of serialization across IPC would dominate.

// fork() gives the child a new address space with its own copy of the heap.
#include <unistd.h>

int global_counter = 0;

pid_t pid = fork();
if (pid == 0) {
    // child: writes to globals are NOT seen by the parent (copy-on-write).
    global_counter++;
    _exit(0);
}

Follow-up that trips candidates: "How much does fork actually copy?" Answer: on modern Linux it copies page tables, not pages. Physical pages are marked copy-on-write. The real cost of fork scales with the size of your page tables, which is why a 64 GB JVM can take hundreds of milliseconds to fork even when it writes nothing. This is also why fork latency during BGSAVE is a well-documented operational problem on large Redis instances.

2. Context Switches and Their Real Cost

Sample question: "Walk me through what happens on a context switch and where the cost comes from."

A context switch saves the register state of the running thread into its task_struct, flips the kernel stack pointer, updates the scheduler run queue, and restores the register state of the chosen next thread. If the new thread lives in a different address space, the kernel also reloads CR3 (on x86) to switch page tables.

The direct cost is small, usually 1 to 5 microseconds. The hidden cost is cache pollution. The new thread touches different code and data, so L1/L2 caches miss, TLB entries evict, and branch predictor state is effectively reset. On a memory-bound workload the effective cost can exceed 30 microseconds.

# Measure context switch rate on Linux.
vmstat 1 5
# 'cs' column is context switches per second.
 
# Per-process:
pidstat -w -p <PID> 1
# cswch/s (voluntary) vs nvcswch/s (involuntary).

A high nvcswch/s means the scheduler is preempting your thread before it yields. That usually indicates you are CPU-bound and oversubscribed. A high cswch/s means the thread is blocking, often on I/O or a lock.

Follow-up: "Why can a context switch be more expensive on a virtualized system?" Because you may also pay for a VM exit, where the hypervisor intercepts the CR3 write or the IPI used to reschedule a vCPU.

3. CPU Scheduling: Round-Robin, CFS, and Beyond

Sample question: "How does the Linux Completely Fair Scheduler work, and how does it differ from round-robin?"

Round-robin assigns every runnable task a fixed-length time slice and cycles through them. It is simple, predictable, and the basis for SCHED_RR in Linux real-time policies. Its weakness is that it treats every task equally; a burst-heavy task and a CPU hog share the same slice.

CFS (the default SCHED_OTHER policy until EEVDF replaced it in newer kernels) models an idealized multitasking CPU where every task runs at an equal fraction of CPU. It tracks vruntime per task, a weighted cumulative runtime where higher nice values make vruntime advance faster. The scheduler always picks the task with the smallest vruntime, stored in a red-black tree ordered by vruntime. Nice values map to weights, and a task's slice length depends on how many peers are runnable.

// Check and set scheduling policy.
#include <sched.h>
 
struct sched_param sp = { .sched_priority = 0 };
sched_setscheduler(0, SCHED_BATCH, &sp);
// SCHED_BATCH hints that this task is non-interactive.
// SCHED_IDLE runs only when nothing else wants the CPU.
// SCHED_FIFO / SCHED_RR are real-time and preempt SCHED_OTHER.

Follow-up: "What is EEVDF and why did Linux move to it?" Earliest Eligible Virtual Deadline First replaces CFS in kernel 6.6+. It adds a deadline per task, derived from requested latency, so interactive tasks explicitly get lower latency without hacks like sched_wakeup_granularity_ns. It is still fair in the long run but respects latency budgets in the short run.

4. Virtual Memory and Paging

Sample question: "Explain virtual memory end to end, from a pointer dereference to the data in DRAM."

Every process sees a private virtual address space. When a thread dereferences a pointer, the CPU splits the virtual address into a page number and an offset. The MMU consults the TLB (Translation Lookaside Buffer) first. On a hit, it forms the physical address in one cycle. On a miss, the page table walker reads up to four (on x86-64) or five (with LA57) levels of page tables from memory to produce a PTE. If the PTE is marked present and the access is allowed, the physical address is formed and the cache hierarchy supplies the data. If the PTE is not present, the CPU raises a page fault.

The kernel's page fault handler then does one of: allocate a zero page (anonymous first-touch), read a page from disk (file-backed or swap), fire a copy-on-write duplication, or deliver SIGSEGV.

# Observe page faults.
/usr/bin/time -v ./myprog
# Major (page faults): blocked on I/O to bring the page in.
# Minor (page faults): no I/O, just allocation or COW.
 
# Huge pages help reduce TLB pressure on large heaps.
cat /sys/kernel/mm/transparent_hugepage/enabled

Follow-up: "Why does a 100 GB malloc succeed instantly on Linux?" Because malloc calls mmap or brk, which only reserve virtual memory. Pages are not allocated until you touch them. The kernel uses demand paging and overcommits by default, controlled via /proc/sys/vm/overcommit_memory. This is also how the OOM killer can surprise you: your process passed malloc but dies minutes later on first write.

5. Memory Management: Heap, Stack, and mmap

Sample question: "What are the differences between heap and mmap allocations, and when does glibc pick one over the other?"

glibc's malloc uses brk for small allocations (grows the heap linearly) and mmap for large ones (a threshold around 128 KB by default, tunable via M_MMAP_THRESHOLD). Heap-style allocations are cheap to reuse but suffer from fragmentation: a single long-lived allocation can anchor the heap top and prevent shrinking. mmap allocations are independent regions, unmapped cleanly on free, and good for large buffers.

The stack is a special region grown on demand by the kernel, protected by a guard region below it. Each thread has its own fixed-size stack, 8 MB by default on Linux, adjustable via pthread_attr_setstacksize. Blowing the stack triggers SIGSEGV because the guard page below it is mapped with no access permissions.

#include <malloc.h>
// Force malloc to use mmap for anything above 64 KB.
mallopt(M_MMAP_THRESHOLD, 64 * 1024);
 
// Inspect current arena.
malloc_info(0, stdout);

Follow-up: "Why can RSS grow even though your program frees memory?" Because free returns memory to the allocator, not the kernel. Heap memory is only returned to the OS when the top of the heap is free. Use malloc_trim(0) or jemalloc with background_thread:true to encourage return.

6. File Systems: Inodes, Dentries, and Journaling

Sample question: "Describe how a filesystem resolves the path /var/log/syslog and what structures are involved."

The kernel starts at the root inode, which is pinned in memory. It looks up var in the root directory's data blocks, which are a list of (name, inode) pairs. It then loads the var inode, repeats for log, then syslog. Each lookup hits the dentry cache first, a global hash table that memoizes path components to inodes. A complete path lookup without cache involves multiple disk reads and inode table lookups.

An inode holds metadata (size, mode, uid, gid, timestamps) and block pointers. A filename is not part of the inode; it lives in the parent directory's entries. This is why hard links are cheap: they are just additional directory entries pointing to the same inode.

Journaling protects metadata (or data, depending on mode) against crashes. ext4's default ordered mode writes data blocks first, then journals the metadata, then commits the metadata to its final location. After a crash, the journal is replayed, so metadata never points into garbage. data=journal mode journals everything but can roughly halve write bandwidth, since data is written twice. data=writeback gives up ordering for speed.

# Inspect an inode directly.
stat /etc/hostname
ls -li /etc/hostname      # first column is the inode number.
 
# Check journal mode.
mount | grep " on / "
 
# See filesystem debug info on ext4.
sudo debugfs -R "stat <130023>" /dev/nvme0n1p2

Follow-up: "Why does copying 1 million small files take so much longer than one 1 GB file even on SSD?" Because each file involves at least two synchronous metadata operations (create, close) and individual inode updates, plus fsync barriers the application may issue. You are limited by metadata IOPS and journal commits, not by sequential bandwidth.

7. Syscalls and the User/Kernel Boundary

Sample question: "What happens on a syscall like read, and why does it cost more than a function call?"

User code issues syscall on x86-64, which traps into the kernel at the MSR-configured entry point. The CPU switches to kernel mode (CPL 0), loads a kernel stack, saves user registers, and dispatches through the syscall table. The kernel validates arguments (because pointers from userspace can be malicious or invalid), does the work, copies results back with copy_to_user (which handles page faults safely), and returns with sysret.

The cost is not the mode switch alone. It is: stack switch, register save/restore, argument validation, and now also the KPTI page table switch introduced after Meltdown. A no-op syscall on modern x86-64 with mitigations is roughly 200 to 700 ns.

// Measure syscall overhead: a million calls to a near-no-op syscall.
#include <time.h>
#include <unistd.h>

struct timespec a, b;
clock_gettime(CLOCK_MONOTONIC, &a);
for (int i = 0; i < 1000000; i++)
    getppid();  // cheapest common syscall.
clock_gettime(CLOCK_MONOTONIC, &b);
// Elapsed nanoseconds / 1e6 approximates the per-syscall cost.

Follow-up: "How do vDSO and io_uring reduce this cost?" vDSO maps a small piece of kernel code into every process so trivially safe calls like clock_gettime do not need a real trap. io_uring uses two shared ring buffers (submission and completion) so userspace can queue thousands of I/O operations with zero or at most one syscall, and the kernel can process them in batches.

8. Signals and Signal Handling Pitfalls

Sample question: "Can you write a correct handler for SIGINT that flushes a log buffer?"

Probably not, if you have not done it before. Signal handlers run in the context of the interrupted thread. Almost nothing is async-signal-safe. You cannot call malloc, printf, or any function that touches a lock that the main thread might hold. The conventional pattern is to set a sig_atomic_t flag and let the main loop observe it.

#include <signal.h>
 
static volatile sig_atomic_t stop_requested = 0;
 
static void handle_sigint(int sig) {
    stop_requested = 1;   // async-signal-safe.
}
 
int main(void) {
    struct sigaction sa = { .sa_handler = handle_sigint };
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   // restart interrupted syscalls where possible.
    sigaction(SIGINT, &sa, NULL);
 
    while (!stop_requested) {
        do_work();
    }
    flush_logs();   // safe here, on the main thread.
}

Follow-up: "What does SA_RESTART not cover?" read on a socket with a timeout set via SO_RCVTIMEO will still return EINTR. poll, select, and epoll_wait are never restarted. If you rely on SA_RESTART you must still check for EINTR and handle it.

A better modern pattern is signalfd, which converts signals into readable file descriptors and integrates cleanly with an event loop:

sigset_t mask;
sigemptyset(&mask);
sigaddset(&mask, SIGINT);
sigprocmask(SIG_BLOCK, &mask, NULL);
int sfd = signalfd(-1, &mask, SFD_CLOEXEC);
// Now read struct signalfd_siginfo from sfd inside your epoll loop.

9. Zombie and Orphan Processes

Sample question: "A long-running daemon is accumulating zombie children. What is happening and how do you fix it?"

A zombie is a terminated process whose parent has not yet called wait or waitpid. The kernel retains the exit status and accounting so the parent can retrieve them. Zombies hold a process table entry, not memory. If the parent never reaps, the entries accumulate and eventually you run out of PIDs.

Fixes in order of preference:

  1. Have the parent call waitpid(-1, ..., WNOHANG) in a loop on SIGCHLD.
  2. Install the SIGCHLD disposition with the SA_NOCLDWAIT flag (or set it to SIG_IGN). The kernel will auto-reap with no zombie.
  3. Double-fork. The grandchild is re-parented to init (PID 1), which reaps it.

struct sigaction sa = { .sa_handler = SIG_DFL, .sa_flags = SA_NOCLDWAIT };
sigaction(SIGCHLD, &sa, NULL);
// Any children that exit are auto-reaped by the kernel.
An orphan is a live process whose parent has died. The kernel re-parents it to PID 1 (or a subreaper, if one was set via PR_SET_CHILD_SUBREAPER). Orphans are normal and harmless. They only become a problem if they were holding a process group and now run unsupervised.

Follow-up: "Why do containers need an init process?" Because in a PID namespace, whatever runs as PID 1 becomes the reaper of last resort for every orphan. Running a single binary like python app.py as PID 1 means signals get special semantics (SIGTERM is ignored by default unless a handler is installed) and orphaned zombies are never reaped. tini or Docker's --init flag inserts a small init that reaps children and forwards signals.

10. Common Mistakes Candidates Make

These show up repeatedly in phantomcode.co mock interviews.

  • Saying processes "are slower" than threads without quantifying it. Fork is cheap on modern Linux. The real costs are IPC and duplicated working set.
  • Confusing physical memory with RSS. RSS is the resident portion of virtual memory; a mapped-but-not-touched region counts as zero RSS.
  • Believing volatile is sufficient for multithreaded synchronization. It is not; you need atomics or locks. volatile is useful for memory-mapped hardware and for signal handlers.
  • Forgetting that signal (the old API) has undefined portability for handler reinstallation and SA_RESTART. Use sigaction.
  • Explaining the page table walk without mentioning the TLB. Interviewers will stop you and ask where the TLB fits.
  • Saying "thread context switches are free within the same process." They are cheaper (no CR3 reload, no TLB flush in most cases) but not free. Cache effects still apply.
  • Describing CFS as round-robin. It is not; it is weighted fair queuing over vruntime.
  • Calling printf from a signal handler in a code sample. Instant red flag.

11. FAQ

How deep do OS questions go at L5 and above? At senior level you should be able to reason about lock contention under the scheduler, cache coherence (MESI), and trade-offs between copy-on-write and pre-zeroed allocation. At staff level, expect questions on NUMA placement, kernel bypass (DPDK, io_uring, SPDK), and the interaction between cgroups v2 and the scheduler.

Do I need to know x86 specifics? Knowing enough to discuss page table levels, CR3, and the TLB is useful. You do not need to memorize opcodes. ARM specifics (e.g., ASID-tagged TLBs) come up only at hardware-adjacent teams.

How should I practice? Run strace, perf, ftrace, and bpftrace against real programs. Watch what syscalls Redis, Postgres, or nginx make under load. Read a small Linux subsystem end to end: the signal code in kernel/signal.c is a great one-week project.

What resources are worth the time? Operating Systems: Three Easy Pieces (free), Robert Love's Linux Kernel Development, and the kernel's own Documentation tree. For practical drills, work through Julia Evans' zines and recreate her experiments.

Will interviewers ask me to write a scheduler? Rarely on a whiteboard. Occasionally on a take-home. More commonly they will show you a scheduler skeleton and ask you to reason about starvation, priority inversion, or fairness.

12. Conclusion

Strong OS answers are grounded in specifics: the actual struct, the actual syscall, the actual cost in nanoseconds. Memorizing definitions will get you past a phone screen but will collapse under follow-ups. The fastest way to internalize this material is to instrument a real program, break it, and watch the kernel's reaction in strace, perf, and dmesg.

If you can explain what happens when your program calls read with the same confidence you explain what happens when it calls a function, you will be ahead of the vast majority of candidates. For structured practice with live feedback on exactly these topics, the mock interview platform at phantomcode.co drills OS fundamentals with realistic follow-ups.
