Learning Roadmap

This guide helps you navigate the architecture reports in this repository based on your experience level and learning goals. Each project link leads to a detailed architecture analysis covering design decisions, data flow, and key takeaways.

How to Use This Roadmap

Start with the difficulty level that matches your background. Within each level, pick a project in a domain you care about. Read the architecture report, paying special attention to the “Key Design Decisions” and “Key Takeaways” sections. Then move on to the next level or explore a themed learning path.


Beginner

Projects with focused scope, clean codebases, and straightforward architectures. Great for learning fundamental patterns like event loops, plugin systems, and middleware chains.

| Project | Why Start Here |
| --- | --- |
| fzf | Compact Go codebase demonstrating event-driven concurrency and pipeline architecture in a single-purpose CLI tool |
| starship | Clean modular Rust design with parallel execution, lazy evaluation, and a well-defined plugin interface |
| gin | Minimal web framework showing radix tree routing, middleware chains, and object pooling in under 10K lines of Go |
| hugo | Static site generator with clear separation of content processing, templating, and asset pipelines |
| caddy | Modular web server with an elegant module registry, lifecycle hooks, and automatic HTTPS by default |
| coredns | DNS server built on a plugin chain architecture forked from Caddy, showing how to adapt a framework for a new domain |
| fastapi | Python web framework demonstrating dependency injection, type-driven validation, and async request handling |
| esbuild | JavaScript bundler written in Go with a focus on raw performance through parallelism and minimal allocations |
| ruff | Python linter in Rust showing how to apply compiler techniques (parsing, AST traversal) to developer tooling |
| vite | Frontend build tool demonstrating native ES module serving and on-demand compilation for fast development cycles |
| excalidraw | Hand-drawn style diagramming tool with a simple React canvas architecture and collaborative editing |
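
Several of the projects listed above are built around a middleware chain. The following is a minimal sketch of that pattern in Python, not the API of gin, caddy, or any other project in the list. All names (`chain`, `logging_mw`, `auth_mw`, `hello`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

# A handler takes a request dict and returns a response dict.
Handler = Callable[[Dict], Dict]
# A middleware wraps one handler and returns a new handler.
Middleware = Callable[[Handler], Handler]


def chain(handler: Handler, middlewares: List[Middleware]) -> Handler:
    """Wrap handler so that the first middleware in the list runs outermost."""
    for mw in reversed(middlewares):
        handler = mw(handler)
    return handler


def logging_mw(next_handler: Handler) -> Handler:
    def wrapped(request: Dict) -> Dict:
        # Record work done before and after the rest of the chain.
        request.setdefault("trace", []).append("log:before")
        response = next_handler(request)
        request["trace"].append("log:after")
        return response
    return wrapped


def auth_mw(next_handler: Handler) -> Handler:
    def wrapped(request: Dict) -> Dict:
        request.setdefault("trace", []).append("auth")
        if request.get("user") is None:
            # Short-circuit: the final handler is never called.
            return {"status": 401}
        return next_handler(request)
    return wrapped


def hello(request: Dict) -> Dict:
    return {"status": 200, "body": f"hello {request['user']}"}


# Hypothetical application: logging runs first, then auth, then the handler.
app = chain(hello, [logging_mw, auth_mw])
```

Calling `app({"user": "ada"})` passes through both middlewares before reaching `hello`, while a request without a user is rejected by `auth_mw` before the handler runs. When reading the reports, it is worth checking how each project expresses the same idea, since real frameworks often use an explicit context object rather than nested closures.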

Intermediate

Projects with multiple interacting subsystems, distributed components, or sophisticated internal data structures. Good for learning about storage engines, query languages, networking stacks, and orchestration patterns.

| Project | What You Will Learn |
| --- | --- |
| redis | Single-threaded event loop design, in-memory data structures, persistence strategies (RDB/AOF), and replication |
| prometheus | Pull-based monitoring architecture, custom TSDB with LSM-like compaction, PromQL query engine, and service discovery |
| envoy | L4/L7 proxy with filter chain architecture, xDS dynamic configuration, and hot-restart for zero-downtime upgrades |
| react | Fiber-based reconciliation engine, concurrent rendering with priority scheduling, and the virtual DOM diffing algorithm |
| containerd | Container runtime with gRPC API, shim-based process isolation, snapshot-based storage, and CRI plugin integration |
| etcd | Distributed key-value store using Raft consensus, MVCC storage, and watch notification streams |
| grafana | Observability platform with plugin architecture, data source abstraction, and dashboard rendering pipeline |
| traefik | Cloud-native reverse proxy with automatic service discovery, dynamic configuration, and middleware chains |
| helm | Kubernetes package manager demonstrating Go templating, release lifecycle management, and chart dependency resolution |
| clickhouse | Column-oriented OLAP database with vectorized query execution, MergeTree storage engine, and distributed query planning |
| alacritty | GPU-accelerated terminal emulator with VTE parser, PTY integration, and a modular rendering pipeline |
| backstage | Developer portal with plugin-based architecture, software catalog model, and frontend/backend separation |
| bevy | ECS-based game engine with parallel system scheduling, modular plugin architecture, and Rust compile-time guarantees |
| buf | Protobuf toolchain with multi-layer architecture spanning CLI, compiler, linter, and schema registry integration |
| dapr | Microservice sidecar runtime with building block abstractions, pluggable components, and control plane integration |
| duckdb | In-process analytical database with vectorized execution engine, cost-based query optimizer, and columnar storage |
| falco | Runtime security tool using kernel-level event capture via eBPF, rule engine, and event enrichment pipeline |
| gitea | Self-hosted Git service with layered architecture, multi-database support, and Gitea Actions integration |
| harbor | Container registry with microservice architecture, replication engine, scanner integration, and policy management |
| helix | Modal text editor with Tree-sitter integration, LSP client, and ropey-based rope data structure for text manipulation |
| leptos | Fine-grained reactive Rust web framework with isomorphic server functions and SSR/CSR support |
| nats-server | High-performance messaging system with JetStream persistence, Raft consensus, and multi-cluster support |
| nestjs | Enterprise Node.js framework with dependency injection container, modular architecture, and decorator-based metadata |
| nushell | Structured data shell with custom type system, pipeline engine, and plugin architecture for object-based pipelines |
| remix | Web standards-based framework with composable package architecture and fetch-based routing |
| rook | Kubernetes operator for Ceph storage with custom controllers for monitors, OSDs, and CSI driver integration |
| solid | Fine-grained reactive framework with compile-time JSX transformation and direct DOM manipulation without virtual DOM |
| spring-boot | Auto-configuration framework with conditional bean registration, embedded server support, and plugin-based build tooling |
| supabase | Postgres platform integrating multiple open-source services via Kong gateway with unified auth and storage layers |
| tauri | Cross-platform desktop app framework with strict IPC separation, runtime abstraction, and ACL security layer |
| trivy | Security scanner with modular artifact analysis, multiple scanner types, and unified detection pipeline |
| turborepo | Monorepo build system with task graph execution, remote caching, and lockfile-based dependency analysis |
| typst | Incremental typesetting system with four-phase compilation pipeline and comemo memoization framework |
| wezterm | GPU-accelerated terminal emulator with HarfBuzz font shaping, multiplexing layer, and Lua configuration |
| zellij | Terminal multiplexer with multi-threaded architecture, WebAssembly plugin system, and Unix socket IPC |
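
MVCC storage appears in several entries across this roadmap (etcd here, and CockroachDB, RocksDB, and TiKV at the Advanced level). As a rough illustration of the core idea only, the sketch below keeps every written version of a key and answers reads as of a chosen revision. The class `MVCCStore` and its methods are hypothetical names. Deletes, compaction of old revisions, and transactions are deliberately left out, so this should not be read as how any of these systems implements MVCC.

```python
from typing import Dict, List, Optional, Tuple


class MVCCStore:
    """Toy multi-version store: writes never overwrite, they append a version."""

    def __init__(self) -> None:
        self.revision = 0  # global, monotonically increasing write counter
        # key -> list of (revision, value) in ascending revision order
        self._history: Dict[str, List[Tuple[int, str]]] = {}

    def put(self, key: str, value: str) -> int:
        """Store a new version of key and return the revision it was written at."""
        self.revision += 1
        self._history.setdefault(key, []).append((self.revision, value))
        return self.revision

    def get(self, key: str, at_revision: Optional[int] = None) -> Optional[str]:
        """Return the newest value of key written at or before at_revision."""
        limit = self.revision if at_revision is None else at_revision
        result = None
        for rev, value in self._history.get(key, []):
            if rev > limit:
                break
            result = value
        return result


store = MVCCStore()
store.put("config", "v1")  # written at revision 1
store.put("other", "x")    # written at revision 2
store.put("config", "v2")  # written at revision 3
```

A reader that asks for `store.get("config", at_revision=2)` still sees `"v1"` even though a newer version exists, which is the property that lets readers work from a consistent snapshot without blocking writers.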

Advanced

Large-scale systems, most of them distributed, with complex consensus protocols, multi-layer architectures, or sophisticated failure handling. For developers ready to study production-grade systems design.

| Project | Core Complexity |
| --- | --- |
| kubernetes | Declarative desired-state reconciliation, controller pattern, custom resource extensions, and multi-component distributed architecture |
| kafka | Distributed commit log with partition-based parallelism, ISR replication, exactly-once semantics, and the KRaft consensus layer |
| cockroachdb | Distributed SQL with Raft-based replication, MVCC transactions, range-based sharding, and geo-partitioning |
| spark | DAG-based execution engine with Catalyst query optimizer, Tungsten memory management, and adaptive query execution |
| elasticsearch | Distributed search with Lucene-based inverted indices, shard allocation, near-real-time indexing, and a rich query DSL |
| tidb | MySQL-compatible distributed database with TiKV storage layer, Raft groups, and cost-based query optimization |
| vitess | MySQL horizontal scaling through vtgate query routing, vttablet connection pooling, and topology-aware sharding |
| istio | Service mesh control plane with Envoy data plane integration, mTLS certificate management, and traffic policy enforcement |
| ray | Distributed computing framework with task and actor abstractions, GCS-based fault tolerance, and object store for data sharing |
| temporal | Durable execution platform with event-sourced workflow history, deterministic replay, and multi-cluster replication |
| airflow | Distributed workflow orchestration with DAG scheduling, pluggable executors, and multi-component architecture |
| angular | Full-featured frontend framework with Ivy rendering engine, hierarchical DI, Signals reactivity, and AOT/JIT compilation |
| bun | Unified JavaScript runtime, bundler, transpiler, test runner, and package manager implemented in Zig with JavaScriptCore |
| consul | Distributed service networking with Raft consensus, Serf gossip protocol, and service mesh capabilities |
| cortex | Horizontally scalable Prometheus long-term storage with multi-tenancy, distributed components, and erasure coding |
| deno | Secure JavaScript/TypeScript runtime in Rust with V8, TypeScript integration, and Node.js compatibility layer |
| dgraph | Distributed GraphQL database with Raft consensus, predicate-based sharding, and BadgerDB storage engine |
| go-ethereum | Ethereum protocol implementation with EVM, Merkle Patricia Trie state management, and multiple sync modes |
| influxdb | Time-series database with FDAP stack, WAL, Parquet persistence, and DataFusion query engine integration |
| keycloak | IAM solution with multi-protocol support (OIDC/SAML/OAuth2), SPI-driven architecture, and Infinispan distributed caching |
| minio | S3-compatible object storage with erasure coding, distributed architecture, and multi-pool management |
| neovim | Extensible editor with libuv event loop, MessagePack-RPC API, embedded Lua runtime, and built-in LSP/Tree-sitter |
| nomad | Workload orchestrator with Raft consensus, multiple scheduler types, pluggable task drivers, and multi-region federation |
| podman | Daemonless container management with multi-process architecture coordinating conmon, OCI runtimes, and network backends |
| polars | Query engine with lazy evaluation, query optimizer, streaming execution, and Apache Arrow columnar processing |
| pulsar | Distributed pub-sub with compute-storage separation, BookKeeper persistence, namespace multi-tenancy, and geo-replication |
| qdrant | Distributed vector database with HNSW indexing, multi-quantization, Raft consensus, and shard management |
| quarkus | Build-time optimized Java framework with augmentation pipeline, Arc CDI, and bytecode recording system |
| rocksdb | LSM-tree storage engine with complex compaction strategies, MVCC, multi-threaded background operations, and pluggable memtables |
| rust-analyzer | Incremental compiler infrastructure with Salsa query system, HIR layers, and IDE feature implementations |
| swc | Multi-phase JavaScript/TypeScript compiler with lexer/parser/transform/codegen pipeline and WebAssembly plugin system |
| terraform | Infrastructure-as-code with DAG execution engine, gRPC provider plugin system, and state management with locking |
| tikv | Distributed transactional KV store with Multi-Raft consensus, Percolator 2PC, MVCC storage, and coprocessor push-down |
| tokio | Async runtime with work-stealing scheduler, reactor pattern I/O driver, hierarchical timer wheel, and task orchestration |
| vault | Secrets management with encryption barrier, pluggable auth/secrets engines, expiration manager, and Raft storage |
| wasmtime | WebAssembly runtime with Cranelift JIT compiler, guard-page memory management, WASI implementation, and component model |
| weaviate | Vector database with HNSW indexing, Raft consensus for schema, inverted indexes, and integrated ML model support |
| zed | GPU-accelerated editor with custom GPUI framework, CRDT-based collaboration, and display-map rendering pipeline |
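
DAG-based execution recurs in the list above (spark, airflow, terraform). The sketch below shows the general technique with Python's standard-library `graphlib` (Python 3.9 or later): tasks are released in waves, and a task becomes ready only once all of its dependencies are done. The task names and the helper `run_in_waves` are hypothetical; real engines add retries, parallel workers, and persistence on top of this ordering step.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical task graph: each key depends on the tasks in its set.
deps: Dict[str, Set[str]] = {
    "network": set(),
    "database": {"network"},
    "cache": {"network"},
    "app": {"database", "cache"},
    "dns": {"app"},
}


def run_in_waves(graph: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; tasks within one wave could run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # marking tasks done unlocks their dependents
    return waves


waves = run_in_waves(deps)
# waves == [["network"], ["cache", "database"], ["app"], ["dns"]]
```

Here `cache` and `database` land in the same wave because neither depends on the other, which is exactly the parallelism a DAG scheduler exploits.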

Learning Paths

Themed sequences of projects that build on each other. Each path follows a progression from simpler foundational components to more complex systems that depend on or extend them.

Container Ecosystem

containerd -> podman -> moby -> compose -> kubernetes -> helm

Follow the container stack from runtime to orchestration. Start with containerd to understand how containers actually run (shims, snapshots, CRI). Compare with Podman for the daemonless, rootless-first alternative using conmon. Move to moby to see how Docker wraps containerd with image building and networking. Then compose shows multi-container application definition. Kubernetes introduces distributed orchestration on top of container runtimes. Finally, helm adds package management for Kubernetes applications.

Observability Stack

prometheus -> grafana -> loki -> opentelemetry-collector -> thanos

Build understanding of modern observability layer by layer. Prometheus teaches metrics collection, TSDB internals, and PromQL. Grafana shows how to build a visualization platform with pluggable data sources. Loki applies Prometheus-like label indexing to log aggregation. OpenTelemetry Collector demonstrates a vendor-neutral telemetry pipeline with receivers, processors, and exporters. Thanos extends Prometheus with long-term storage, global querying, and downsampling.
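
The receiver, processor, exporter shape described for the OpenTelemetry Collector can be sketched in a few lines. This is an illustration of the pipeline pattern only, under the assumption that records are plain dictionaries; the function names (`run_pipeline`, `drop_debug`, `add_env`) are hypothetical and none of this is the Collector's actual Go API or configuration format.

```python
from typing import Callable, Dict, Iterable, List, Optional

Record = Dict[str, object]
# A processor returns a (possibly modified) record, or None to drop it.
Processor = Callable[[Record], Optional[Record]]
Exporter = Callable[[Record], None]


def run_pipeline(
    received: Iterable[Record],
    processors: List[Processor],
    exporters: List[Exporter],
) -> int:
    """Push each received record through all processors, then to every exporter."""
    exported = 0
    for record in received:
        current: Optional[Record] = record
        for process in processors:
            current = process(current)
            if current is None:
                break  # a processor dropped the record
        if current is None:
            continue
        for export in exporters:
            export(current)
        exported += 1
    return exported


def drop_debug(record: Record) -> Optional[Record]:
    return None if record.get("level") == "debug" else record


def add_env(record: Record) -> Record:
    return {**record, "env": "staging"}  # copy, so the input is not mutated


sink: List[Record] = []  # stands in for a real backend
count = run_pipeline(
    [{"level": "info", "msg": "started"}, {"level": "debug", "msg": "tick"}],
    [drop_debug, add_env],
    [sink.append],
)
```

After the run, `sink` holds one enriched record and the debug record has been filtered out. Keeping the three stages as separate lists is what makes the pipeline vendor-neutral: swapping a backend means swapping an exporter, not rewriting the processors.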

Data Infrastructure

redis -> kafka -> spark -> flink -> arrow

Progress from single-node data storage to distributed streaming and analytics. Redis covers in-memory data structures and event-driven I/O. Kafka introduces distributed commit logs and partition-based parallelism. Spark shows batch and micro-batch processing with DAG execution and query optimization. Flink adds true stream processing with exactly-once guarantees and watermark-based event time handling. Arrow provides a columnar memory format that enables zero-copy data exchange between analytical engines such as Spark.
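
Partition-based parallelism in a commit log comes down to two rules: a record's key decides its partition, and each partition is an append-only sequence with its own offsets. The sketch below illustrates both. CRC32 is a stand-in chosen because it ships with Python's standard library; it is not the hash Kafka's clients use, and the function names are hypothetical.

```python
import zlib
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Log = DefaultDict[int, List[Tuple[str, str]]]


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition deterministically (CRC32 is an assumption here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def append(log: Log, key: str, value: str, num_partitions: int) -> Tuple[int, int]:
    """Append a record and return (partition, offset within that partition)."""
    partition = partition_for(key, num_partitions)
    log[partition].append((key, value))
    return partition, len(log[partition]) - 1


# Hypothetical usage: all events for one user stay in order on one partition.
log: Log = defaultdict(list)
first = append(log, "user-1", "login", num_partitions=4)
second = append(log, "user-1", "logout", num_partitions=4)
```

Because both records share a key, they land on the same partition with consecutive offsets, so a single consumer of that partition sees them in order. Records with different keys may land elsewhere and can be consumed in parallel, which is the trade the design makes: ordering per key, not across the whole topic.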

Web Development

react -> nextjs -> vite -> svelte

Explore different approaches to building web applications. React introduces the component model, virtual DOM, and concurrent rendering. Next.js builds on React with server-side rendering, routing, and the App Router architecture. Vite takes a different angle by focusing on build tooling with native ES modules and on-demand compilation. Svelte challenges the virtual DOM approach entirely with compile-time reactivity and minimal runtime overhead.

Cloud-Native Networking

coredns -> envoy -> istio -> cilium -> linkerd2 -> consul

Trace the networking stack from DNS to service mesh. CoreDNS handles service discovery via plugin-chained DNS resolution. Envoy provides the programmable L4/L7 proxy with filter chains and xDS configuration. Istio builds a control plane on top of Envoy for traffic management, security, and observability. Cilium uses eBPF for kernel-level networking, security, and observability without sidecar proxies. Linkerd2 offers a lightweight alternative service mesh focused on simplicity and Rust-based micro-proxies. Consul combines service discovery, health checking, and service mesh capabilities with Raft consensus and the Serf gossip protocol.

Infrastructure as Code

terraform -> opentofu -> crossplane -> dagger -> flux2 -> argo-cd

Learn how modern tools manage infrastructure declaratively. Terraform introduces the foundational plan/apply workflow with DAG execution and gRPC provider plugins. OpenTofu is the community fork adding state encryption while maintaining compatibility. Crossplane brings infrastructure management into Kubernetes via custom resources and composition. Dagger applies containerized pipelines to CI/CD with a programmable API. Flux2 implements GitOps for Kubernetes with source tracking, Kustomize, and Helm integration. Argo CD provides a full GitOps continuous delivery platform with application sync, health monitoring, and rollback.
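
The plan/apply workflow can be reduced to a diff between recorded state and desired configuration, followed by executing that diff. The sketch below assumes resources are plain dictionaries keyed by name; `plan` and `apply` are hypothetical helper names, and dependency ordering, provider calls, and state locking are all omitted. It illustrates the idea rather than how Terraform or OpenTofu compute plans.

```python
from typing import Dict, List, Tuple

State = Dict[str, dict]
Action = Tuple[str, str]  # (verb, resource name)


def plan(current: State, desired: State) -> List[Action]:
    """Compute the actions needed to move current state to desired state."""
    actions: List[Action] = []
    for name in sorted(desired.keys() - current.keys()):
        actions.append(("create", name))
    for name in sorted(desired.keys() & current.keys()):
        if current[name] != desired[name]:
            actions.append(("update", name))
    for name in sorted(current.keys() - desired.keys()):
        actions.append(("delete", name))
    return actions


def apply(current: State, desired: State, actions: List[Action]) -> State:
    """Execute a plan against a copy of current and return the new state."""
    new_state = dict(current)
    for verb, name in actions:
        if verb == "delete":
            new_state.pop(name)
        else:  # create and update both write the desired definition
            new_state[name] = desired[name]
    return new_state


# Hypothetical resources for illustration.
current = {"vm": {"size": "small"}, "old_bucket": {}}
desired = {"vm": {"size": "large"}, "dns": {"ttl": 60}}

actions = plan(current, desired)
new_state = apply(current, desired, actions)
```

Separating the two steps is the point of the workflow: the plan can be reviewed before anything changes, and planning again after a successful apply yields an empty list.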

Security and Secrets

trivy -> falco -> vault -> keycloak

Learn how security is implemented at different layers. Trivy provides vulnerability scanning with modular analyzers for containers, filesystems, and code. Falco monitors runtime security using kernel-level event capture via eBPF and a rule engine. Vault manages secrets with an encryption barrier, pluggable auth/secrets engines, and Raft storage. Keycloak handles identity and access management with multi-protocol support and SPI-driven extensibility.

