MegaCpp Engineering
Engineering Team
MegaCpp Engineering is the public engineering and editorial team behind the site's implementation notes, benchmarking receipts, and rollout write-ups.
- Primary byline: MegaCpp Engineering
- Canonical URL: /people/megacpp-engineering/
- Also credited as: Engineering Team, Engineering
- Identity links: 0
Articles attributed to MegaCpp Engineering
These links are derived from the local blog index and use the same author-name mapping as MegaCpp bylines and article schema.
Context Parallel and Sequence Parallel: Similar Names, Different Jobs
An explanation of SP versus CP using TP-aware helpers, long-context bring-up patterns, and hybrid model design.
Distributed Optimizer Stress: Drift, All-Gather vs Reduce-Scatter, and Muon Gotchas
EP, PP, TP, CP, SP, DP: The Parallelism Map We Actually Use
What data, tensor, sequence, context, pipeline, and expert parallelism each own, how they compose, and where the real integration risks still live.
Expert Parallel and MoE Sharding: Capacity Is Cheap, Routing Is Not
A grounded walkthrough of expert parallelism in the MegaCpp stack, based on the recipe files, layer definitions, schedule plans, and bug reports that shape how MoE runs actually behave.
FSDP2 pain and payoff: what actually reduced memory
A practical look at selective wrapping, reshard timing, mixed precision, and the interaction between sharding, pipeline boundaries, and heterogeneous model blocks.
H200 Bringup and Naming: What Had to Be Made Explicit
A code- and doc-grounded look at H200 bringup, why naming mattered, how a flagship hybrid recipe was encoded across launch surfaces, and which infrastructure assumptions had to be turned into explicit contracts.