Meet the Eight: Inside the MegaCpp Specialist Ensemble
Profiles of the eight specialist SLMs in the MegaCpp ensemble — what each one is good at, what it was trained on, and what to never ask it.

The scaling hypothesis is dead in our corner of the world. Instead of one 70B generalist that knows a little about everything, we run eight 4B-8B specialists that each know one slice of C++ engineering deeply. Every specialist is a sparse MoE model: 4B-8B total parameters, 0.8B-1.6B active per token (~10% activation ratio), NVFP4 at inference, trained on 100-200B tokens drawn from our curriculum-mapped C++ corpus (v2_simple through v6_enriched with structure-aware parquet metadata — AST node types, call edges, type edges, chunk boundaries). Each specialist is trained through the same four-phase curriculum (4K syntax, 16K file-local, 64K repo graph, structure-aware) but on a domain-skewed data mix. Below: what each one does, what it is not for, and the data that made it.
1. Algo-SLM — Algorithms and Data Structures
Best at: Pseudocode-to-C++ translation, complexity analysis, choosing the right container, implementing classic algorithms (graphs, DP, string algorithms, numerical routines) in idiomatic modern C++.
Training mix: Heavily weighted toward CP-style repositories, absl/algorithm, Boost.Graph, EASTL, numeric libraries (Eigen cores, GSL), LeetCode-shaped reference solutions, and algorithm textbook transcriptions. Phase 1 (v3_simple 4K) dominates early training — tight self-contained functions with clear pre/post conditions. Phase 2 adds v4_context_graph shards that pair algorithms with their callers so the model learns when to reach for a flat_hash_map vs. btree_map.
Not for: Build systems, linker errors, platform-specific syscalls, long-running async pipelines, or anything where the answer depends on repository-level state beyond ~16K tokens. It will happily hand you a beautiful Dijkstra that ignores your existing graph abstraction. Architectural decisions are out of scope (route them to a human), and repo-scale changes go through the Orchestrator.
Size: 7B total / 1.4B active.
2. Template-SLM — Templates and Metaprogramming
Best at: SFINAE, concepts, CRTP, variadic templates, if constexpr ladders, expression templates, tag dispatch, and reading the kind of compiler error that starts with note: candidate template ignored and continues for three screens.
Training mix: Boost (MPL, Hana, Fusion, Mp11), range-v3, fmt, spdlog internals, Eigen expression templates, Abseil type traits, libc++/libstdc++ headers, and every C++20/23 concepts-heavy library we could index. Structure-aware Phase 4 is critical here: the structure_ids column lets the model distinguish class_decl from func_body from typedef, which matters enormously when reasoning about where a requires-clause legally lives.
Not for: Runtime performance tuning, I/O, system calls, or "just make it work" production patches. Template-SLM will refactor your one-line fix into a five-concept constrained template because that is what its world looks like. It is also not a linker — symbol visibility and ODR issues route to Build-SLM.
Size: 8B total / 1.6B active.
3. Memory-SLM — RAII, Allocators, Smart Pointers
Best at: Ownership modeling, lifetime analysis, custom allocators, arena/pool designs, unique_ptr/shared_ptr/weak_ptr tradeoffs, move semantics, and catching dangling references and aliasing bugs before they reach ASan.
Training mix: EASTL allocators, Abseil memory internals, folly Arena/F14, mimalloc, jemalloc, tcmalloc, Chromium base/memory, LLVM BumpPtrAllocator, and kernel slab code. We deliberately over-sample diffs that change ownership (constructor/destructor edits, std::move insertions) using the v4_context_graph packing so the model sees the caller of a moved-from object in the same window as the move itself.
Not for: Algorithmic correctness, template metaprogramming edge cases, or build configuration. It will propose an arena allocator before asking whether the hot path is even allocation-bound. Pair it with Algo-SLM for complexity reasoning and Debug-SLM for actual leak traces.
Size: 7B total / 1.4B active.
4. Concurrency-SLM — Parallelism and Synchronization
Best at: std::atomic memory orderings, lock-free queues, thread pools, coroutines (co_await/co_return), executors, std::jthread/stop tokens, TBB, OpenMP pragmas, and the tricky business of not introducing data races while fixing a data race.
Training mix: folly (MPMCQueue, ProducerConsumerQueue, Futures), TBB, Intel OneAPI, libcds, Seastar, Boost.Asio, Abseil synchronization primitives, and curated ThreadSanitizer reports paired with their fixes. Phase 3 (64K repo graph) matters here because races almost always live across files — the model needs to see the producer and consumer in the same window.
Not for: GPU kernels, SIMD micro-optimization, distributed systems, or algorithmic design. Concurrency-SLM thinks in terms of happens-before; it does not think in terms of cache lines unless you explicitly ask. Distributed coordination, Raft, gossip protocols — those are out of scope; route to a human.
Size: 8B total / 1.6B active.
5. Systems-SLM — Low-Level, OS, Syscalls
Best at: POSIX and Win32 syscalls, epoll/kqueue/io_uring, signal handling, /proc inspection, ELF/Mach-O layout, dynamic linking, page tables conceptually, and kernel-module-adjacent userspace code. Knows why your fork() plus threads just corrupted state.
Training mix: Linux kernel selftests and userspace helpers, liburing, musl, glibc, FreeBSD libc, LLVM libunwind, Chromium sandbox code, gVisor, DPDK, and strace/ltrace/perf output paired with the code that produced it. We bias heavily toward Phase 3 context because syscalls only make sense with their surrounding control flow and error handling.
Not for: Templates, higher-level architecture, or anything where the answer is "use a library". Systems-SLM will reach for mmap when std::vector is fine. It is also not a security reviewer — it knows how syscalls work, not whether your use of them is safe against an adversary.
Size: 8B total / 1.6B active.
6. Build-SLM — CMake, Bazel, Clang Tooling
Best at: CMakeLists.txt authoring, target-based design, FetchContent/find_package, toolchain files, cross-compilation, Bazel BUILD files, rules_cc, Ninja, clang-tidy configuration, sanitizer flags, and decoding undefined reference and multiple definition errors.
Training mix: Large-scale open-source build trees (LLVM, Chromium BUILD.gn, Abseil, gRPC, Envoy, Bazel's own repo), vcpkg/Conan recipes, compile_commands.json examples, and paired before/after diffs of build-system refactors. The preamble and namespace structure categories in v6_enriched give it a strong prior on include and visibility layout.
Not for: Actual C++ logic, algorithms, or runtime behavior. Build-SLM reasons about what compiles and links, not what runs correctly. Do not ask it to design an API. Do not ask it to optimize a hot loop. It will, however, gladly tell you why your template-heavy header just blew out your compile cache.
Size: 6B total / 1.2B active.
7. Debug-SLM — GDB, Sanitizers, Ground-Truth Integration
Best at: Reading stack traces, interpreting ASan/UBSan/TSan/MSan output, writing GDB Python scripts, navigating core dumps, bisecting regressions, and — most importantly — grounding its answers in live debugger state rather than hallucinating a variable's value.
Training mix: Curated GDB/LLDB sessions with annotated transcripts, sanitizer reports paired with their root-cause patches, LLVM compiler-rt internals, rr replay traces, kernel BUG: reports, and Valgrind logs. Debug-SLM is the only specialist with a ground-truth tool channel: at inference time it can query a live GDB/LLDB bridge for actual register, memory, and backtrace values, and its training includes the tool-use trajectories that teach it to ask before guessing.
Not for: Green-field implementation, design, or algorithmic work. Debug-SLM assumes something is already broken. Hand it working code and it will invent a bug to diagnose. Use it strictly as a reactive, evidence-driven specialist.
Size: 7B total / 1.4B active.
8. STL-SLM — Standard Library Fluency
Best at: Picking the right standard container and algorithm, idiomatic <ranges>, <algorithm>, <numeric>, iterator categories, <chrono>, <filesystem>, <format>, <expected>, std::string_view vs. std::string tradeoffs, and knowing which <execution> policy is safe.
Training mix: libc++ and libstdc++ source (both implementation and test suites), MSVC STL where licensing allows, cppreference-derived examples, Abseil's STL-compatible containers, range-v3, and a curated stream of "replace raw loop with algorithm" diffs. Structure-aware training is especially valuable here: the model learns that a func_body consisting of a raw for-loop over v.begin()/v.end() almost always has a <ranges> or <algorithm> rewrite.
Not for: Third-party libraries, custom allocator design, template metaprogramming beyond what the standard requires, or concurrency primitives beyond std::atomic basics. STL-SLM is deliberately narrow. When the answer is "reach for Boost" or "write a custom container", route elsewhere.
Size: 6B total / 1.2B active.
Why eight, and why these eight
Each specialist is cheap enough to keep resident alongside its peers — the full ensemble fits in roughly 32 GB of NVFP4 VRAM, less than a single 70B generalist in FP16. The Orchestrator (documented separately) routes each request to one or more specialists based on the structural signature of the prompt: template-heavy tokens bias toward Template-SLM, stack traces bias toward Debug-SLM, CMakeLists.txt tokens bias toward Build-SLM, and so on. Specialists disagree often, and that disagreement is the signal — when Algo-SLM and Memory-SLM both weigh in on a hot-path container choice, the ensemble's answer is almost always better than either alone. That is the whole bet: narrow models, wide coverage, honest handoffs.