Profile · Person · Canonical byline target

David Gornshtein

Founder

David Gornshtein leads the technical direction behind MegaCpp and the public engineering narrative around SLM Ensemble.

C++ systems

Model infrastructure

Technical writing

Product direction

Primary byline: David Gornshtein
Canonical URL: /people/david-gornshtein/
Identity links: 2

GitHub LinkedIn Email

This page is the canonical author identity target used by MegaCpp article bylines and `BlogPosting.author.url` references.
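As a rough illustration of what such a reference could look like, the sketch below builds schema.org `BlogPosting` JSON-LD whose `author.url` points at this page. The site base URL and the helper name `blog_posting_jsonld` are hypothetical, chosen only for the example; the author name, canonical path, article title, and date come from this page. It is not MegaCpp's actual generator.

```python
import json
from urllib.parse import urljoin

# Hypothetical site origin; the real domain is not stated on this page.
SITE_BASE = "https://example.com"

# Canonical author path and byline, as listed on this profile page.
AUTHOR_NAME = "David Gornshtein"
AUTHOR_PATH = "/people/david-gornshtein/"


def blog_posting_jsonld(headline: str, date_published: str) -> dict:
    """Return a schema.org BlogPosting dict whose author.url is the canonical profile URL."""
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": headline,
        "datePublished": date_published,
        "author": {
            "@type": "Person",
            "name": AUTHOR_NAME,
            # Every article resolves its author to the same canonical URL.
            "url": urljoin(SITE_BASE, AUTHOR_PATH),
        },
    }


example = blog_posting_jsonld("Torch/XLA 2.11 expectations vs TPU reality", "2026-04-19")
print(json.dumps(example, indent=2))
```

Keeping the author path in one constant means bylines and article schema cannot drift apart: both read the same value.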

Related discovery

Articles attributed to David Gornshtein

These links are derived from the local blog index and use the same author-name mapping as MegaCpp bylines and article schema.

Published April 19, 2026

Torch/XLA 2.11 expectations vs TPU reality

What MegaCpp expected from the Torch/XLA 2.11 line on TPU, what the shipped stack actually looked like in practice, and how that changed our bringup strategy.

Published April 19, 2026

vLLM on GB10: the overlay, the registration fixes, and the paths we kept off

How MegaCpp stabilized a GB10-oriented vLLM lane with an on-disk overlay, text-only model registration, and a deliberate keep-disabled list for serving paths that were not yet honest.

Published April 18, 2026

Attention Validity and Structure-Aware Attention

A packed-row validity regression, the clustered-sparse follow-up it forced, and the structure-aware attention plan we are integrating into the MegaCpp training stack.

Published April 18, 2026

Dynamo and torch.compile Breakage on a Mamba-3 Hybrid

Graph breaks, recompile storms, guard explosions, and cache-hygiene rules we landed while keeping torch.compile useful on MegaCpp's hybrid Mamba-3 + Transformer stack.

Published April 18, 2026

External library glitches we fixed

A catalog of upstream bugs we hit while training our hybrid Mamba-3 plus DSA recipe, grouped by library: what broke, what we patched locally, and what we prepared upstream.

Published April 18, 2026

Flash Attention 4 in practice: what we shipped and what we cut

Our hybrid stack's applicability matrix for Flash Attention 4, the validation profiles, the dense-full rollout gates, and the regressions that killed the first FA4 variants before they reached deployment.
