Entity Hub
TPU Sparse Attention and Pallas Kernels
A curated TPU sparse-attention reading path: block-sparse contracts, Pallas kernel choices, SPMD sharding, and the runtime surfaces that keep long-context TPU work stable.
This hub is narrower than the general TPU/XLA archive. Start with the block-sparse and Pallas kernel notes, then move on to the sharding and planner surfaces, and finish with the operational pieces that keep TPU sparse-attention work observable and stable.
sparse-attention
pallas
flash-attention
softcap
doc-masking
block-sparse
Curated set: 9 articles in reading order
Best if you care specifically about sparse attention, Pallas, and long-context TPU kernel work rather than the TPU stack as a whole.