Google I/O

WebAssembly and WebGPU enhancements for faster Web AI

Running AI inference directly on client machines reduces latency, improves privacy by keeping all data on the client, and lowers server costs. To accelerate these workloads, WebAssembly and WebGPU are evolving to incorporate new low-level primitives. Learn how these additions unlock fast hardware capabilities that significantly speed up AI inference and let highly tuned inference libraries and frameworks run large AI models efficiently.

Intermediate
Technical session