Kinetic: Run ML workloads on cloud TPUs and GPUs


Run any Python function on a cloud TPU or GPU with one decorator. No infrastructure to wire up, no images to build by hand, no multi-host boilerplate.

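import kinetic

@kinetic.run(accelerator="tpu-v6e-8")
def train_model():
    # Imports inside the function body run where the function runs.
    import keras
    model = keras.Sequential([...])  # layers elided
    # x_train and y_train are placeholders for your training data.
    history = model.fit(x_train, y_train)
    return history.history["loss"][-1]  # loss from the final epoch

final_loss = train_model()  # runs on a TPU v6e-8 slice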

Start here

Three entry points cover what most new users need first:

  • Your first run: Install, point at a cluster, and run a real Keras job in minutes. See Getting Started.

  • Long-running jobs: Switch from blocking run() to detached run_async() for jobs that take hours. See Detached Jobs.

  • Data and checkpoints: Ship local files in and write durable artifacts back out via KINETIC_OUTPUT_DIR. See Data and Checkpointing.
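As a small illustration of the KINETIC_OUTPUT_DIR convention mentioned above, the sketch below builds an artifact location from that environment variable. Only the variable name comes from this page. The helper name output_path, the local fallback directory, and the choice to treat the value as an opaque prefix (this page does not say whether it is a local path or an object-store URI) are assumptions made for the example.

```python
import os


def output_path(name: str, default_dir: str = "./kinetic_output") -> str:
    """Join `name` onto the location named by KINETIC_OUTPUT_DIR.

    The value is treated as an opaque prefix, so the same code works
    whether it holds a local directory or an object-store URI.
    """
    # Hypothetical fallback for runs outside a Kinetic job, where the
    # variable is unset or empty.
    base = os.environ.get("KINETIC_OUTPUT_DIR") or default_dir
    return base.rstrip("/") + "/" + name
```

Inside a job function, output_path("model.keras") would then give one place to write a checkpoint, using whatever writer suits the storage behind the prefix.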

How Kinetic works

Every job goes through five short phases:

  1. Discover. Your function, working directory, and Data(...) arguments are captured. requirements.txt or pyproject.toml is read.

  2. Build or fetch. A container image is produced — built with your dependencies (bundled mode) or pulled from a published base (prebuilt mode). See Execution Modes.

  3. Schedule. A Kubernetes resource (a Job for single-host workloads, a LeaderWorkerSet for multi-host TPU jobs on the Pathways backend) is submitted to your GKE cluster. The autoscaler provisions accelerator nodes if needed.

  4. Run. Your function executes inside the pod with KINETIC_OUTPUT_DIR set; logs stream back to your terminal.

  5. Collect. The return value is serialized to GCS and pulled back to your local process. @kinetic.run() cleans up the pod and GCS artifacts as soon as the result is collected. run_async() leaves the pod running until you call .result() or .cleanup() on the returned JobHandle, so remember to call one of them when the job runs on expensive accelerators.
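Because a detached job keeps its pod until .result() or .cleanup() is called, it can help to make sure one of the two always runs. The sketch below is one way to do that, not part of Kinetic itself: result_or_cleanup is a hypothetical helper, and it assumes only that the handle exposes the two methods named above and that both take no arguments.

```python
def result_or_cleanup(handle):
    """Return handle.result(), calling handle.cleanup() if collection fails."""
    try:
        return handle.result()
    except BaseException:
        # Covers ordinary errors and Ctrl-C alike, so a failed or
        # interrupted collection does not leave the pod running.
        handle.cleanup()
        raise
```

Pass it the JobHandle returned by run_async(). On success the value from .result() is returned unchanged. On failure the original exception is re-raised after cleanup.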

Choose your execution mode

Three modes control how dependencies get into the container:

  • Bundled (default) — Kinetic builds a custom image with your deps baked in. Best for stable workflows and reproducible runs.

  • Prebuilt — pulls a published base image, installs your deps at pod startup. Best for fast iteration when deps change often.

  • Custom image — bring your own image URI. Best when you need custom system libraries or a corporate-vetted base.

See Execution Modes for the full recommendation matrix and per-mode startup expectations.
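The three bullets above can be read as a simple decision rule. The sketch below encodes that reading in plain Python. It is illustrative only: suggest_mode is a hypothetical helper, the returned strings are labels rather than Kinetic configuration values, and the order of the checks is an assumption rather than the official recommendation matrix.

```python
def suggest_mode(needs_custom_image: bool, deps_change_often: bool) -> str:
    """Map the three bullets above onto a mode label."""
    if needs_custom_image:
        # Custom system libraries or a corporate-vetted base image.
        return "custom image"
    if deps_change_often:
        # Dependencies are installed at pod startup, so no rebuild per change.
        return "prebuilt"
    # Default: dependencies baked into a built image for reproducible runs.
    return "bundled"
```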