Fine-tuning LLMs
Kinetic integrates with Keras Hub and the Kaggle ecosystem, so you can fine-tune large language models such as Gemma on cloud TPUs.
Capturing Credentials
When fine-tuning models from Keras Hub or Kaggle, you often need to provide Kaggle credentials (KAGGLE_USERNAME and KAGGLE_KEY). Use the capture_env_vars parameter to securely forward your local environment variables to the remote worker. In the example below, the patterns KAGGLE_* and GOOGLE_CLOUD_* select which variables are forwarded.
import kinetic

@kinetic.run(
    accelerator="tpu-v5litepod-1",
    capture_env_vars=["KAGGLE_*", "GOOGLE_CLOUD_*"]
)
def train_gemma():
    import keras_hub

    # Credentials are automatically available in the remote environment
    gemma_lm = keras_hub.models.Gemma3CausalLM.from_preset("gemma3_1b")
    # ...
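The pattern matching used by capture_env_vars can be illustrated with a small standalone sketch. This is not Kinetic's implementation, which is not shown in this guide; it assumes shell-style glob matching via Python's standard fnmatch module, and the helper name select_env_vars and the sample environment are hypothetical.

```python
import fnmatch
import os


def select_env_vars(patterns, environ=None):
    """Return the variables whose names match any of the glob patterns.

    Hypothetical helper for illustration only; Kinetic's real matching
    rules may differ.
    """
    environ = os.environ if environ is None else environ
    return {
        name: value
        for name, value in environ.items()
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)
    }


# Made-up local environment, used only to demonstrate the matching.
local_env = {
    "KAGGLE_USERNAME": "alice",
    "KAGGLE_KEY": "example-key",
    "GOOGLE_CLOUD_PROJECT": "demo-project",
    "HOME": "/home/alice",
}

forwarded = select_env_vars(["KAGGLE_*", "GOOGLE_CLOUD_*"], local_env)
print(sorted(forwarded))
# ['GOOGLE_CLOUD_PROJECT', 'KAGGLE_KEY', 'KAGGLE_USERNAME']
```

Under this assumption, unrelated variables such as HOME stay on the local machine, which is the point of listing narrow patterns rather than forwarding the whole environment.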
Low-Rank Adaptation (LoRA)
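Full fine-tuning of a large model updates every weight, which requires substantial accelerator memory for gradients and optimizer state. LoRA freezes the original weights and trains small low-rank matrices instead, which greatly reduces the number of trainable parameters and enables fine-tuning on smaller accelerator slices.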
@kinetic.run(accelerator="tpu-v5litepod-8")
def train_lora():
    import keras_hub

    gemma_lm = keras_hub.models.GemmaCausalLM.from_preset("gemma_2b_en")

    # Enable LoRA (rank=4)
    print("Enabling LoRA...")
    gemma_lm.backbone.enable_lora(rank=4)

    # Placeholder data for illustration; replace with your own training text
    train_data = ["The quick brown fox jumps over the lazy dog."]

    # Train as usual
    gemma_lm.fit(train_data, epochs=3)
    return "Training complete!"
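To see why a low rank shrinks the trainable parameter count, the following sketch compares a single dense weight matrix with its two LoRA factors. The layer dimensions are illustrative assumptions, not Gemma's actual shapes, and the sketch does not describe which layers enable_lora adapts.

```python
def full_params(d_in, d_out):
    """Trainable parameters when the whole d_in x d_out weight is updated."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Trainable parameters in the two LoRA factors.

    A has shape (d_in, rank) and B has shape (rank, d_out); the frozen
    base weight contributes no trainable parameters.
    """
    return rank * (d_in + d_out)


# Illustrative sizes only (assumed, not taken from any Gemma preset).
d_in, d_out, rank = 2048, 2048, 4

full = full_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)
print(full, lora, f"{100 * lora / full:.2f}%")
# 4194304 16384 0.39%
```

For this hypothetical layer, rank 4 trains under half a percent of the parameters that full fine-tuning would, and the saving grows as the layer gets wider.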
Distributed Fine-tuning
For larger models or datasets, use the Pathways backend to distribute training across multiple TPU hosts.
@kinetic.run(
    accelerator="tpu-v6e-8",
    backend="pathways"
)
def train_distributed():
    import keras
    import jax

    # Multi-host TPU environment is auto-initialized
    # ...
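When training is spread across hosts with data parallelism, the global batch is divided among all devices. The sketch below shows that arithmetic in plain Python. The helper name and the topology numbers are hypothetical; in a real JAX program the counts would typically come from jax.process_count() and jax.local_device_count(), and this guide does not state how Kinetic or Pathways shards data.

```python
def per_device_batch_size(global_batch_size, num_hosts, devices_per_host):
    """Split a global batch evenly across every device on every host.

    Hypothetical helper for illustration; raises if the batch cannot be
    divided evenly, since uneven shards usually need special handling.
    """
    total_devices = num_hosts * devices_per_host
    if global_batch_size % total_devices != 0:
        raise ValueError(
            f"global batch {global_batch_size} is not divisible by "
            f"{total_devices} devices"
        )
    return global_batch_size // total_devices


# Assumed topology for illustration: 2 hosts with 4 devices each.
print(per_device_batch_size(256, num_hosts=2, devices_per_host=4))
# 32
```

Choosing a global batch size that is a multiple of the total device count avoids the error path and keeps every device equally loaded.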
See the Distributed Training guide for more details on scaling your workloads.