Trainwave API Reference

The Trainwave REST API lets you programmatically control and monitor your machine learning jobs.

Quick Start

    # List your jobs
    curl -H "Accept: application/json" \
         -H "X-API-KEY: your-api-key" \
         https://backend.trainwave.ai/api/v1/jobs/

Base URL

All API requests should be made to: https://backend.trainwave.ai/api/v1/

Authentication

See the Authentication Guide for details on generating and using API keys.

Response Format

Successful responses follow this standard format:

    {
      "success": true,
      "data": {
        // Response data here
      }
    }

Error responses:

    {
      "success": false,
      "error": {
        "code": "error_code",
        "message": "Human-readable error message"
      }
    }
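Both shapes can be handled in one place, so callers never read `data` from a failed response. The following is a minimal sketch of that idea. `TrainwaveAPIError` and `unwrap` are hypothetical names chosen for illustration, not part of any official client, and the fallback code `unknown_error` is likewise an assumption.

```python
class TrainwaveAPIError(Exception):
    """Hypothetical exception carrying the documented error fields."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def unwrap(payload):
    """Return payload["data"] on success; raise TrainwaveAPIError otherwise."""
    if payload.get("success"):
        return payload.get("data")
    error = payload.get("error") or {}
    raise TrainwaveAPIError(
        # Fallback values are assumptions for responses missing these fields.
        error.get("code", "unknown_error"),
        error.get("message", "No error message provided"),
    )
```

A caller would pass the decoded JSON body of any response through `unwrap` and catch `TrainwaveAPIError` where it can act on the error code.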

Available Endpoints

Jobs

List Jobs

GET /api/v1/jobs/

Query parameters:

  • org (string): Organization ID
  • project (string): Project ID
  • status (string): Filter by status (running, completed, failed)
  • limit (integer, default: 20): Number of results per page
  • offset (integer): Pagination offset

Example request:

    curl -H "X-API-KEY: your-api-key" \
         "https://backend.trainwave.ai/api/v1/jobs/?project=p-abc123&status=running"

Example response:

    {
      "success": true,
      "data": {
        "count": 25,
        "next": "https://backend.trainwave.ai/api/v1/jobs/?offset=20",
        "previous": null,
        "results": [
          {
            "id": "j-789xyz",
            "rid": "training-job-1",
            "created_at": "2024-03-21T09:00:00Z",
            "state": "RUNNING",
            "project": "p-abc123",
            "owner": {
              "id": "u-abc123",
              "email": "[email protected]",
              "username": "johndoe"
            },
            "total_cost": 25.5,
            "gpu_hours": 2.5,
            "metrics": {
              "gpu_utilization": 95.2,
              "memory_usage": 14.3
            }
          }
        ]
      }
    }
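Because each page reports a `next` URL, a client can walk the whole result set by following that link until it is null. The sketch below shows one way to do this. `iter_results` and the `fetch_page` callback are hypothetical names, and the code assumes the `next` URL already carries any filters forward, which the example response above does not confirm.

```python
def iter_results(fetch_page, first_url):
    """Yield every item across pages by following the "next" links.

    fetch_page is a caller-supplied function (hypothetical) that takes a URL
    and returns the decoded JSON body of the response as a dict.
    """
    url = first_url
    while url:
        body = fetch_page(url)
        if not body.get("success"):
            raise RuntimeError(f"Request failed: {body.get('error')}")
        page = body["data"]
        yield from page["results"]
        url = page.get("next")  # null/None on the last page ends the loop
```

With the `requests` library, `fetch_page` could be as small as `lambda url: requests.get(url, headers=headers, timeout=30).json()`.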

Create Job

POST /api/v1/jobs/

Request body:

    {
      "name": "mnist-training",
      "project": "p-abc123",
      "description": "Training MNIST classifier",
      "gpu_type": "RTX A5000",
      "gpus": 1,
      "cpu_cores": 4,
      "memory_gb": 16,
      "hdd_size_mb": 51200,
      "image": "trainwave/pytorch:2.3.1",
      "setup_command": "pip install -r requirements.txt",
      "run_command": "python train.py",
      "env_vars": {
        "WANDB_API_KEY": "xxx",
        "PYTORCH_CUDA_ALLOC_CONF": "max_split_size_mb:512"
      },
      "expires": "4h"
    }

Example using Python:

    import requests

    headers = {
        "Accept": "application/json",
        "X-API-KEY": "your-api-key",
        "Content-Type": "application/json",
    }

    job_config = {
        "name": "mnist-training",
        "project": "p-abc123",
        "gpu_type": "RTX A5000",
        "gpus": 1,
        # ... other configuration
    }

    # Send the job definition as a JSON body
    response = requests.post(
        "https://backend.trainwave.ai/api/v1/jobs/",
        headers=headers,
        json=job_config,
        timeout=30,
    )

    if response.status_code == 201:
        job = response.json()["data"]
        print(f"Created job: {job['id']}")
    else:
        # Surface the failure instead of silently ignoring it
        print(f"Request failed ({response.status_code}): {response.text}")

Get Job Details

GET /api/v1/jobs/{job_id}/

Example response:

    {
      "success": true,
      "data": {
        "id": "j-789xyz",
        "name": "mnist-training",
        "state": "RUNNING",
        "created_at": "2024-03-21T09:00:00Z",
        "started_at": "2024-03-21T09:01:00Z",
        "finished_at": null,
        "project": "p-abc123",
        "gpu_type": "RTX A5000",
        "gpus": 1,
        "cpu_cores": 4,
        "memory_gb": 16,
        "cost_per_hour": 2.5,
        "total_cost": 5.0,
        "metrics": {
          "gpu_utilization": 95.2,
          "memory_usage": 14.3
        }
      }
    }
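A common use of this endpoint is to poll it until the job finishes. The sketch below is one possible approach, not a documented client feature. `wait_for_job` and the `get_job` callback are hypothetical names. The terminal states `COMPLETED` and `FAILED` are an assumption derived from the status filter values listed for List Jobs; the API may report other final states, for example after a job is stopped.

```python
import time

# Assumption: terminal states mirror the documented status filter values
# (completed, failed) in upper case, like the "RUNNING" state shown above.
TERMINAL_STATES = {"COMPLETED", "FAILED"}


def wait_for_job(get_job, job_id, interval=30.0, max_polls=120, sleep=time.sleep):
    """Poll a job until it reaches a terminal state and return its details.

    get_job is a caller-supplied function (hypothetical) that takes a job ID
    and returns the "data" object from GET /api/v1/jobs/{job_id}/.
    """
    for attempt in range(max_polls):
        job = get_job(job_id)
        if job["state"] in TERMINAL_STATES:
            return job
        if attempt < max_polls - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"Job {job_id} still not finished after {max_polls} polls")
```

The `sleep` parameter exists only so the wait can be replaced in tests; in normal use the default `time.sleep` applies.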

Stop Job

POST /api/v1/jobs/{job_id}/stop/

Example request:

    curl -X POST \
         -H "X-API-KEY: your-api-key" \
         https://backend.trainwave.ai/api/v1/jobs/j-789xyz/stop/
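The same request can be built in Python. This sketch uses only the standard library so it runs without extra dependencies. `build_stop_request` is a hypothetical helper name, and the function only constructs the request; the network call is left commented out.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://backend.trainwave.ai/api/v1/"


def build_stop_request(job_id, api_key):
    """Build (but do not send) the POST request that stops a job."""
    return urllib.request.Request(
        # quote() keeps an unexpected job ID from altering the URL path
        url=f"{BASE_URL}jobs/{quote(job_id, safe='')}/stop/",
        method="POST",
        headers={"Accept": "application/json", "X-API-KEY": api_key},
    )


# Sending it performs a real network call:
# with urllib.request.urlopen(build_stop_request("j-789xyz", "your-api-key")) as resp:
#     print(resp.status)
```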

Projects

List Projects

GET /api/v1/projects/

Query parameters:

  • org (string): Organization ID
  • limit (integer, default: 20): Results per page
  • offset (integer): Pagination offset

Organizations

List Organizations

GET /api/v1/organizations/

Metrics

Get Job Metrics

GET /api/v1/metrics/{metric_name}/?job_id={job_id}

Available metrics:

  • cpu: CPU utilization
  • memory: Memory usage
  • network: Network I/O
  • gpu_utilization: GPU utilization
  • gpu_memory: GPU memory usage
  • disk: Disk I/O
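The metric name goes in the path and the job ID in the query string, so it can help to validate the name before making a request. A minimal sketch follows; `metrics_url` is a hypothetical helper, and the set of names is copied from the list above.

```python
from urllib.parse import urlencode

BASE_URL = "https://backend.trainwave.ai/api/v1/"

# Metric names listed in this reference.
AVAILABLE_METRICS = {"cpu", "memory", "network", "gpu_utilization", "gpu_memory", "disk"}


def metrics_url(metric_name, job_id):
    """Build the Get Job Metrics URL, rejecting unknown metric names early."""
    if metric_name not in AVAILABLE_METRICS:
        raise ValueError(
            f"Unknown metric {metric_name!r}; expected one of {sorted(AVAILABLE_METRICS)}"
        )
    return f"{BASE_URL}metrics/{metric_name}/?{urlencode({'job_id': job_id})}"
```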

Best Practices

  1. Error Handling: Check the success field on every response and, when it is false, read the error code and message instead of assuming data is present
  2. Security: Never expose API keys in client-side code; load them from environment variables
  3. Monitoring: Log API response times and track error rates
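As an illustration of the security advice, the API key can be read from the environment when the request headers are built. This is a sketch under assumptions: the variable name `TRAINWAVE_API_KEY` and the helper `auth_headers` are hypothetical, not names defined by Trainwave.

```python
import os


def auth_headers(env_var="TRAINWAVE_API_KEY"):
    """Build request headers from an API key stored in the environment.

    The default variable name is an assumption made for illustration.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"Accept": "application/json", "X-API-KEY": api_key}
```

Failing fast when the variable is missing avoids sending unauthenticated requests and keeps the key out of source control.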
