Decentralized GPU network (DePIN) that aggregates idle GPUs from data centers, miners, and consumer rigs into Ray-based clusters. IO Cloud spins up multi-node H100/A100 clusters in minutes; IO Intelligence exposes OpenAI-compatible inference and an Agent Cloud MCP server for tool-using agents.
- 01on-demand GPU clusters
- 02Ray-based distributed training
- 03OpenAI-compatible decentralized inference
- 04MCP Agent Cloud workflows
- 05cost-optimized H100/A100 access
- pip install ray
- pnpm add openai
| Variable | Scope | Description |
|---|---|---|
| IONET_API_KEY | Server | API key from the io.net dashboard, used for both IO Cloud REST APIs and IO Intelligence inference. |
| IONET_BASE_URL | Server | Base URL for the OpenAI-compatible inference endpoint, e.g. https://api.intelligence.io.solutions/api/v1. |
Use io.net via two surfaces: (1) IO Intelligence for OpenAI-compatible inference — instantiate `new OpenAI({ apiKey: process.env.IONET_API_KEY, baseURL: process.env.IONET_BASE_URL })` and call `client.chat.completions.create({ model: 'meta-llama/Llama-3.3-70B-Instruct', messages })`; (2) IO Cloud for clusters — provision via the REST API or dashboard, pick GPU type and count, then connect a Ray driver: `ray.init(address='ray://<cluster-head>:10001')` and submit `@ray.remote(num_gpus=1)` tasks. For tool-using agents use Agent Cloud's MCP server (`https://mcp.io.net`) so any MCP-compatible client (Claude, Cursor) can launch and monitor jobs. Always parameterize GPU SKU rather than pinning, since spot supply for H100 changes hourly.
- ⚑GPU spot pricing fluctuates intraday — quote per-token cost on every pull and avoid hard-coded $/hour in app config.
- ⚑Region availability is uneven (US/EU strong, APAC patchy); pin a region only if your data has residency constraints, otherwise let the scheduler pick.
- ⚑Network congestion on popular SKUs (H100 SXM) leads to multi-minute queue waits; batch jobs and degrade to A100 when latency-sensitive.
- ⚑Ray clusters on io.net are billed per cluster-hour from provision to teardown — kill idle clusters from the dashboard or via the API to avoid silent burn.
- ⚑Image / multimodal model availability lags text on IO Intelligence; check the model list endpoint before assuming GPT-4o-class vision support.
- ⚑OpenAI compatibility is partial — function calling and tool_choice schemas vary by model; pin a specific model and snapshot its capabilities.
- ⚑Workers are heterogeneous (data-center vs. consumer rigs) — for training that needs NVLink/InfiniBand, filter for `connectivity: high-speed` providers explicitly.