rllm

Democratizing Reinforcement Learning for LLMs

Automation platform · 5,376 stars · Python · Apache-2.0 · Worker-compatible

Source

Repository: rllm-org/rllm
Last source update: 2026-04-05
Last verified: 2026-04-05

Tags

agent-framework, agentic-workflow, coding-agent, distributed-training, llm-reasoning, llm-training

Integration notes

The repository is workflow-oriented; map each workflow step to an explicit worker contract so that its inputs, outputs, and failure statuses are predictable.

worker.md example

Starter worker.md contract mapped from this registry entry. Copy this file and adapt schemas, constraints, and statuses for your task.

---
id: rllm-repo-derived-worker
name: rllm Repo-Derived Worker
version: 1.0.0
source_registry_url: https://worker.md/registry/rllm/
source_repository: https://github.com/rllm-org/rllm
repository_default_branch: main
repository_language: Python
repository_license: Apache-2.0
repository_updated_at: 2026-04-05
worker_mode: agent-orchestration-worker
derivation_method: github_repository_metadata_plus_raw_readme
derivation_confidence: 0.95
derived_on: 2026-04-05
tags:
  - agent-framework
  - agentic-workflow
  - coding-agent
  - distributed-training
  - llm-reasoning
  - llm-training
---

# rllm Repo-Derived Worker

## Repo-derived summary
- Registry summary: Democratizing Reinforcement Learning for LLMs
- Repository description: Democratizing Reinforcement Learning for LLMs
- Stars (snapshot): 5,376
- Primary language: Python
- Worker mode classification: agent-orchestration-worker

## Extracted from
- https://github.com/rllm-org/rllm
- https://github.com/rllm-org/rllm/blob/main/README.md
- https://img.shields.io/badge/Documentation-blue?style=for-the-badge&logo=googledocs&logoColor=white
- https://img.shields.io/pypi/v/rllm?style=for-the-badge
- https://docs.rllm-project.com/

## Evidence notes (from repository text)
- README summary paragraph: **Train your AI agents with RL. Any framework. Minimal code changes.**
- rLLM is an open-source framework for training AI agents with reinforcement learning. Swap in a tracked client, define a reward function, and let RL handle the rest — no matter what agent framework you use.
  - **Works with any agent framework** — LangGraph, SmolAgent, Strands, OpenAI Agents SDK, Google ADK, or plain `openai.OpenAI`. Just swap the client. 🔌
  - **Near-zero code changes** — Add `@rllm.rollout` to wrap your agent code, and rLLM traces every LLM call automatically. 🪄
  - **CLI-first workflow** — Eval and train from the command line with 50+ built-in benchmarks. `rllm eval gsm8k` just works. ⚡

## Installation hints found in README
- `pip install "rllm @ git+https://github.com/rllm-org/rllm.git"`
- `pip install "rllm[verl] @ git+https://github.com/rllm-org/rllm.git"`
- `uv pip install "rllm @ git+https://github.com/rllm-org/rllm.git"`
- `uv pip install "rllm[verl] @ git+https://github.com/rllm-org/rllm.git"`

## worker.md contract (derived starter)
Purpose: Execute one orchestrated agent task as a bounded worker step.

### Input schema
```json
{
  "type": "object",
  "additionalProperties": false,
  "required": [
    "run_id",
    "task",
    "context"
  ],
  "properties": {
    "run_id": {
      "type": "string"
    },
    "task": {
      "type": "string"
    },
    "context": {
      "type": "object"
    }
  }
}
```
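
As a rough illustration of the input schema above, the following sketch checks a payload using only the Python standard library. The names `INPUT_FIELDS` and `validate_input` are hypothetical and are not part of rLLM or the worker.md registry; a production runtime would more likely rely on a full JSON Schema validator.

```python
from typing import Any

# Hypothetical mirror of the input schema above: three required keys, no extras.
INPUT_FIELDS: dict[str, type] = {"run_id": str, "task": str, "context": dict}


def validate_input(payload: Any) -> list[str]:
    """Return a list of schema violations; an empty list means the payload is valid."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    errors = []
    # "required" plus the per-property "type" checks
    for name, expected in INPUT_FIELDS.items():
        if name not in payload:
            errors.append(f"missing required property: {name}")
        elif not isinstance(payload[name], expected):
            errors.append(f"{name} must be of type {expected.__name__}")
    # "additionalProperties": false
    for name in sorted(set(payload) - set(INPUT_FIELDS)):
        errors.append(f"unexpected property: {name}")
    return errors
```

A runtime could map a non-empty error list to the `invalid_request` status instead of invoking the worker at all.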

### Output schema
```json
{
  "type": "object",
  "additionalProperties": false,
  "required": [
    "run_id",
    "status",
    "result"
  ],
  "properties": {
    "run_id": {
      "type": "string"
    },
    "status": {
      "type": "string",
      "enum": [
        "ok",
        "retryable_error",
        "invalid_request",
        "invalid_output"
      ]
    },
    "result": {
      "type": "object"
    }
  }
}
```
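
One way to keep worker responses inside the output schema above is to build every response through a single helper. This is a minimal sketch; `STATUSES` and `make_output` are hypothetical names introduced for illustration only.

```python
from typing import Any, Optional

# Status values copied from the output schema's enum.
STATUSES = ("ok", "retryable_error", "invalid_request", "invalid_output")


def make_output(run_id: str, status: str, result: Optional[dict] = None) -> dict[str, Any]:
    """Build an output envelope containing exactly the three required properties."""
    if not isinstance(run_id, str):
        raise TypeError("run_id must be a string")
    if status not in STATUSES:
        raise ValueError(f"status must be one of {STATUSES}, got {status!r}")
    # Copy the result so later mutation by the caller cannot alter the envelope.
    return {"run_id": run_id, "status": status, "result": dict(result or {})}
```

Rejecting unknown statuses at construction time means a typo surfaces in the worker rather than in a downstream consumer.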

### Constraints
- timeout_seconds: 30
- max_attempts: 2
- idempotency_key: run_id
- status_enum: [ok, retryable_error, invalid_request, invalid_output]
- notes: adapt to concrete APIs/classes documented in this repository before production use
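
The constraints above could be enforced by a small wrapper such as the sketch below. It rests on several assumptions that do not come from the repository: `run_bounded` and `_completed` are hypothetical names, every exception is treated as retryable, the idempotency cache is an in-memory dict, and the thread-based timeout only stops waiting for the handler rather than cancelling it.

```python
import concurrent.futures
from typing import Any, Callable

TIMEOUT_SECONDS = 30  # timeout_seconds from the constraints
MAX_ATTEMPTS = 2      # max_attempts from the constraints

# In-memory idempotency cache keyed by run_id (assumption: single process).
_completed: dict[str, dict[str, Any]] = {}


def run_bounded(
    handler: Callable[[dict], Any],
    payload: dict,
    timeout_seconds: float = TIMEOUT_SECONDS,
    max_attempts: int = MAX_ATTEMPTS,
) -> dict[str, Any]:
    """Run handler(payload) with a per-attempt timeout and a bounded retry count."""
    run_id = payload["run_id"]
    if run_id in _completed:  # idempotency_key: run_id
        return _completed[run_id]
    output = {"run_id": run_id, "status": "retryable_error", "result": {}}
    for attempt in range(1, max_attempts + 1):
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(handler, payload)
        try:
            result = future.result(timeout=timeout_seconds)
        except concurrent.futures.TimeoutError:
            output["result"] = {"error": "timeout", "attempts": attempt}
        except Exception as exc:  # assumption: all handler errors are retryable
            output["result"] = {"error": str(exc), "attempts": attempt}
        else:
            if not isinstance(result, dict):
                return {"run_id": run_id, "status": "invalid_output", "result": {}}
            output = {"run_id": run_id, "status": "ok", "result": result}
            _completed[run_id] = output  # only successful runs are cached
            return output
        finally:
            pool.shutdown(wait=False)  # do not block on a handler that timed out
    return output
```

Caching only successful runs lets a caller safely resubmit a `run_id` that previously ended in `retryable_error`.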

## How this should be used
1. Treat this file as a repo-derived starter profile, not a claim of an official repository API contract.
2. Replace schemas with exact interfaces from code/docs you adopt.
3. Keep execution bounded and auditable using worker protocol constraints.

How to use

1. Save this as a worker spec file (for example: rllm-my-task.worker.md).
2. Replace the purpose and the input/output schemas with those of your real bounded task.
3. Enforce schema validation, the timeout, and the retry policy in your runtime before production use.
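
Once the spec is saved, a runtime may want to read its front matter, for instance to log the worker `id` and `version`. The sketch below is a deliberately small parser for the flat `key: value` and `  - item` layout used in the example above; it is not a general YAML parser, and `parse_front_matter` is a hypothetical helper name.

```python
def parse_front_matter(text: str) -> dict:
    """Parse the leading '---' block of a worker spec into a dict of strings and lists."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict = {}
    list_key = None  # key currently collecting "  - item" entries
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the block
            break
        if line.startswith("  - ") and list_key is not None:
            meta[list_key].append(line[4:].strip())
            continue
        # Split on the first colon only, so URL values keep their "https:" prefix.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value  # values stay strings, e.g. "1.0.0"
            list_key = None
        else:
            meta[key] = []  # an empty value starts a list
            list_key = key
    return meta
```

For a file such as rllm-my-task.worker.md, the call would look like `parse_front_matter(open("rllm-my-task.worker.md", encoding="utf-8").read())`.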

Citation

Reference URL: https://worker.md/registry/rllm/

Source URL: https://github.com/rllm-org/rllm