rllm
Democratizing Reinforcement Learning for LLMs
Source
- Repository: rllm-org/rllm
- Last source update: 2026-04-05
- Last verified: 2026-04-05
Tags
Integration notes
The repository is workflow-oriented; map each workflow step to an explicit worker contract so that each step's inputs, outputs, and statuses are predictable.
worker.md example
Starter worker.md contract mapped from this registry entry. Copy this file and adapt schemas, constraints, and statuses for your task.
---
id: rllm-repo-derived-worker
name: rllm Repo-Derived Worker
version: 1.0.0
source_registry_url: https://worker.md/registry/rllm/
source_repository: https://github.com/rllm-org/rllm
repository_default_branch: main
repository_language: Python
repository_license: Apache-2.0
repository_updated_at: 2026-04-05
worker_mode: agent-orchestration-worker
derivation_method: github_repository_metadata_plus_raw_readme
derivation_confidence: 0.95
derived_on: 2026-04-05
tags:
- agent-framework
- agentic-workflow
- coding-agent
- distributed-training
- llm-reasoning
- llm-training
---
# rllm Repo-Derived Worker
## Repo-derived summary
- Registry summary: Democratizing Reinforcement Learning for LLMs
- Repository description: Democratizing Reinforcement Learning for LLMs
- Stars (snapshot): 5,376
- Primary language: Python
- Worker mode classification: agent-orchestration-worker
## Extracted from
- https://github.com/rllm-org/rllm
- https://github.com/rllm-org/rllm/blob/main/README.md
- https://img.shields.io/badge/Documentation-blue?style=for-the-badge&logo=googledocs&logoColor=white
- https://img.shields.io/pypi/v/rllm?style=for-the-badge
- https://docs.rllm-project.com/
## Evidence notes (from repository text)
- README summary paragraph: **Train your AI agents with RL. Any framework. Minimal code changes.**
- rLLM is an open-source framework for training AI agents with reinforcement learning. Swap in a tracked client, define a reward function, and let RL handle the rest — no matter what agent framework you use.
- - **Works with any agent framework** — LangGraph, SmolAgent, Strands, OpenAI Agents SDK, Google ADK, or plain `openai.OpenAI`. Just swap the client. 🔌
- - **Near-zero code changes** — Add `@rllm.rollout` to wrap your agent code, and rLLM traces every LLM call automatically. 🪄
- - **CLI-first workflow** — Eval and train from the command line with 50+ built-in benchmarks. `rllm eval gsm8k` just works. ⚡
## Installation hints found in README
- `pip install "rllm @ git+https://github.com/rllm-org/rllm.git"`
- `pip install "rllm[verl] @ git+https://github.com/rllm-org/rllm.git"`
- `uv pip install "rllm @ git+https://github.com/rllm-org/rllm.git"`
- `uv pip install "rllm[verl] @ git+https://github.com/rllm-org/rllm.git"`
## worker.md contract (derived starter)
Purpose: Execute one orchestrated agent task as a bounded worker step.
### Input schema
```json
{
"type": "object",
"additionalProperties": false,
"required": [
"run_id",
"task",
"context"
],
"properties": {
"run_id": {
"type": "string"
},
"task": {
"type": "string"
},
"context": {
"type": "object"
}
}
}
```
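The input schema above can be checked without any third-party dependency. The following is a minimal sketch, assuming the payload has already been parsed from JSON into Python objects; `validate_input` and `REQUIRED_FIELDS` are hypothetical names introduced for illustration and are not part of rLLM.

```python
from typing import Any

# Mirrors the starter input schema: three required fields and their JSON types.
# (Hypothetical helper; adapt the mapping when you replace the schema.)
REQUIRED_FIELDS = {"run_id": str, "task": str, "context": dict}


def validate_input(payload: Any) -> list[str]:
    """Return a list of validation problems; an empty list means the payload is valid."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in payload:
            errors.append(f"missing required field: {name}")
        elif not isinstance(payload[name], expected):
            errors.append(f"field {name} must be of type {expected.__name__}")
    # "additionalProperties": false means unknown keys are rejected.
    for name in sorted(set(payload) - set(REQUIRED_FIELDS)):
        errors.append(f"unexpected field: {name}")
    return errors
```

A non-empty result would typically map to the `invalid_request` status. For anything beyond this starter schema, a full JSON Schema validator is likely a better fit than a hand-written check.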
### Output schema
```json
{
"type": "object",
"additionalProperties": false,
"required": [
"run_id",
"status",
"result"
],
"properties": {
"run_id": {
"type": "string"
},
"status": {
"type": "string",
"enum": [
"ok",
"retryable_error",
"invalid_request",
"invalid_output"
]
},
"result": {
"type": "object"
}
}
}
```
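One way to keep responses aligned with the output schema above is to build them through a single helper. This is a sketch under the assumption that results are plain dictionaries; `Status` and `make_output` are hypothetical names, not rLLM APIs.

```python
import json
from enum import Enum


class Status(str, Enum):
    """The four status values allowed by the output schema."""
    OK = "ok"
    RETRYABLE_ERROR = "retryable_error"
    INVALID_REQUEST = "invalid_request"
    INVALID_OUTPUT = "invalid_output"


def make_output(run_id: str, status: Status, result: dict) -> dict:
    """Build an output envelope containing exactly the three required fields."""
    if not isinstance(result, dict):
        raise TypeError("result must be a JSON object (dict)")
    # Status(...) raises ValueError for any value outside the enum.
    return {"run_id": run_id, "status": Status(status).value, "result": result}


if __name__ == "__main__":
    print(json.dumps(make_output("run-1", Status.OK, {"answer": 42})))
```

Routing every response through one constructor means an unknown status fails loudly at build time instead of reaching the caller.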
### Constraints
- timeout_seconds: 30
- max_attempts: 2
- idempotency_key: run_id
- status_enum: [ok, retryable_error, invalid_request, invalid_output]
- notes: adapt to concrete APIs/classes documented in this repository before production use
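The constraints above (timeout, attempt limit, idempotency on `run_id`) could be enforced by a small wrapper around the task handler. The sketch below is illustrative only: `run_bounded`, the `handler` callable, and the in-memory `_completed` cache are assumptions made for this example, and a real deployment would likely use a persistent store for idempotency.

```python
import concurrent.futures
from typing import Callable

TIMEOUT_SECONDS = 30  # timeout_seconds from the constraints
MAX_ATTEMPTS = 2      # max_attempts from the constraints

# Hypothetical in-memory idempotency cache keyed by run_id.
_completed: dict[str, dict] = {}


def run_bounded(request: dict, handler: Callable[[dict], dict],
                timeout: float = TIMEOUT_SECONDS,
                max_attempts: int = MAX_ATTEMPTS) -> dict:
    """Run handler(request) with a per-attempt timeout and a bounded retry count."""
    run_id = request["run_id"]
    if run_id in _completed:  # idempotency: replay the stored success
        return _completed[run_id]
    last_error = "no attempts made"
    for _ in range(max_attempts):
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(handler, request)
        try:
            result = future.result(timeout=timeout)
        except concurrent.futures.TimeoutError:
            last_error = "timeout"
        except Exception as exc:  # any handler failure counts as one attempt
            last_error = repr(exc)
        else:
            output = {"run_id": run_id, "status": "ok", "result": result}
            _completed[run_id] = output
            return output
        finally:
            # Threads cannot be killed; a timed-out handler keeps running in the background.
            pool.shutdown(wait=False)
    return {"run_id": run_id, "status": "retryable_error", "result": {"error": last_error}}
```

Because a thread cannot be forcibly stopped, this only bounds how long the caller waits. Handlers that must be terminated on timeout would need a subprocess or cooperative cancellation instead.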
## How this should be used
1. Treat this file as a repo-derived starter profile, not a claim of an official repository API contract.
2. Replace schemas with exact interfaces from code/docs you adopt.
3. Keep execution bounded and auditable using worker protocol constraints.
How to use
- Save this as a worker spec file (for example: rllm-my-task.worker.md).
- Replace the input/output schemas and purpose with your real bounded task.
- Enforce schema validation + timeout + retry policy in your runtime before production use.
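To enforce the schemas at runtime, they first have to be read back out of the saved spec file. The following is a rough sketch, assuming the spec keeps each schema in a JSON code fence directly under its heading as in the starter file; `extract_schema` is a hypothetical helper and not part of any worker.md tooling.

```python
import json
import re

# Built indirectly so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def extract_schema(spec_text: str, heading: str) -> dict:
    """Return the JSON object from the fenced block that directly follows `heading`."""
    pattern = re.compile(
        r"^#{1,6}\s+" + re.escape(heading) + r"\s*\n"
        + re.escape(FENCE) + r"json\n(.*?)\n" + re.escape(FENCE),
        re.DOTALL | re.MULTILINE,
    )
    match = pattern.search(spec_text)
    if match is None:
        raise ValueError(f"no JSON block found under heading: {heading}")
    return json.loads(match.group(1))


if __name__ == "__main__":
    # Assumes the spec was saved under the example file name used above.
    with open("rllm-my-task.worker.md", encoding="utf-8") as fh:
        print(extract_schema(fh.read(), "Input schema")["required"])
```

Loading the schema from the spec at startup keeps the file as the single source of truth, so validation cannot silently drift from the documented contract.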
Citation
Reference URL: https://worker.md/registry/rllm/
Source URL: https://github.com/rllm-org/rllm