Rushd Labs — Building the tools agents actually need

01 / Thesis

Agents need infrastructure
that doesn't exist yet

The current AI agent stack is held together with duct tape. Agents talk to the web through browsers built for humans. They receive raw HTML when they need structured data. They execute multi-page tasks as sequential scripts when they need declarative workflows. They run in isolation when they should share resources.

Rushd Labs builds the missing layer between AI agents and the systems they interact with. Not the models. Not the frameworks. The practical infrastructure that turns "it works in a demo" into "it runs unattended overnight."

Every project starts from the same question: what does an agent actually need here, and what is it currently forced to do instead?

The gap between those two answers is where the product lives. Wraith exists because agents need structured data from web pages but get raw HTML. BabyGPT exists because people are told LLMs are magic when every layer can be written in plain NumPy and understood from first principles.

Everything ships open source. Paid tiers cover hosting and custom configurations, not the core tools.

02 / Projects

What we're building

Active development

Wraith

Structured web extraction for AI agents

An orchestration layer that sits on top of headless browsers and turns dumb page loads into smart, targeted extractions. Agents get clean JSON instead of raw HTML. Multi-page workflows run as config files, not scripts.

Define what you need in TOML. Wraith blocks the junk, extracts the signal, and returns structured data your agent can reason over immediately.

Python Playwright FastAPI MCP Server TOML Configs Pydantic

Config-based extraction

TOML files map CSS selectors to output fields. Adding a new site means editing config, not writing code.

Workflow DAGs

Multi-page research loops, price monitors, and auth flows as declarative TOML. Parallel branches, dependency resolution, data passing between steps.
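
One way to picture dependency resolution and data passing is the minimal sketch below, which uses Python's standard `graphlib` module. The step names, the `STEPS` dictionary, and the `run_workflow` function are illustrative assumptions; in Wraith the graph would come from a TOML file, and steps in the same batch could run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: step name -> the steps it depends on.
STEPS = {
    "login": [],
    "search": ["login"],
    "prices": ["search"],
    "reviews": ["search"],
    "report": ["prices", "reviews"],
}


def run_workflow(steps, actions):
    """Run steps in dependency order, passing each step its inputs.

    Returns (results, batches). Every step in a batch has all of its
    dependencies satisfied, so a batch is safe to run in parallel.
    """
    sorter = TopologicalSorter(steps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results, batches = {}, []
    while sorter.is_active():
        ready = sorter.get_ready()
        for name in ready:
            inputs = {dep: results[dep] for dep in steps[name]}
            results[name] = actions[name](inputs)
            sorter.done(name)
        batches.append(sorted(ready))
    return results, batches


# Toy actions: each step records which upstream outputs it received.
actions = {name: (lambda inputs, n=name: f"{n}({', '.join(sorted(inputs))})")
           for name in STEPS}
results, batches = run_workflow(STEPS, actions)
print(batches)
```

Here `prices` and `reviews` land in the same batch because both depend only on `search`.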

Request filtering

Blocks images, trackers, ads, and fonts at the network level. Pages load faster because the browser never downloads the junk.
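
A minimal sketch of network-level blocking, assuming Playwright's sync API: the handler aborts requests by resource type or host before the browser downloads anything. The blocklists and the function names `should_block` and `handle_route` are illustrative assumptions, not Wraith's actual rules.

```python
from urllib.parse import urlparse

# Illustrative blocklists; a real deployment would load these from config.
BLOCKED_RESOURCE_TYPES = {"image", "font", "media"}
BLOCKED_HOST_SUFFIXES = ("doubleclick.net", "google-analytics.com")


def should_block(resource_type: str, url: str) -> bool:
    """Decide whether a request is junk the agent never needs."""
    if resource_type in BLOCKED_RESOURCE_TYPES:
        return True
    host = urlparse(url).hostname or ""
    return any(host == suffix or host.endswith("." + suffix)
               for suffix in BLOCKED_HOST_SUFFIXES)


def handle_route(route) -> None:
    """Playwright route handler: abort blocked requests, pass the rest."""
    request = route.request
    if should_block(request.resource_type, request.url):
        route.abort()
    else:
        route.continue_()


# With Playwright installed, register the handler before navigating:
#   page.route("**/*", handle_route)
#   page.goto("https://example.com")
```

Keeping the decision in a pure function makes the rules testable without launching a browser.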

MCP server

FastAPI server exposing extract, navigate, batch, links, and workflow tools. Agents connect through the same MCP interface they use for everything else.

Swappable engine

Playwright/Chromium today. Lightpanda or any future engine tomorrow. One interface, any browser underneath.

Shipped

BabyGPT

Learn how LLMs actually work

A complete GPT implementation built from scratch using only NumPy. No PyTorch, no TensorFlow. Every layer of the transformer is written in plain Python so you can see the maths, trace the gradients, and understand what's actually happening inside a language model.

Includes a React + FastAPI web UI for training, attention visualisation, LoRA fine-tuning, and model export. Models from 420K to 124M parameters, all trainable on a laptop CPU.

Python NumPy React TypeScript FastAPI Vite

Hugging Face GitHub

Pure NumPy transformer

Embedding, multi-head attention, feed-forward, layer norm, softmax, backpropagation. All implemented from first principles with no ML framework dependencies.
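
To give a flavour of what "from first principles" means here, the sketch below implements a numerically stable softmax and single-head causal self-attention in NumPy. It is an independent illustration, not code from the BabyGPT repository; the function names and shapes are assumptions.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Softmax with the max subtracted first for numerical stability."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def causal_self_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray):
    """Single-head scaled dot-product attention with a causal mask.

    q, k: (seq_len, d_k); v: (seq_len, d_v).
    Returns (output, weights); weights[i, j] is how much token i
    attends to token j, and is zero for every j > i.
    """
    seq_len, d_k = q.shape
    scores = q @ k.T / np.sqrt(d_k)
    # Mask out future positions so each token only sees itself and the past.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(5, 8)) for _ in range(3))
output, weights = causal_self_attention(q, k, v)
print(output.shape, weights.sum(axis=-1))
```

The `weights` matrix is the quantity an attention heatmap would display.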

LoRA fine-tuning

Low-rank adaptation that freezes the base model and trains tiny adapters at ~1% of parameters. Fast enough to run on mobile. Multiple style presets included.
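
The parameter saving can be checked with a little arithmetic. The sketch below assumes a single 768 x 768 weight matrix and rank 4, both chosen for illustration rather than taken from BabyGPT; the scaling factor `alpha` and all names are likewise assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, rank, alpha = 768, 4, 8.0  # illustrative sizes, not BabyGPT's

W = rng.normal(size=(d_model, d_model))         # frozen base weight
A = rng.normal(scale=0.01, size=(rank, d_model))  # trainable, small init
B = np.zeros((d_model, rank))                   # trainable, zero init


def lora_forward(x: np.ndarray) -> np.ndarray:
    """Base projection plus the scaled low-rank update (alpha / rank) * B @ A."""
    return x @ W.T + (alpha / rank) * (x @ A.T) @ B.T


# Only A and B are trained; W never changes.
trainable = A.size + B.size
fraction = trainable / W.size
print(f"trainable parameters: {trainable} ({fraction:.2%} of the base matrix)")

x = rng.normal(size=(2, d_model))
# With B initialised to zero, the adapter starts as a no-op.
assert np.allclose(lora_forward(x), x @ W.T)
```

Because `B` starts at zero, fine-tuning begins from exactly the base model's behaviour and only drifts as the adapters learn.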

Attention visualisation

Interactive heatmaps showing which tokens attend to which. Layer-by-layer exploration of how the model builds understanding of text.

Interactive web UI

Five-page app: Learn (step-by-step explanations), Train (real-time loss curves), Visualise (attention maps), LoRA (fine-tuning), Download (model export).

Shipped

LDI Crisis Simulator

Autonomous agents stress-testing pension solvency

An AI-driven financial risk engine that simulates the UK pension LDI crisis using autonomous agents. Unlike static backtesting, the system uses generative AI to invent market scenarios, detect insolvency risks in real time, and execute strategic trades to preserve solvency.

A 30-day simulation loop runs autonomously: a Market Agent moves the gilt yield curve, a Valuation Engine recalculates present values, an Ops Agent issues margin calls, and a Portfolio Manager Agent decides whether to sell, repo, or hold.

n8n PostgreSQL GPT-4o Supabase QuickChart

Agent "Mike" (Ops & Risk)

Monitors funding levels against liabilities. When funding drops below 100%, issues formal SQL-backed margin calls with contextual analysis of the shortfall.
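
The funding check can be sketched in a few lines, under heavy simplifying assumptions: liabilities are discounted at a single flat gilt yield, assets are passed in as a number, and the margin call is a plain record rather than a SQL row. The names `present_value`, `check_funding`, and `MarginCall`, and all the figures, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


def present_value(cashflows, annual_yield: float) -> float:
    """Discount (year, amount) cashflows at a single flat annual yield."""
    return sum(amount / (1 + annual_yield) ** year
               for year, amount in cashflows)


@dataclass
class MarginCall:
    day: int
    funding_ratio: float
    shortfall: float


def check_funding(day: int, assets: float, liability_cashflows,
                  gilt_yield: float) -> Optional[MarginCall]:
    """Issue a margin call when assets fall below the PV of liabilities."""
    liabilities = present_value(liability_cashflows, gilt_yield)
    ratio = assets / liabilities
    if ratio < 1.0:
        return MarginCall(day, ratio, liabilities - assets)
    return None


# Illustrative figures only: two liability payments of 100, due in 10 and 20 years.
cashflows = [(10, 100.0), (20, 100.0)]
call = check_funding(day=1, assets=110.0,
                     liability_cashflows=cashflows, gilt_yield=0.04)
print(call)
```

In the simulator the resulting record would be written to the database for the Portfolio Manager Agent to read.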

Agent "Sarah" (Portfolio Manager)

Reads margin calls, checks collateral inventory, and autonomously decides: REPO to maintain duration hedge, SELL to survive a cash crunch, or REVERSE_REPO to capture yield.

Dynamic market generation

AI generates daily economic news and moves the 10-year gilt yield curve. No hardcoded scenarios. Every simulation run produces unique market conditions.

Live dashboard

Real-time visualisation of gilt yields vs average fund solvency, trade rationale logs, and insolvency events avoided across the simulation run.

Shipped

Rug-Pull Detector

Autonomous smart contract auditor

Analyses GitHub repositories for rug-pull patterns in Solidity contracts. Submit a repo URL and the system clones it, runs static analysis via Slither, applies custom detection rules across three categories, and returns a risk score from 0 to 100.

Built as a full-stack application with async job processing. Analyses run in background workers while the dashboard updates in real time. Dockerised for one-command deployment.

Python FastAPI React TypeScript Slither Celery PostgreSQL Docker

Owner privilege detection (35%)

Hidden mint functions, pause mechanisms, blacklist functionality, ownership transfer patterns, and selfdestruct capabilities.

Liquidity trap detection (40%)

Anti-sell mechanisms, max transaction limits, honeypot patterns, and cooldown abuse. The highest-weighted category because these directly trap user funds.

Fee manipulation detection (25%)

Hidden transfer fees, dynamic fee adjustments, and fee extraction to owner addresses. Catches the subtle drain patterns that evade manual review.

Severity-weighted risk scoring

0-100 scale from SAFE to CRITICAL. Each finding is weighted by category and severity. Scores above 60 signal significant red flags; scores above 80 indicate a high probability of a rug pull.
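
One plausible way to combine the three category weights into a single score is sketched below. The 35/40/25 weights and the 60 and 80 thresholds come from the descriptions above; the per-severity points, the cap of 100 per category, and the intermediate band names are assumptions made for illustration.

```python
# Category weights as percentages, taken from the category headings.
CATEGORY_WEIGHTS = {
    "owner_privilege": 35,
    "liquidity_trap": 40,
    "fee_manipulation": 25,
}
# Assumed points per finding; the detector's real severity scale may differ.
SEVERITY_POINTS = {"low": 10, "medium": 25, "high": 50}


def risk_score(findings) -> float:
    """Combine (category, severity) findings into a 0-100 score."""
    per_category = {category: 0 for category in CATEGORY_WEIGHTS}
    for category, severity in findings:
        # Cap each category at 100 so one category cannot exceed its weight.
        per_category[category] = min(
            100, per_category[category] + SEVERITY_POINTS[severity])
    weighted = sum(CATEGORY_WEIGHTS[c] * s for c, s in per_category.items())
    return weighted / 100


def risk_label(score: float) -> str:
    """Map a score to a band; only SAFE and CRITICAL are named in the text."""
    if score > 80:
        return "CRITICAL"
    if score > 60:
        return "HIGH"      # assumed band name
    if score >= 30:
        return "MODERATE"  # assumed band name and cutoff
    return "SAFE"


findings = [("liquidity_trap", "high"), ("liquidity_trap", "high"),
            ("owner_privilege", "high")]
score = risk_score(findings)
print(score, risk_label(score))
```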

Specification complete

Alcove

Desktop GUI installer wrapping OpenClaw for non-technical users. Tauri 2.0, React, TypeScript. Six-step onboarding wizard, freemium model. "OpenClaw without the stress."

Demo

MidCap Alpha

Dual-mode React app: multi-agent Intel Mode with Research Analyst, Execution Strategist, and CIO synthesiser. Academy Mode with 30-concept equities curriculum.

Discovery

RepoAgent

Autonomous repo portfolio manager. £500M book, 12 tools, CTD logic, 8 simulated counterparties. Bloomberg-style cyberpunk terminal UI.

Discovery

Hedge Fund Simulator

Multi-strategy fund with 4 PM agents + CIO meta-agent. $500M AUM, investment committee cycle, Bridgewater-style risk reporting.
