The real-time feature platform for machine learning

Every feature, right on thyme.

Define ML features once in Python. Thyme compiles them to a high-throughput Rust streaming engine — real-time serving, point-in-time correct training, zero skew between the two.

Book a Demo · Read the Docs

Developer Experience

Define your features in idiomatic Python

Powerful data engineering workflows, without the infrastructure headaches. Powered by Rust.

Time-windowed aggregations (1m, 24h, 7d) run on a continuous Rust streaming engine. Values update within milliseconds of new events arriving. This is a Kappa-style architecture: a single streaming path that is always processing fresh data.

features.py

from datetime import datetime

from thyme import *

# Raw event stream: one row per transaction.
@source(name="transactions")
class Transaction:
    user_id: str = field(key=True)
    ts: datetime = field(timestamp=True)
    amount: float

# Aggregated dataset keyed by user, kept up to date by the pipeline below.
@dataset(index=True)
class UserSpend:
    user_id: str = field(key=True)
    avg_24h: float
    avg_7d: float

    @pipeline(version=1)
    @inputs(Transaction)
    def compute(cls, t):
        # Rolling averages of `amount` per user over 24-hour and 7-day windows.
        return t.groupby("user_id").aggregate(
            avg_24h=Avg(of="amount", window="24h"),
            avg_7d=Avg(of="amount", window="7d"),
        )
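
To make the idea of a time-windowed average concrete, here is a minimal pure-Python sketch of what a 24-hour average over event time might compute. It is not Thyme's engine or API: `WindowedAvg`, `add`, and `value` are hypothetical names used only for illustration, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta
from typing import Optional


class WindowedAvg:
    """Toy sliding-window average over event time (illustrative only)."""

    def __init__(self, window: timedelta):
        self.window = window
        self.events = deque()  # (timestamp, amount), oldest first
        self.total = 0.0

    def add(self, ts: datetime, amount: float) -> None:
        # Assumes events arrive in timestamp order.
        self.events.append((ts, amount))
        self.total += amount

    def value(self, now: datetime) -> Optional[float]:
        # Evict events that have fallen out of the window ending at `now`.
        cutoff = now - self.window
        while self.events and self.events[0][0] <= cutoff:
            _, old = self.events.popleft()
            self.total -= old
        if not self.events:
            return None
        return self.total / len(self.events)


t0 = datetime(2024, 1, 1, 12, 0)
avg_24h = WindowedAvg(timedelta(hours=24))
avg_24h.add(t0, 10.0)
avg_24h.add(t0 + timedelta(hours=20), 30.0)

fresh = avg_24h.value(t0 + timedelta(hours=21))  # both events in the window
later = avg_24h.value(t0 + timedelta(hours=30))  # first event has expired
```

The value changes as old events age out of the window, even when no new events arrive. That is why a windowed feature has to be evaluated against a point in time, not just against the list of events.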

The Problem

ML infrastructure is painful

Every team building real-time ML hits the same wall. Training features and serving features drift apart, and accuracy quietly erodes in production.

Training/serving skew

Offline metrics look great. Production accuracy drops within weeks — not because the model is wrong, but because the features it sees in production are computed differently than the features it trained on.

Diverging pipelines

Batch jobs (Spark, dbt) compute training features. Streaming systems (Flink, microservices) compute serving features. A bug fix in one doesn't propagate to the other. The logic drifts.

Silent accuracy drops

Batch pipelines run on schedules — hourly, daily. A user's last transaction was 4 minutes ago, but your model sees yesterday's aggregate. You're serving predictions on stale data.
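
The staleness gap can be shown in a few lines of plain Python. This sketch uses made-up transactions and a hypothetical `avg_amount` helper. It compares the value a nightly batch snapshot would hold with the value computed at request time.

```python
from datetime import datetime

# Made-up transactions for one user: (event time, amount).
events = [
    (datetime(2024, 1, 1, 9, 0), 20.0),
    (datetime(2024, 1, 1, 18, 0), 40.0),
    (datetime(2024, 1, 2, 11, 56), 900.0),  # 4 minutes before the request
]


def avg_amount(events, as_of):
    """Average of all amounts with event time <= as_of (hypothetical helper)."""
    amounts = [amount for ts, amount in events if ts <= as_of]
    return sum(amounts) / len(amounts) if amounts else None


request_time = datetime(2024, 1, 2, 12, 0)
last_batch_run = datetime(2024, 1, 2, 0, 0)  # nightly job

batch_view = avg_amount(events, last_batch_run)  # misses the 900.0 transaction
fresh_view = avg_amount(events, request_time)    # includes it
```

With these numbers the batch view is 30.0 while the fresh view is 320.0, so a model scoring this request on the batch value would not see the most recent transaction at all.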

Thyme runs one pipeline. Training and serving read the same state — skew is structurally impossible, not a convention you enforce in review.

Read the full story

Features

Everything your ML pipeline needs

From feature computation to serving, Thyme handles the entire lifecycle so your team can focus on building great models.

Rust-Powered Engine

Features defined in Python are compiled to a high-throughput Rust streaming engine. Real-time aggregations with millisecond freshness.

Time-Travel Queries

Point-in-time correct feature retrieval for training. Query any feature exactly as it was known at any past moment.
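
As a rough illustration of point-in-time lookup, the sketch below keeps every value a feature has taken and returns the one that was current at a given timestamp. `FeatureHistory`, `record`, and `as_of` are hypothetical names, not Thyme's API, and the sketch assumes updates are recorded in timestamp order.

```python
from bisect import bisect_right
from datetime import datetime


class FeatureHistory:
    """Toy versioned store: keeps every value a feature has taken, by time."""

    def __init__(self):
        self._times = []   # sorted update timestamps
        self._values = []  # value that became current at each timestamp

    def record(self, ts: datetime, value: float) -> None:
        # Assumes updates are recorded in timestamp order.
        self._times.append(ts)
        self._values.append(value)

    def as_of(self, ts: datetime):
        # Latest value with update time <= ts; None if nothing was known yet.
        i = bisect_right(self._times, ts)
        return self._values[i - 1] if i else None


history = FeatureHistory()
history.record(datetime(2024, 1, 1), 12.5)
history.record(datetime(2024, 1, 5), 18.0)

# A training label observed on Jan 3 must only see the Jan 1 value.
label_time_value = history.as_of(datetime(2024, 1, 3))
before_any_data = history.as_of(datetime(2023, 12, 31))
latest = history.as_of(datetime(2024, 1, 5))
```

Looking up the feature as of the label's timestamp, instead of reading the latest value, is what keeps future information out of training rows.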

Zero Training/Serving Skew

One definition, two modes. The same feature logic runs in both streaming aggregation and offline point-in-time lookups — no divergence, no silent accuracy drops.

Datasets, Pipelines & Extractors

Composable abstractions: datasets define event streams, pipelines apply windowed aggregations, and extractors compute derived features on read.
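
One way to picture an extractor computing a derived feature on read is a plain function over already-aggregated state. The sketch below is an assumption-laden illustration in plain Python: `user_spend` holds made-up values and `spend_ratio` is a hypothetical extractor, not part of Thyme's API.

```python
from typing import Optional

# Aggregated state as a pipeline might maintain it (made-up values).
user_spend = {
    "u1": {"avg_24h": 60.0, "avg_7d": 20.0},
    "u2": {"avg_24h": 15.0, "avg_7d": 15.0},
}


def spend_ratio(user_id: str, state: dict) -> Optional[float]:
    """Hypothetical extractor: derive a feature at read time from stored aggregates."""
    row = state.get(user_id)
    if row is None or row["avg_7d"] == 0:
        return None
    return row["avg_24h"] / row["avg_7d"]


u1_ratio = spend_ratio("u1", user_spend)
u2_ratio = spend_ratio("u2", user_spend)
missing = spend_ratio("unknown", user_spend)
```

Because the ratio is computed at read time, nothing extra has to be stored or kept in sync: only the two aggregates are maintained by the streaming side.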

Exactly-Once Semantics

Distributed leasing, checkpointing, and replay logs ensure exactly-once processing with no data loss or duplication.
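
A common way to get exactly-once effects from a replayable log is to checkpoint the last applied offset together with the state, so that replayed events are skipped. The sketch below shows that general technique in plain Python. It is not a description of Thyme's internals, and `CountingState` and `apply` are hypothetical names.

```python
class CountingState:
    """Toy state that stores its aggregate together with the last applied offset."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.last_offset = -1  # checkpointed alongside the aggregate

    def apply(self, offset: int, amount: float) -> bool:
        # Skip anything at or below the checkpoint: it was already applied.
        if offset <= self.last_offset:
            return False
        self.count += 1
        self.total += amount
        self.last_offset = offset
        return True


log = [(0, 10.0), (1, 20.0), (2, 30.0)]  # (offset, amount) replay log

state = CountingState()
for offset, amount in log[:2]:
    state.apply(offset, amount)

# Simulate a crash and restart: the whole log is replayed from the start.
applied = [state.apply(offset, amount) for offset, amount in log]
```

After the replay, only the third event is applied, so each event contributes to the aggregate once even though the first two were delivered twice.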

Declarative, Not Operational

No Kafka consumers to manage, no state stores to tune, no checkpoint recovery to handle. You own the feature logic — Thyme owns the infrastructure.

Architecture

Two paths, one definition

A streaming write path keeps features fresh; a query-time read path composes them for your model. Both paths read the same event-time-keyed state, so training and serving cannot drift.

WRITE PATH ▸ continuous ingestion

Sources: Streaming (Kafka · Kinesis), Polling (Postgres · Iceberg · S3)
Raw Dataset: event-time keyed
Pipeline: Sum · Count · Avg · Min · Max
Aggregated Dataset (shared state): event-time · exactly-once

READ PATH ▸ on query

Query Server: HTTP
Extractor: composes features
Featureset response: online · point-in-time

Performance

Built for simplicity and speed

Thyme compiles Python feature definitions to a Rust streaming engine. Low latency, zero skew, and a three-command deployment workflow.

[Metrics panel: P99 online latency · definitions for online and offline · training/serving skew · commands to deploy]

features.py

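from datetime import datetime

from thyme import *

# `Transaction` is the source class defined in the earlier features.py example.
@dataset(index=True)
class UserStats:
    user_id: str = field(key=True)
    ts: datetime = field(timestamp=True)
    avg_spend_7d: float

    @pipeline(version=1)
    @inputs(Transaction)
    def compute(cls, t):
        # 7-day rolling average of `amount` per user.
        return t.groupby("user_id").aggregate(
            avg_spend_7d=Avg(of="amount", window="7d")
        )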

Define features in Python. Deploy with thyme commit. Serve in milliseconds.

It's about thyme you upgraded your feature platform

Join the teams shipping ML features faster with Thyme. Get up and running in minutes, not months.

Book a Demo
