Quick Start

Get from zero to a working feature query in under five minutes.

Prerequisites

  • Thyme SDK installed (pip install thyme-sdk)
  • Infrastructure running (make infra && make migrate)
  • All three services running (definition-service, engine, query-server)
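Before moving on, it can help to confirm that the services are actually listening. The snippet below is a minimal sketch and not part of the Thyme SDK. It assumes the control plane and the query server listen on localhost:8080 and localhost:8081, the addresses used in the commands later on this page. The helper name port_open and the ASSUMED_SERVICES table are hypothetical, and the engine is left out because its port is not stated here.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


# Assumed local defaults, taken from the URLs used later on this page.
ASSUMED_SERVICES = {
    "control plane (commit endpoint)": ("localhost", 8080),
    "query server": ("localhost", 8081),
}

if __name__ == "__main__":
    for name, (host, port) in ASSUMED_SERVICES.items():
        status = "up" if port_open(host, port) else "not reachable"
        print(f"{name} ({host}:{port}): {status}")
```

An open port only shows that something is listening; it does not prove the service is healthy or that migrations have run.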

1. Define your features

Create features.py:

from datetime import datetime
from thyme.dataset import dataset, field
from thyme.pipeline import pipeline, inputs, Avg, Count
from thyme.featureset import featureset, feature, extractor
from thyme.featureset import extractor_inputs, extractor_outputs
from thyme.connectors import IcebergSource, source

# Raw event stream
@source(
    IcebergSource(catalog="prod", database="events", table="transactions"),
    cursor="ts",
    every="1m",
    disorder="5m",
    cdc="append",
)
@dataset(index=True)
class Transaction:
    user_id: str      = field(key=True)
    amount:  float
    ts:      datetime = field(timestamp=True)

# Aggregated stats (computed by the engine)
@dataset(index=True)
class UserStats:
    user_id:       str      = field(key=True)
    ts:            datetime = field(timestamp=True)
    avg_amount_7d: float
    txn_count_30d: int

    @pipeline(version=1)
    @inputs(Transaction)
    def compute(cls, t: Transaction):
        return (
            t.groupby("user_id")
             .aggregate(
                 avg_amount_7d=Avg(of="amount", window="7d"),
                 txn_count_30d=Count(of="user_id", window="30d"),
             )
        )

# Feature set exposed to models
@featureset
class UserFeatures:
    uid:           str   = feature(id=1)
    avg_spend_7d:  float = feature(id=2)
    txn_count_30d: int   = feature(id=3)

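    @extractor
    @extractor_inputs("uid")
    @extractor_outputs("avg_spend_7d", "txn_count_30d")
    def from_stats(cls, ts, inputs):
        uid = inputs["uid"]
        # Look up the UserStats row for this user as of the query timestamp
        row = UserStats.lookup(ts, user_id=uid)
        # Return values in the same order as the declared outputs;
        # the dataset column avg_amount_7d is exposed as the feature avg_spend_7d
        return row["avg_amount_7d"], row["txn_count_30d"]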
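To make the two aggregates concrete, here is a plain-Python sketch of what a trailing 7-day average and a trailing 30-day count mean for a single key. It does not use Thyme and is not how the engine computes them. The function name trailing_stats and the sample events are hypothetical, and the window boundaries (an event counts if it falls in the interval from as_of minus the window, exclusive, to as_of, inclusive) are an assumption rather than documented engine behavior.

```python
from datetime import datetime, timedelta


def trailing_stats(events, user_id, as_of):
    """Compute (avg_amount_7d, txn_count_30d) for one user as of a timestamp.

    events: list of (user_id, amount, ts) tuples.
    Assumption: a window covers (as_of - window, as_of].
    """
    week_start = as_of - timedelta(days=7)
    month_start = as_of - timedelta(days=30)
    amounts_7d = [
        amount
        for uid, amount, ts in events
        if uid == user_id and week_start < ts <= as_of
    ]
    count_30d = sum(
        1
        for uid, _, ts in events
        if uid == user_id and month_start < ts <= as_of
    )
    # No transactions in the window means the average is undefined.
    avg_7d = sum(amounts_7d) / len(amounts_7d) if amounts_7d else None
    return avg_7d, count_30d


now = datetime(2024, 1, 31, 12, 0)
events = [
    ("user_42", 10.0, now - timedelta(days=1)),
    ("user_42", 30.0, now - timedelta(days=3)),
    ("user_42", 99.0, now - timedelta(days=20)),  # inside 30d, outside 7d
    ("user_42", 5.0, now - timedelta(days=45)),   # outside both windows
    ("user_7", 50.0, now - timedelta(days=2)),    # different key
]
print(trailing_stats(events, "user_42", now))  # (20.0, 3)
```

Only the two most recent user_42 transactions contribute to the 7-day average, while the 20-day-old one still counts toward the 30-day total.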

2. Commit to the control plane

thyme commit features.py
# Committed 2 dataset(s), 1 pipeline(s), 1 featureset(s), 1 source(s) to http://localhost:8080/api/v1/commit

3. Query features

curl "http://localhost:8081/features?featureset=UserFeatures&uid=user_42"

Example response:

{
  "uid": "user_42",
  "avg_spend_7d": 47.32,
  "txn_count_30d": 18
}
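The same request can be issued from Python. This is a sketch using only the standard library, not a Thyme client: the helper names feature_url and get_features are hypothetical, and the endpoint shape (a featureset parameter plus one query parameter per input feature) is inferred from the curl example above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

QUERY_SERVER = "http://localhost:8081"  # address used in the curl example


def feature_url(featureset: str, **keys: str) -> str:
    """Build the query URL: featureset name plus one parameter per input feature."""
    params = {"featureset": featureset, **keys}
    return f"{QUERY_SERVER}/features?{urlencode(params)}"


def get_features(featureset: str, **keys: str) -> dict:
    """Fetch features and decode the JSON body (requires a running query server)."""
    with urlopen(feature_url(featureset, **keys), timeout=5) as resp:
        return json.load(resp)


print(feature_url("UserFeatures", uid="user_42"))
# http://localhost:8081/features?featureset=UserFeatures&uid=user_42
```

With the services running, get_features("UserFeatures", uid="user_42") should return a dictionary shaped like the JSON response shown above.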

Next steps

See First Feature for a detailed walkthrough of each concept, or jump to Concepts to understand the building blocks.
