The operating system for machine learning.

Thesis helps you improve models systematically.

From hypothesis to measurable improvement.

Turn ideas into experiments, run them end-to-end, and ship the best-performing version — with clear, tracked deltas.

Learn about Agent Mode
Hypothesis → Delta
Baseline. Change. Measure.
fraud.py
Baseline (naive split)
Running…
tx = load_table("card_transactions.parquet")
train, val = split(tx, method="random", test_size=0.2)

X_train, y_train = featurize(train, label="is_fraud")
X_val, y_val = featurize(val, label="is_fraud")

clf = fit(
    model="xgb_classifier",
    X=X_train,
    y=y_train
)

p = clf.predict_proba(X_val)[:, 1]
auc = pr_auc(y_val, p)
print(auc)
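The baseline above splits transactions at random, which can let information from later transactions leak into training. One possible "Change" step is a chronological split, followed by a "Measure" step that compares PR-AUC before and after. The sketch below uses plain Python only. `time_split`, `average_precision`, and `delta` are hypothetical helpers written for illustration, not part of the Thesis API, and the `ts` timestamp field is an assumption about the data.

```python
from typing import Sequence


def time_split(rows: Sequence[dict], ts_key: str = "ts", test_size: float = 0.2):
    """Split rows chronologically: the latest `test_size` fraction becomes validation."""
    ordered = sorted(rows, key=lambda r: r[ts_key])
    cut = int(round(len(ordered) * (1.0 - test_size)))
    return ordered[:cut], ordered[cut:]


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision, a common step-wise estimate of PR-AUC.

    Ties in scores keep their input order, which is a simplification.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positive labels")
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


def delta(baseline: float, candidate: float) -> float:
    """Signed improvement of the candidate metric over the baseline."""
    return candidate - baseline
```

With helpers like these, the random-split score and the time-split score can be computed on the same model configuration and reported as a single delta. A lower score under the time split is a hint that the baseline was benefiting from leakage.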

Three modes for every workflow

Ask mode for safe, read-only exploration. Plan mode for research and structured planning. Agent mode for fully autonomous execution. You control how much independence to give the AI.

Learn about Modes

Your prompt:

“Build a sentiment classifier for product reviews”

Ask
Read-only

For sentiment analysis, you'll want a transformer-based model like BERT or DistilBERT. Fine-tune on your labeled dataset, then evaluate with precision/recall metrics...

Plan
Planning

I'll build a sentiment classifier using DistilBERT...

Plan
  1. Load and preprocess the reviews dataset
  2. Fine-tune DistilBERT for classification
  3. Evaluate on held-out test set
  4. Export model for inference
Agent
Execution

I'll build a sentiment classifier using DistilBERT...

Plan
  1. Load and preprocess the reviews dataset
  2. Fine-tune DistilBERT for classification
  3. Evaluate on held-out test set
  4. Export model for inference
Changes made
├── data/preprocessing.py (new)
├── models/sentiment_classifier.py (new)
└── train.py (+45 lines)
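One way to picture the three modes is as increasing sets of permitted actions: Ask only reads, Plan also produces a plan, and Agent can additionally change files and run code. The sketch below is a mental model only. The `Mode` enum, the capability names, and `is_allowed` are hypothetical and do not describe how Thesis is implemented.

```python
from enum import Enum


class Mode(Enum):
    ASK = "ask"      # read-only exploration
    PLAN = "plan"    # research and structured planning
    AGENT = "agent"  # autonomous execution


# Hypothetical capability names, chosen for illustration only.
CAPABILITIES = {
    Mode.ASK: frozenset({"read"}),
    Mode.PLAN: frozenset({"read", "plan"}),
    Mode.AGENT: frozenset({"read", "plan", "write", "execute"}),
}


def is_allowed(mode: Mode, action: str) -> bool:
    """Return True if the given mode permits the action."""
    return action in CAPABILITIES[mode]
```

Under this model, the file changes listed in the Agent example would be rejected in Ask or Plan mode, because neither includes the `write` capability.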

Choose your intelligence

On-premise

Run Thesis entirely on your own machine or cluster using Ollama. Your data never leaves your environment.

Learn about on-prem
Runtime: Local
Model: llama3.1:8b
Data: local only
Network: 0 external calls
Ideal for regulated or private data.
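To illustrate what "local only" means in practice, the sketch below sends a prompt to an Ollama server on the same machine using only the Python standard library. It assumes Ollama is running on its default port (11434) and that the `llama3.1:8b` model has already been pulled. How Thesis itself talks to Ollama is not documented here, so treat `build_request` and `generate` as hypothetical helpers.

```python
import json
import urllib.request

# Ollama's default local endpoint; the request never leaves localhost.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3.1:8b") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3.1:8b", timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Calling `generate("Summarize the columns in card_transactions.parquet")` would return the model's reply as a string, provided the local server is reachable.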

Frontier models

Access cutting-edge models from OpenAI, including reasoning models optimized for complex ML workflows.

Explore models
GPT-4o (Suggested)
o1-preview (Reasoning)
o3-mini (Reasoning)
Switch models per task or per run.

Integrated web search

Let the agent pull in documentation, papers, and live information while it works — without breaking context.

Learn about search
PyTorch DataLoader best practices
Searching documentation…

CLI-first workflow

Run Thesis programmatically from the command line. Perfect for automation, CI/CD, and large-scale experiments.

View CLI docs

$ thesis exec \
    "train a fraud model with time-split eval" \
    --model o1-preview \
    --mode agent
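For automation or CI, the same command can be driven from a script. The sketch below wraps `thesis exec` with Python's `subprocess` module and uses only the flags shown above (`--model` and `--mode`). `build_command` and `run` are hypothetical helper names, and treating a non-zero exit code as failure is an assumption about the CLI's behavior.

```python
import shlex
import subprocess
from typing import List


def build_command(prompt: str, model: str = "o1-preview", mode: str = "agent") -> List[str]:
    """Assemble the argv for `thesis exec` using only the flags shown above."""
    return ["thesis", "exec", prompt, "--model", model, "--mode", mode]


def run(prompt: str, model: str = "o1-preview", mode: str = "agent") -> int:
    """Run one experiment and return the process exit code."""
    cmd = build_command(prompt, model=model, mode=mode)
    print("$", shlex.join(cmd))  # echo the command for CI logs
    # Passing a list (not a shell string) avoids quoting problems in the prompt.
    return subprocess.run(cmd, check=False).returncode
```

A CI job could loop over a list of prompts, call `run` for each, and fail the build if any call returns a non-zero code.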

Try Thesis now.

Download for macOS