Convention Over Configuration
Here’s what frustrated me about ML frameworks:
- PyTorch: Too low-level, write your own training loops
- PyTorch Lightning: Better, but boilerplate-heavy
- Keras: Great API, but locked to TensorFlow (historically)
- Hugging Face: Amazing for transformers, but domain-specific
I wanted: Write the architecture, point to data, get training. No boilerplate.
NeuroScript now has this. Here’s training XOR:
# 1. Write the architecture (01-xor.ns)

```
neuron XOR():
    in: [batch, 2]
    out: [batch, 1]
    graph:
        in ->
        Linear(2, 4)
        ReLU()
        Linear(4, 1)
        Sigmoid()
        out
```
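For intuition, here is a hand-written PyTorch equivalent of the same architecture (illustrative only; the NeuroScript compiler's actual generated code may differ):

```python
import torch
import torch.nn as nn

# Hand-written PyTorch equivalent of the XOR neuron above
# (illustrative; not the compiler's actual output)
xor_model = nn.Sequential(
    nn.Linear(2, 4),   # [batch, 2] -> [batch, 4]
    nn.ReLU(),
    nn.Linear(4, 1),   # [batch, 4] -> [batch, 1]
    nn.Sigmoid(),      # squash to (0, 1)
)

# A [batch, 2] input produces a [batch, 1] output
out = xor_model(torch.tensor([[1.0, 0.0]]))
print(out.shape)  # torch.Size([1, 1])
```

The declared `in`/`out` shapes in the `.ns` file map directly onto the tensor shapes flowing through the layers.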
# 2. Create training data (xor_train.jsonl)

```
{"input": [0.0, 0.0], "target": [0.0]}
{"input": [0.0, 1.0], "target": [1.0]}
{"input": [1.0, 0.0], "target": [1.0]}
{"input": [1.0, 1.0], "target": [0.0]}
```
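Loading this format takes very little code; here is a minimal sketch of what a JSONL loader does (illustrative, not the runner's actual implementation):

```python
import json
import torch

def load_jsonl(path):
    """Read {"input": [...], "target": [...]} records into two tensors."""
    inputs, targets = [], []
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            inputs.append(record["input"])
            targets.append(record["target"])
    return torch.tensor(inputs), torch.tensor(targets)

# Demo: write the four XOR records, then load them back
records = [
    {"input": [0.0, 0.0], "target": [0.0]},
    {"input": [0.0, 1.0], "target": [1.0]},
    {"input": [1.0, 0.0], "target": [1.0]},
    {"input": [1.0, 1.0], "target": [0.0]},
]
with open("xor_train.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")

X, y = load_jsonl("xor_train.jsonl")
print(X.shape, y.shape)  # torch.Size([4, 2]) torch.Size([4, 1])
```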
# 3. Write a minimal config (xor_config.yml)

```yaml
model:
  neuron: XOR
  file: examples/01-xor.ns
data:
  train: examples/data/xor_train.jsonl
training:
  epochs: 1000
  lr: 0.01
```
# 4. Train

```bash
python -m neuroscript_runtime.runner train --config xor_config.yml
```

That's it!
What makes this work? Convention over Configuration
The runner:

- **Infers the task from input/output shapes:**
  - `[batch, 2] -> [batch, 1]` = Regression -> MSE loss
  - `[batch, seq] -> [batch, seq, vocab]` = Language Model -> CrossEntropy
  - `[batch, C, H, W] -> [batch, classes]` = Image Classification

- **Picks sensible defaults:**
  - Optimizer: Adam (good for most things)
  - Batch size: 32
  - Logging: Every 100 steps
  - Checkpointing: Every 1000 steps

- **Makes extension trivial:**

  ```python
  from neuroscript_runtime.contracts import DataLoaderContract, ContractRegistry

  class MyHuggingFaceLoader(DataLoaderContract):
      # Implement the interface
      pass

  # Register it
  ContractRegistry.register_dataloader("huggingface", MyHuggingFaceLoader)

  # Use in config:
  # data:
  #   format: huggingface
  #   dataset: "wikitext"
  ```
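The shape-to-task inference described above can be sketched roughly like this (names and rules are illustrative; the real runner's conventions may be richer):

```python
def infer_task(in_shape, out_shape):
    """Map (input shape, output shape) to a task name and default loss.

    Shapes use None for symbolic dimensions like batch.
    Illustrative sketch, not the runner's actual rules.
    """
    if len(in_shape) == 4 and len(out_shape) == 2:
        # [batch, C, H, W] -> [batch, classes]
        return ("image_classification", "cross_entropy")
    if len(in_shape) == 2 and len(out_shape) == 3:
        # [batch, seq] -> [batch, seq, vocab]
        return ("language_model", "cross_entropy")
    if len(in_shape) == 2 and len(out_shape) == 2:
        # [batch, n] -> [batch, m]
        return ("regression", "mse")
    raise ValueError(f"No convention for {in_shape} -> {out_shape}")

print(infer_task((None, 2), (None, 1)))  # ('regression', 'mse')
```

The XOR example above hits the regression rule, which is why the runner picks MSE without being told.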
The Contract System is the secret sauce. Five extension points:

- DataLoader: How to load data (default: JSONL files)
- Loss: How to compute error (default: inferred from task)
- Optimizer: How to update weights (default: Adam)
- Checkpoint: How to save/load (default: `torch.save`)
- Logger: How to track progress (default: console)
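A registry like this needs very little machinery. Here is a minimal sketch of the pattern (illustrative; the lookup method name `get_dataloader` and the internals are assumptions, not the actual `ContractRegistry` source):

```python
class ContractRegistry:
    """Minimal name -> implementation registry (sketch of the pattern)."""
    _dataloaders = {}

    @classmethod
    def register_dataloader(cls, name, loader_cls):
        cls._dataloaders[name] = loader_cls

    @classmethod
    def get_dataloader(cls, name):
        # Assumed lookup method; fails loudly on unknown formats
        try:
            return cls._dataloaders[name]
        except KeyError:
            raise ValueError(f"Unknown dataloader format: {name!r}")

class JSONLLoader:
    """Stand-in for the default JSONL implementation."""
    pass

ContractRegistry.register_dataloader("jsonl", JSONLLoader)
assert ContractRegistry.get_dataloader("jsonl") is JSONLLoader
```

The config's `format:` key then becomes a plain dictionary lookup, which is what keeps swapping implementations trivial.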
Ship with one good default for each. Make it trivial to swap in custom implementations. Let the community build the ecosystem.
The full Python API:

```python
import torch
from neuroscript_runtime.runner_v2 import train_from_config
from xor_model import XOR  # Generated by the NeuroScript compiler

model = XOR()
runner = train_from_config(model, "config.yml")

# Inference
result = runner.infer(torch.tensor([[1.0, 0.0]]))
print(result)  # [0.9999] ≈ 1.0
```
Why this matters: You can go from idea to trained model in minutes, not hours. When you need custom behavior, the extension points are obvious. When you need full control, it’s just PyTorch under the hood—drop down anytime.
Batteries included. Escape hatch provided.