TabPFN — zero-shot predictions
TabPFN is a pre-trained neural network that makes predictions on tabular data without any training step. It works immediately on your data.
When to use TabPFN
- Small to medium datasets (under 10,000 rows works best)
- Quick prototyping — get results in seconds
- No hyperparameter tuning needed
- Classification and regression tasks
Available modes
| Mode | Chat mode | Description |
|---|---|---|
| Classification | Classify | Predict categorical outcomes (churn/no churn, fraud/legitimate) |
| Regression | Predict | Predict continuous values (price, score, duration) |
scikit-learn — full ML pipeline
For larger datasets or when you need more control, Sourcetable uses scikit-learn under the hood.
Classification
| Algorithm | Best for |
|---|---|
| Random Forest | General purpose, handles mixed features well |
| Gradient Boosting | High accuracy, handles non-linear relationships |
| Logistic Regression | Interpretable, good baseline |
| SVM | High-dimensional data, clear margin of separation |
| k-Nearest Neighbors | Simple, non-parametric |
| Decision Tree | Interpretable, visual output |
| Naive Bayes | Text classification, very fast |
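To make the trade-offs in the table concrete, here is a minimal sketch that cross-validates each listed classifier on the same data using plain scikit-learn. It is an illustration only, not Sourcetable's internal code: the synthetic dataset, the `candidates` dictionary, and the `compare_classifiers` helper are all hypothetical names introduced for this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a spreadsheet table (assumption: your real data
# would come from your own sheet instead).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Scale-sensitive models (logistic regression, SVM, k-NN) get a scaler in front.
candidates = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "k-Nearest Neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
}


def compare_classifiers(models, X, y, cv=5):
    """Return the mean cross-validated accuracy for each named model."""
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in models.items()
    }


scores = compare_classifiers(candidates, X, y)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:22s} {score:.3f}")
```

The ranking depends heavily on the dataset, so treat the output as a starting point for choosing an algorithm rather than a general verdict.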
Regression
Clustering
| Algorithm | Best for |
|---|---|
| K-Means | Spherical clusters, known number of groups |
| DBSCAN | Arbitrary shapes, automatic cluster count |
| Hierarchical | Dendrogram visualization, nested groups |
| Gaussian Mixture | Overlapping clusters, soft assignments |
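The four clustering algorithms above can be tried side by side with scikit-learn. The sketch below is a hedged illustration on synthetic data: the blob dataset and the parameter values (`n_clusters=3`, `eps=0.3`, `min_samples=5`) are assumptions chosen for the example, not defaults that Sourcetable is known to use.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic data with three groups (hypothetical stand-in for real rows).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)
X = StandardScaler().fit_transform(X)

# K-Means: the number of groups must be given up front.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# DBSCAN: no cluster count; density parameters decide it. Label -1 marks noise.
dbscan_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)
n_dbscan_clusters = len(set(dbscan_labels) - {-1})

# Hierarchical (agglomerative): merges points bottom-up into nested groups.
hier_labels = AgglomerativeClustering(n_clusters=3).fit_predict(X)

# Gaussian Mixture: soft assignments, one probability per cluster per row.
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
gmm_probs = gmm.predict_proba(X)

print("K-Means clusters:     ", len(set(kmeans_labels)))
print("DBSCAN clusters found:", n_dbscan_clusters)
print("Hierarchical clusters:", len(set(hier_labels)))
print("GMM probabilities, first row:", gmm_probs[0].round(3))
```

DBSCAN's result is sensitive to `eps`, so that value usually needs adjusting to the scale of the data.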
Dimensionality reduction
End-to-end ML pipeline
When you ask the AI to build a model, it automatically handles:
- Data splitting — train/test split (default 80/20)
- Feature preprocessing — encoding categoricals, scaling numerics, handling missing values
- Model training — fits the chosen algorithm
- Evaluation — generates metrics and visualizations
- Results — writes predictions back to your spreadsheet
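The steps above map onto a standard scikit-learn pipeline. The following is a minimal sketch under stated assumptions, not Sourcetable's actual implementation: the churn-style table, its column names, and the choice of Random Forest are hypothetical, and a pandas DataFrame column stands in for "writing predictions back to your spreadsheet".

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical table with numeric and categorical columns plus missing values.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "tenure_months": rng.integers(1, 60, n).astype(float),
    "monthly_spend": rng.normal(50, 15, n),
    "plan": rng.choice(["basic", "pro", "enterprise"], n),
})
df.loc[rng.choice(n, 10, replace=False), "monthly_spend"] = np.nan
df["churned"] = (df["tenure_months"] < 20).astype(int)

numeric = ["tenure_months", "monthly_spend"]
categorical = ["plan"]

# Feature preprocessing: impute and scale numerics, one-hot encode categoricals.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
])

# Data splitting: 80/20 train/test, stratified on the target.
X_train, X_test, y_train, y_test = train_test_split(
    df[numeric + categorical], df["churned"],
    test_size=0.2, random_state=0, stratify=df["churned"],
)

# Model training and evaluation on the held-out 20%.
model.fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {test_accuracy:.3f}")

# Results: add predictions as a new column next to the original data.
df["predicted_churn"] = model.predict(df[numeric + categorical])
```

Keeping preprocessing inside the `Pipeline` means the imputer and scaler are fitted on the training rows only, which avoids leaking test-set statistics into the model.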
Model evaluation
The AI reports relevant metrics based on the task:
Classification metrics:
- Accuracy, Precision, Recall, F1 Score
- ROC curve and AUC
- Confusion matrix
- Classification report by class
Regression metrics:
- R-squared and Adjusted R-squared
- MAE (Mean Absolute Error)
- RMSE (Root Mean Squared Error)
- Residual plots
Clustering metrics:
- Silhouette score
- Calinski-Harabasz index
- Inertia (for K-Means)
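Most of the metrics listed above are available in `sklearn.metrics`. The sketch below shows one way to compute them; the three helper functions and the tiny hand-made arrays are hypothetical and exist only for illustration. Adjusted R-squared has no built-in scikit-learn function, so it is derived here from R-squared with the usual formula 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of rows and p the number of features.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (
    accuracy_score, calinski_harabasz_score, confusion_matrix, f1_score,
    mean_absolute_error, mean_squared_error, precision_score, r2_score,
    recall_score, roc_auc_score, silhouette_score,
)


def classification_metrics(y_true, y_pred, y_score):
    """Binary classification metrics; y_score is the positive-class score."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "roc_auc": roc_auc_score(y_true, y_score),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }


def regression_metrics(y_true, y_pred, n_features):
    """R-squared, adjusted R-squared, MAE, and RMSE."""
    n = len(y_true)
    r2 = r2_score(y_true, y_pred)
    return {
        "r2": r2,
        "adjusted_r2": 1 - (1 - r2) * (n - 1) / (n - n_features - 1),
        "mae": mean_absolute_error(y_true, y_pred),
        "rmse": float(np.sqrt(mean_squared_error(y_true, y_pred))),
    }


def clustering_metrics(X, fitted_kmeans):
    """Silhouette, Calinski-Harabasz, and inertia for a fitted K-Means model."""
    labels = fitted_kmeans.labels_
    return {
        "silhouette": silhouette_score(X, labels),
        "calinski_harabasz": calinski_harabasz_score(X, labels),
        "inertia": fitted_kmeans.inertia_,
    }


# Tiny hand-made examples (illustrative values only).
clf_report = classification_metrics(
    y_true=[0, 0, 1, 1, 1, 0],
    y_pred=[0, 1, 1, 1, 0, 0],
    y_score=[0.1, 0.6, 0.8, 0.9, 0.4, 0.2],
)
reg_report = regression_metrics(
    y_true=[3.0, 5.0, 7.0, 9.0, 11.0],
    y_pred=[2.5, 5.5, 7.0, 8.0, 11.5],
    n_features=1,
)
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
cluster_report = clustering_metrics(points, km)

for report in (clf_report, reg_report, cluster_report):
    print(report)
```

Residual plots, ROC curves, and the per-class classification report are left out to keep the sketch short; scikit-learn provides `classification_report` and `RocCurveDisplay` for the latter two.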