⚡ Interactive · Browser-Based · Zero Install

ML Classification
Pipeline

Upload any labeled CSV dataset, engineer features with per-column scaling and transformations, then train and compare multiple classification models with detailed performance analysis — entirely in your browser.

Logistic Regression · K-NN · Naive Bayes · Decision Tree · Random Forest · ROC · AUC · F1 · Confusion Matrix

📥 Load Your Dataset

Upload any CSV with a binary classification target column. After loading, you'll select which column is the target and which are features.

📂

Drop a CSV file here or click to browse

Any CSV with a header row and a binary target column (0/1, yes/no, true/false, or two distinct text labels).
Numeric and low-cardinality categorical features are both supported.

Supported formats: Standard CSV with a header row. Binary targets may be encoded as numbers (0/1), booleans (true/false), or any two distinct text values (e.g. "Yes"/"No", "Presence"/"Absence", "Benign"/"Malignant"). Feature columns should be numeric or low-cardinality categorical (≤12 unique values — these are one-hot encoded automatically).
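The target normalization and one-hot encoding described above can be sketched as follows (function names are illustrative, not the app's actual API):

```javascript
// Map any two distinct target labels to 0/1 (labels are sorted so the
// encoding is deterministic; the higher-sorting label becomes 1).
function encodeBinaryTarget(values) {
  const labels = [...new Set(values)].sort();
  if (labels.length !== 2) throw new Error("Target must have exactly two classes");
  return values.map(v => (v === labels[1] ? 1 : 0));
}

// One-hot encode a low-cardinality categorical column (≤ 12 unique values):
// each value becomes a 0/1 indicator vector over the sorted categories.
function oneHotColumn(values) {
  const cats = [...new Set(values)].sort();
  return values.map(v => cats.map(c => (v === c ? 1 : 0)));
}
```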

🔍 Explore Data

Preview rows, inspect statistics, and understand class balance before modeling.

📭

Load data in Step 1 first

⚙️ Feature Preprocessing

Choose a scaling or transformation for each numeric feature. Defaults are suggested based on each column's distribution.

Tip: Scale-sensitive models (Logistic Regression, K-NN) benefit from StandardScaler or MinMaxScaler on continuous features — K-NN because it is distance-based, Logistic Regression because gradient descent converges faster on standardized inputs. Tree-based models (Decision Tree, Random Forest) are scale-invariant. Skewed distributions may benefit from Log1p or Sqrt.
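The per-column scaling options mentioned in the tip can be sketched as plain array transforms (a minimal sketch; the app's real preprocessing code may differ):

```javascript
// StandardScaler: zero mean, unit variance (population std; falls back to 1
// for constant columns to avoid division by zero).
function standardScale(xs) {
  const mean = xs.reduce((a, b) => a + b, 0) / xs.length;
  const sd = Math.sqrt(xs.reduce((a, x) => a + (x - mean) ** 2, 0) / xs.length) || 1;
  return xs.map(x => (x - mean) / sd);
}

// MinMaxScaler: rescale into [0, 1].
function minMaxScale(xs) {
  const lo = Math.min(...xs), hi = Math.max(...xs);
  const range = hi - lo || 1;
  return xs.map(x => (x - lo) / range);
}

const log1p = xs => xs.map(x => Math.log1p(x)); // compresses right-skewed tails
const sqrt  = xs => xs.map(x => Math.sqrt(x));  // milder skew correction
```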
📭

Load data in Step 1 first

🎛️ Configure Models

Select classifiers to train and adjust the train/test split.

Select Models
📐
Logistic Regression
Gradient descent · Linear boundary
🔵
K-Nearest Neighbors
Distance-based · k=7
🔔
Gaussian Naive Bayes
Probabilistic · Feature independence
🌲
Decision Tree
CART · Gini impurity · depth 8
🌳
Random Forest
Ensemble · 20 trees · bagging
Gradient Boosting
Sequential trees · XGBoost-style
🔁
AdaBoost
Weighted stumps · adaptive boosting
✂️
Linear SVM
SGD · hinge loss · max-margin
Train / Test Split
Train Test
Training: 80%  ·  Testing: 20%  ·  Random seed: 100
Options
No experiment variants saved. Go to Step 3 → Preprocessing to save variants and compare feature engineering approaches.
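The seeded 80/20 shuffle-and-split shown above can be sketched like this (the LCG random generator here is an assumption for illustration; the app's actual RNG may differ):

```javascript
// Reproducible train/test split: a small linear congruential generator (LCG)
// seeds a Fisher–Yates shuffle, so the same seed always yields the same split.
function seededSplit(rows, trainFrac = 0.8, seed = 100) {
  let s = seed;
  const rand = () => (s = (s * 1664525 + 1013904223) % 4294967296) / 4294967296;
  const idx = rows.map((_, i) => i);
  for (let i = idx.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [idx[i], idx[j]] = [idx[j], idx[i]];
  }
  const cut = Math.round(idx.length * trainFrac);
  return {
    train: idx.slice(0, cut).map(i => rows[i]),
    test:  idx.slice(cut).map(i => rows[i]),
  };
}
```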

🏋️ Training

Models are trained in-browser using pure JavaScript implementations — no server or Python required.

Waiting to start…
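As one example of what "pure JavaScript training" looks like, here is a minimal logistic regression trained by batch gradient descent (a sketch only — the app's implementations may add regularization, learning-rate schedules, or early stopping):

```javascript
// Train a binary logistic regression classifier with batch gradient descent.
// X: array of numeric feature rows, y: array of 0/1 labels.
// Returns a predict function mapping a row to 0 or 1.
function trainLogReg(X, y, lr = 0.1, epochs = 500) {
  const n = X.length, d = X[0].length;
  let w = new Array(d).fill(0), b = 0;
  const sigmoid = z => 1 / (1 + Math.exp(-z));
  for (let e = 0; e < epochs; e++) {
    const gw = new Array(d).fill(0);
    let gb = 0;
    for (let i = 0; i < n; i++) {
      // Forward pass: predicted probability for row i.
      const p = sigmoid(X[i].reduce((s, x, j) => s + x * w[j], b));
      const err = p - y[i]; // gradient of log-loss w.r.t. the logit
      for (let j = 0; j < d; j++) gw[j] += err * X[i][j];
      gb += err;
    }
    // Average gradients and take a descent step.
    for (let j = 0; j < d; j++) w[j] -= (lr / n) * gw[j];
    b -= (lr / n) * gb;
  }
  return x => (sigmoid(x.reduce((s, v, j) => s + v * w[j], b)) >= 0.5 ? 1 : 0);
}
```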

📊 Results

Comprehensive evaluation across all trained classifiers.

📭

Train models first in Step 5
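The confusion-matrix-derived metrics shown in this step (precision, recall, F1) reduce to a few counts, as in this sketch:

```javascript
// Compute confusion-matrix counts and derived metrics for 0/1 labels.
// The `|| 0` guards turn 0/0 (NaN) into 0 for degenerate cases.
function classificationMetrics(yTrue, yPred) {
  let tp = 0, fp = 0, fn = 0, tn = 0;
  for (let i = 0; i < yTrue.length; i++) {
    if (yPred[i] === 1) yTrue[i] === 1 ? tp++ : fp++;
    else                yTrue[i] === 1 ? fn++ : tn++;
  }
  const precision = tp / (tp + fp) || 0;
  const recall = tp / (tp + fn) || 0;
  const f1 = (2 * precision * recall) / (precision + recall) || 0;
  return { tp, fp, fn, tn, precision, recall, f1 };
}
```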

🔮 Predict on New Data

Upload an unlabelled CSV — every trained model runs on each row, and the predictions appear side by side so you can see where models agree and where they diverge.

⚠️ Train at least one model in Step 5 before generating predictions.
Upload Prediction CSV
📂

Drop unlabelled CSV here or click to browse

Must contain the same feature columns as your training set. No target column required — it will be ignored if present.
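The side-by-side comparison can be sketched as a simple fan-out over trained models (assuming each model is a function from a feature row to a 0/1 prediction, as in the training sketch above):

```javascript
// Run every trained model on each row and flag rows where all models agree.
// `models` is an object mapping model name → predict function.
function predictAll(models, rows) {
  return rows.map(row => {
    const preds = Object.fromEntries(
      Object.entries(models).map(([name, m]) => [name, m(row)])
    );
    const votes = Object.values(preds);
    return { ...preds, agree: votes.every(v => v === votes[0]) };
  });
}
```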