Instructions for use:
Click on Show per dataset for the visual storytelling per data set, where the embeddings obtained with SM, SNE and t-SNE (one per row) are shown after different numbers of iterations (one per column).
Click on Show per technique for the visual storytelling per technique, where the embeddings of the different data sets (one per row) are shown after different numbers of iterations (one per column).
Click on From 2D to 1D or From 3D to 2D for the visual storytelling per low-dimensional data set. N.B.: the final embeddings in From 2D to 1D are 1-dimensional spaces in which the data points have been jittered to prevent overplotting.
Click on Clear after every visualization to display the next one.
Interact with the visualizations by using the brushing and linking technique to find interesting data patterns.
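The jittering mentioned for From 2D to 1D can be sketched as follows. How this project applies the jitter is not stated here, so this is only one common approach under assumptions: each 1-dimensional coordinate is kept unchanged and a small random offset is added on an auxiliary axis used for display only. The function name jitter_1d and the parameters amount and seed are hypothetical.

```python
import random

def jitter_1d(values, amount=0.1, seed=0):
    """Pair each 1D coordinate with a small random display offset.

    The x value is the original embedding coordinate and is not altered;
    the y value is noise drawn uniformly from [-amount, amount], so that
    points sharing the same x no longer overlap when plotted.
    """
    rng = random.Random(seed)  # fixed seed so the plot is reproducible
    return [(x, rng.uniform(-amount, amount)) for x in values]

# Hypothetical 1D embedding with repeated coordinates (overplotting).
embedding_1d = [0.0, 0.0, 0.0, 1.5, 1.5]
points = jitter_1d(embedding_1d)
```

The offset carries no information; only the horizontal position of a point should be interpreted.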
The data sets used are:
Artificial Data:
Clustered data - 1000x100 - Labels: 7 clusters
Non-clustered data - 1000x40 - Labels: 5 classes randomly assigned
Clustered data - 2D - 1600x2 - Labels: 4 clusters
Circle data - 2D - 1671x2 - Labels: 2 classes (circle + centre)
Mickey Mouse data - 2D - 500x2 - Labels: 3 classes + noise
Clustered data - 3D - 1250x3 - Labels: 4 clusters (from 4 different normal distributions)
Swiss Roll data - 3D - 1600x3 - Labels: 4 clusters
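As an illustration of the 3D clustered data listed above (1250x3, 4 clusters drawn from 4 different normal distributions), a comparable data set could be generated as sketched below. The cluster sizes, centres and standard deviation are assumptions made for this example and are not taken from the project; the function name make_clusters is hypothetical.

```python
import random

def make_clusters(sizes, centres, std=0.5, seed=0):
    """Draw one isotropic normal cluster per (size, centre) pair.

    Returns the points as a list of coordinate lists and the matching
    integer cluster labels.
    """
    rng = random.Random(seed)
    points, labels = [], []
    for label, (n, centre) in enumerate(zip(sizes, centres)):
        for _ in range(n):
            points.append([rng.gauss(c, std) for c in centre])
            labels.append(label)
    return points, labels

# Assumed values: 1250 points split over 4 clusters with arbitrary centres.
sizes = [313, 313, 312, 312]
centres = [(0, 0, 0), (5, 0, 0), (0, 5, 0), (0, 0, 5)]
points, labels = make_clusters(sizes, centres)
```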
Real Data:
Ionosphere data - 351x34 - Labels: 'Good'/'Bad'
Churn data (private data set) - 5000x17 - Labels: 'Churn'/'No churn'
Semeion Handwritten Digit data - 1593x256 - No labels available
Mushroom data - 3000x23 - Labels: 'Edible'/'Poisonous'
Breast Cancer Wisconsin (Diagnostic) data - 569x32 - Labels: 'Malignant'/'Benign'
This project is based on the great work of Laurens van der Maaten.