Demo

Demo website

Click here to open the online demo in a new browser tab. You can also try the use cases listed below in our system; the dataset and key views for each use case are already included.

Public Datasets

| Dataset Name | Description | Related Use Case | Demo URL |
| --- | --- | --- | --- |
| Imputation | The Mushroom dataset, with a test set of 2,000 randomly selected instances (of 8,124). | Model Selection | demo |
| IMDB Confidence | The data consists of 5,044 movies with 27 features; 25% is sequestered for final assessment. A stratified sample of 200 movies per class is held out from the 3,756 for testing. | Model Selection and Tuning | demo |
| recid | This dataset is used for fair learning: the Broward County recidivism dataset, popularized by ProPublica. It contains 6,172 instances and 14 numeric features (created by one-hot encoding the categorical features in the initial seven-feature data set). 20% are held out for testing. | Fairness Assessment | demo |
| date-12000-strat | The dataset is the TCP collection of historical documents. A random sample of 12,000 documents was taken, and 30% was held out using stratified sampling. While the testing set is balanced (1,800 per class), the training set is highly skewed (only 15% before 1642). | Bias and Data Discovery | demo |
| fuzz-mod-5-02 | The dataset is a collection of 554 plays written in the Early Modern Period (1470-1660). Five linguistic features are used. It contains four kinds of plays: Comedy, History, Tragedy, and Tragicomedy. | Feature Sensitivity Testing | demo |
| tcp-tree-select-9-10 | This dataset considers a corpus of 59,989 documents from a historical literary collection. The data counts the 500 most common English words in each document. | Model Selection and Data Discovery | demo |
| (continuous) heart disease | A standard dataset used in machine learning education. Classifiers are trained to predict whether a patient is likely to develop a disease (binary decision). | (Continuous) Model Selection and Calibration Analysis | demo |
| (continuous) wine quality | The dataset is used for wine quality classification, which requires classifying the quality of a wine from its properties. | (Continuous) Hyperparameter Tuning | demo |
| (continuous) income | The dataset is a downsampled version of an income classification benchmark dataset. Classifiers determine whether an individual's income is above a certain level. | (Continuous) Model Selection | demo |
| (continuous) cifar-sampled-scaling | The dataset is based on the CIFAR 100 computer vision benchmark and was created using TensorFlow. The data set has 100 classes, and the trained classifier produces a distribution over these classes as its decision. A binary classifier has been created for a "meta-class" that combines 5 of the main classes: it aims to classify flowers, which can be any one of 5 of the original classes. Because the test set contains all 100 classes, it is quite imbalanced: flowers are only 5% of the total instances. | (Continuous) Model Selection and Detail Examination | demo |
| (continuous) cdate-2500 | This dataset considers a corpus of 59,989 documents from a historical literary collection: Text Creation Partnership (TCP) transcriptions of the Early English Books Online (EEBO). The data counts the 500 most common English words in each document. For the experiment, we took a random sample of 2,500 documents and held out 30% using stratified sampling. | (Continuous) Data Examination | demo |
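
Several of the datasets above are described as holding out a test set "using stratified sampling". As a rough illustration of what such a split can look like, here is a minimal Python sketch. It is not the code used to build these datasets: the function name `stratified_holdout`, the toy labels, and the seed are all hypothetical, and it shows only the proportional variant, in which the same fraction of every class is held out.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.3, seed=0):
    """Split indices into (train, test), holding out the same
    fraction of every class (proportional stratification).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group instance indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    return sorted(train_idx), sorted(test_idx)


# Toy, made-up labels: a skewed two-class problem (15% vs. 85%).
labels = ["early"] * 150 + ["late"] * 850
train_idx, test_idx = stratified_holdout(labels, test_fraction=0.3, seed=42)
```

With these toy labels the test set keeps the 15%/85% class ratio of the full sample. A balanced test set, such as the one described for date-12000-strat (1,800 per class), or a fixed count per class, as described for IMDB Confidence, would instead take the same number of instances from each class rather than the same fraction.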