API for Datasets
Base Dataset Module
The base dataset defines the data format and provides helpers for generating complementary labels.
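To make "generating complementary labels" concrete, here is a minimal sketch of one common scheme: for each sample, draw a label uniformly from the classes other than the true one. The function name `sample_complementary_labels` and its `seed` argument are hypothetical, and this is not libcll's implementation; it only illustrates the idea in plain Python.

```python
import random


def sample_complementary_labels(true_labels, num_classes, seed=0):
    """Draw one complementary label per sample, uniformly from the classes
    other than the true one (an assumed, simplified generation scheme)."""
    rng = random.Random(seed)
    complementary = []
    for y in true_labels:
        # An offset in [1, num_classes - 1] guarantees the result differs from y.
        offset = rng.randrange(1, num_classes)
        complementary.append((y + offset) % num_classes)
    return complementary


true_labels = [0, 3, 9, 3]
complementary_labels = sample_complementary_labels(true_labels, num_classes=10)
print(complementary_labels)
```

Each returned label is a class the sample does not belong to, which is the kind of value the `targets` attribute is documented to hold.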
Complementary-label Dataset Module
Only CLMNIST and CLCIFAR10 are documented here, as examples of a synthetic and a real-world dataset respectively; the other datasets in libcll share the same attributes, functions, and initialization procedure.
- class libcll.datasets.CLMNIST(root='./data/mnist', train=True, transform=ToTensor(), target_transform=None, download=True)[source]
Bases: MNIST, CLBaseDataset
- Parameters:
root (str) – path to store dataset file.
train (bool) – training set if True, else testing set.
transform (callable, optional) – a function/transform that takes in a PIL image and returns a transformed version.
target_transform (callable, optional) – a function/transform that takes in the target and transforms it.
download (bool) – if True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
- data
the features of the samples.
- Type:
Tensor
- targets
the complementary labels of the corresponding samples.
- Type:
Tensor
- true_targets
the ground-truth labels of the corresponding samples.
- Type:
Tensor
- num_classes
the number of classes.
- Type:
int
- input_dim
the dimension of the feature space after each sample is flattened into a 1D vector.
- Type:
int
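As a rough illustration of what `input_dim` describes, the sketch below computes the length of a sample once it is flattened into a 1D vector. The helper `flattened_dim` is hypothetical, and the shapes used are the standard MNIST (1×28×28) and CIFAR-10 (3×32×32) image shapes; whether libcll stores exactly these values is an assumption.

```python
from math import prod


def flattened_dim(sample_shape):
    """Number of features after flattening one sample into a 1D vector."""
    return prod(sample_shape)


print(flattened_dim((1, 28, 28)))  # 784 for a grayscale 28x28 image
print(flattened_dim((3, 32, 32)))  # 3072 for an RGB 32x32 image
```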
- class libcll.datasets.CLCIFAR10(root='./data/cifar10', train=True, transform=None, target_transform=None, download=True, num_cl=1)[source]
Bases: CIFAR10, CLBaseDataset
Real-world complementary-label dataset. Call gen_complementary_target() if you want to access synthetic complementary labels.
- Parameters:
root (str) – path to store dataset file.
train (bool) – training set if True, else testing set.
transform (callable, optional) – a function/transform that takes in a PIL image and returns a transformed version.
target_transform (callable, optional) – a function/transform that takes in the target and transforms it.
download (bool) – if True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
num_cl (int) – the number of real-world complementary labels per sample, chosen from [1, 3].
- data
the features of the samples.
- Type:
Tensor
- targets
the complementary labels of the corresponding samples.
- Type:
Tensor
- true_targets
the ground-truth labels of the corresponding samples.
- Type:
Tensor
- num_classes
the number of classes.
- Type:
int
- input_dim
the dimension of the feature space after each sample is flattened into a 1D vector.
- Type:
int
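Because both `targets` and `true_targets` are exposed, the two can be compared directly. The sketch below estimates how often a complementary label coincides with the ground-truth label, which should not happen for correctly generated synthetic labels but is possible with human annotation. The function `complementary_label_noise` is hypothetical and works on plain Python ints and lists; it assumes tensors have first been converted with `.tolist()`, and the sample values are made up for illustration.

```python
def complementary_label_noise(targets, true_targets):
    """Fraction of complementary labels that equal the ground-truth label.

    Each entry of `targets` may be a single int or a sequence of ints
    (for example, when several complementary labels are kept per sample).
    """
    total = 0
    wrong = 0
    for labels, y in zip(targets, true_targets):
        labels = [labels] if isinstance(labels, int) else list(labels)
        total += len(labels)
        # A complementary label is "noisy" when it names the true class.
        wrong += sum(1 for c in labels if c == y)
    return wrong / total if total else 0.0


# Made-up stand-in data: three samples with three complementary labels each.
targets = [[1, 2, 3], [0, 4, 4], [5, 5, 7]]
true_targets = [0, 4, 9]
print(complementary_label_noise(targets, true_targets))
```

A result of 0.0 means no complementary label matches its sample's true class; any larger value gives the share of labels that do.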