API for Datasets

Base Dataset Module

Datasets define the data format and provide helpers for generating complementary labels.
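A complementary label names a class that a sample does not belong to. The sketch below shows one common way to generate such labels, drawing uniformly from the classes other than the true one. It uses only the Python standard library. The function name `uniform_complementary_labels` is hypothetical and is not part of libcll's API; the default seed simply mirrors the `seed=1126` default shown in the signatures on this page.

```python
import random


def uniform_complementary_labels(true_labels, num_classes, seed=1126):
    """Draw one complementary label per sample, uniformly at random from
    the classes other than the true one.

    Hypothetical helper for illustration only, not libcll's implementation.
    """
    rng = random.Random(seed)
    comp_labels = []
    for y in true_labels:
        # Pick among the K-1 remaining classes, then shift the index
        # past the true label so the true label is never returned.
        c = rng.randrange(num_classes - 1)
        comp_labels.append(c if c < y else c + 1)
    return comp_labels


true_labels = [0, 3, 9, 3, 7]
comp_labels = uniform_complementary_labels(true_labels, num_classes=10)
```

Each entry of `comp_labels` is a valid class index that differs from the corresponding entry of `true_labels`.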

Complementary-label Dataset Module

Only CLMNIST and CLCIFAR10 are documented here, as examples of a synthetic and a real-world dataset respectively. The other datasets in libcll share the same attributes, functions, and initialization procedures.

class libcll.datasets.CLMNIST(root='./data/mnist', train=True, transform=ToTensor(), target_transform=None, download=True)[source]

Bases: MNIST, CLBaseDataset

Parameters:
  • root (str) – path to store dataset file.

  • train (bool) – training set if True, else testing set.

  • transform (callable, optional) – a function/transform that takes in a PIL image and returns a transformed version.

  • target_transform (callable, optional) – a function/transform that takes in the target and transforms it.

  • download (bool) – if True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.

data

the features of the samples in the dataset.

Type:

Tensor

targets

the complementary labels of the corresponding samples.

Type:

Tensor

true_targets

the ground-truth labels of the corresponding samples.

Type:

Tensor

num_classes

the number of classes.

Type:

int

input_dim

the dimensionality of the feature space after each sample is flattened into a 1D vector.

Type:

int
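As a quick illustration of what input_dim describes, the sketch below computes the length of a flattened sample from its shape. It assumes input_dim equals the product of a sample's dimensions, which is an interpretation of the description above rather than a statement about libcll's code. `flattened_dim` is a hypothetical helper; the shapes are the standard MNIST (1×28×28) and CIFAR-10 (3×32×32) image shapes.

```python
import math


def flattened_dim(sample_shape):
    """Return the length of a sample after flattening it into a 1D vector.

    Hypothetical helper for illustration only, not part of libcll.
    """
    return math.prod(sample_shape)


# One grayscale 28x28 MNIST image.
mnist_dim = flattened_dim((1, 28, 28))

# One RGB 32x32 CIFAR-10 image.
cifar10_dim = flattened_dim((3, 32, 32))
```

Under this assumption, a flattened MNIST image has 784 features and a flattened CIFAR-10 image has 3072.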

classmethod build_dataset(dataset_name=None, train=True, num_cl=0, transition_matrix=None, noise=None, seed=1126)[source]

class libcll.datasets.CLCIFAR10(root='./data/cifar10', train=True, transform=None, target_transform=None, download=True, num_cl=1)[source]

Bases: CIFAR10, CLBaseDataset

Real-world complementary-label dataset. Call gen_complementary_target() if you want to access synthetic complementary labels.

Parameters:
  • root (str) – path to store dataset file.

  • train (bool) – training set if True, else testing set.

  • transform (callable, optional) – a function/transform that takes in a PIL image and returns a transformed version.

  • target_transform (callable, optional) – a function/transform that takes in the target and transforms it.

  • download (bool) – if True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.

  • num_cl (int) – the number of real-world complementary labels for each sample, chosen from [1, 3].

data

the features of the samples in the dataset.

Type:

Tensor

targets

the complementary labels of the corresponding samples.

Type:

Tensor

true_targets

the ground-truth labels of the corresponding samples.

Type:

Tensor

num_classes

the number of classes.

Type:

int

input_dim

the dimensionality of the feature space after each sample is flattened into a 1D vector.

Type:

int

classmethod build_dataset(dataset_name=None, train=True, num_cl=0, transition_matrix=None, noise=None, seed=1126)[source]
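The build_dataset signature accepts a transition_matrix and a seed. How libcll interprets them is not described on this page, so the sketch below only illustrates the general idea, under the assumption that row i of the matrix gives the probability of each complementary label when the true label is i. `sample_from_transition_matrix` is a hypothetical helper written with the standard library; it does not call libcll.

```python
import random


def sample_from_transition_matrix(true_labels, transition_matrix, seed=1126):
    """Sample one complementary label per sample.

    Assumes transition_matrix[y][c] is the probability of drawing
    complementary label c when the true label is y (an assumption,
    not libcll's documented behavior).
    """
    rng = random.Random(seed)
    classes = range(len(transition_matrix))
    return [
        rng.choices(classes, weights=transition_matrix[y], k=1)[0]
        for y in true_labels
    ]


num_classes = 4
# Uniform transition matrix: zero on the diagonal (the true label is never
# drawn) and equal probability 1/(K-1) for every other class.
uniform_T = [
    [0.0 if i == j else 1.0 / (num_classes - 1) for j in range(num_classes)]
    for i in range(num_classes)
]

true_labels = [0, 1, 2, 3, 2]
comp_labels = sample_from_transition_matrix(true_labels, uniform_T)
```

Fixing the seed makes the sampled labels reproducible. A non-uniform matrix with a zero diagonal would model biased complementary labels in the same way.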

Complementary-label Dataset Utilities