There are two challenge tracks. In the low bit-rate track, images need to be compressed to below 0.15 bits per pixel (bpp). This is the same task as in previous years, which allows us to measure progress over the years. As a first step towards video compression, this year also includes a P-frame track. Here, video P-frames need to be predicted from a previous frame.
For the low bit-rate track (similar to the one we ran at CLIC 2018), contestants will be asked to compress the entire dataset to 0.15 bpp or less. The winners of the competition will be chosen based on a human perceptual rating task and will be asked to give a short talk at the CLIC workshop. PSNR and MS-SSIM will be evaluated but not considered for prizes. We will provide last year's professional and mobile datasets (all splits) as training data for this challenge track. A new test set will be generated for this year and released during the test phase.
We provide two training datasets: Dataset P ("professional") and Dataset M ("mobile"). Together they contain around two thousand images, collected to be representative of images commonly found in the wild. Training on additional data is allowed. (The data we provide should be sufficient for good results, but we expect participants to have access to additional data.)
Participants will need to submit a decoder along with the encoded image files. The test dataset will be released at a later point. To ensure that the decoder is not optimized for the test set, only decoders submitted during the validation phase can be used during the test phase.
The challenge data is released by the Computer Vision Lab of ETH Zurich, and can be downloaded here:
For the low-rate track, the total size of all compressed images for the validation set must not exceed 4,722,341 bytes. This year we are also limiting the model size, which must not exceed 500 MB.
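As a quick sanity check before submitting, the byte budget, the 0.15 bpp target, and the model-size limit can be verified with a few lines (a sketch only; the two constants come from this page, while the helper names and the bpp check over total pixel count are our own):

```python
MAX_TOTAL_BYTES = 4_722_341    # validation-set byte budget (low-rate track)
MAX_MODEL_BYTES = 500 * 10**6  # 500 MB model-size limit
BPP_LIMIT = 0.15               # bits per pixel, averaged over the dataset

def bits_per_pixel(total_bytes, total_pixels):
    """Average bpp of the compressed dataset."""
    return total_bytes * 8 / total_pixels

def check_submission(compressed_bytes, total_pixels, model_bytes):
    """Return True iff all three limits are respected."""
    return (compressed_bytes <= MAX_TOTAL_BYTES
            and bits_per_pixel(compressed_bytes, total_pixels) <= BPP_LIMIT
            and model_bytes <= MAX_MODEL_BYTES)
```

For example, 15 bytes for an 800-pixel dataset is exactly 0.15 bpp.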
The P-frame challenge will require entrants to compress a video frame conditioned on the previous frame. Instead of splitting the dataset into training and test sets, the entire dataset for this track is released before the test phase. To discourage overfitting, the model size is added to the compressed dataset size, and the sum cannot exceed a target bit-rate; that is, participants should try to minimize both the dataset size and the model size. The winner will be determined based on MS-SSIM.
We release a dataset of 909 videos from the UGC Dataset, with a total of ~560,000 frames. Each video is released as a zip file, and each zip file contains PNGs representing the frames of the video. Each frame is represented by three PNGs, one for each channel of a YUV encoding. This format was chosen because the Y channel has twice the resolution of the other two channels.
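To illustrate the layout, the U and V planes must be upsampled to the Y resolution before the frame can be reassembled. A minimal sketch with synthetic data (an actual decoder would read the three PNGs from the zip files; the nearest-neighbor choice and the function name here are our own illustration, not part of the challenge spec):

```python
def upsample_nearest(plane):
    """Double a chroma plane's resolution by repeating each sample 2x2.

    `plane` is a list of rows; the result matches the luma (Y) resolution.
    """
    out = []
    for row in plane:
        doubled = [v for v in row for _ in (0, 1)]  # repeat each column
        out.append(doubled)
        out.append(list(doubled))                   # repeat each row
    return out

# Synthetic 2x2 chroma plane -> 4x4 plane at luma resolution.
u = [[1, 2],
     [3, 4]]
u_full = upsample_nearest(u)
# u_full == [[1, 1, 2, 2],
#            [1, 1, 2, 2],
#            [3, 3, 4, 4],
#            [3, 3, 4, 4]]
```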
To download the data, you can use the download.sh script from the devkit (the full download is 382.2 GB):
bash download.sh <OUTDIR>
You can have a look at a subset of the data by limiting the number of videos:
bash download.sh <OUTDIR> --max_videos 10
In the validation and test phases, we will evaluate your submissions on a randomly chosen subset of the data presented above. The total combined size of the compressed data and the model will be estimated as follows:
model_size + 100 * data_size.
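For a concrete example, the formula above can be evaluated directly (a sketch; the variable names mirror the formula, the sample numbers are ours):

```python
def combined_size(model_size, data_size):
    """Scoring proxy from the rules: model_size + 100 * data_size."""
    return model_size + 100 * data_size

# e.g. a 200 MB model with 5 MB of compressed subset data:
total = combined_size(200 * 10**6, 5 * 10**6)
# total == 700 * 10**6 (700 MB): the data term dominates.
```

Because the subset size is multiplied by 100, shaving bytes off the compressed data matters far more than shrinking the model.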
If you already have the training set, you may use the following .txt files to filter for the subset:
The first file contains the names of the target frames evaluated during the validation phase; decoders should reproduce these files. The second file contains the names of the corresponding input files. These files will be available in the working directory of the decoder on the evaluation server and do not need to be uploaded with the decoder.
If you want to download only the validation set, you can use this direct link: