There are two challenge tracks. In the low bit-rate track, images need to be compressed to below 0.15 bits per pixel (bpp). This is the same task as in previous years, allowing us to measure progress from year to year. As a first step towards video compression, this year also includes a P-frame track, in which video P-frames need to be predicted from the previous, uncompressed frame at a target bit rate of 0.075 bpp.

Low-rate compression

For the low bit-rate track (similar to the one we ran at CLIC 2018), contestants will be asked to compress the entire dataset to 0.15 bpp or less. The winners of the competition will be chosen based on a human perceptual rating task and will be asked to give a short talk at the CLIC workshop. PSNR and MS-SSIM will be evaluated but not considered for prizes. We will provide last year’s professional and mobile datasets (all splits) as training data for this challenge track. A new test set will be generated for this year and released during the test phase.
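As a rough guide, the per-image byte budget implied by the 0.15 bpp target can be computed from the image dimensions. The sketch below uses a hypothetical 1920×1080 image; the official constraint is on the total size of all compressed images, not on individual files:

```shell
# Byte budget for one image at 0.15 bpp (illustrative only; the
# official limit applies to the total size of the submission).
WIDTH=1920            # hypothetical image width
HEIGHT=1080           # hypothetical image height
BPP_MILLI=150         # 0.15 bpp, in thousandths, to keep the arithmetic integer
# bits = bpp * width * height; bytes = bits / 8
MAX_BYTES=$(( WIDTH * HEIGHT * BPP_MILLI / 1000 / 8 ))
echo "$MAX_BYTES"     # byte budget for this single image
```

At 1920×1080 this works out to 38,880 bytes per image.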


We provide two training datasets: Dataset P (“professional”) and Dataset M (“mobile”). The datasets were collected to be representative of images commonly found in the wild and contain around two thousand images in total. Training on additional data is allowed; the data we provide should be enough for good results, but we expect participants to have access to additional data.

Participants will need to submit a decoder along with the encoded image files. The test dataset will be released at a later point. To ensure that decoders are not optimized for the test set, only decoders submitted during the validation phase can be used during the test phase.

The challenge data is released by the Computer Vision Lab of ETH Zurich, and can be downloaded here:

The total size of all compressed images should not exceed 4,722,341 bytes for the validation set of the low-rate track. For the test set, the total size should stay below 22,540,456 bytes. This year we are also limiting the model size, which should not exceed 500MB.
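A submission can be checked against these budgets before uploading by summing the sizes of the encoded files. The sketch below assumes a hypothetical directory named `submission` holding one compressed file per image, and uses dummy files for demonstration:

```shell
# Verify a (hypothetical) submission directory against the low-rate
# validation budget of 4,722,341 bytes.
LIMIT=4722341

# Demo setup: two dummy files standing in for encoded images.
mkdir -p submission
head -c 1000 /dev/zero > submission/img1.bin   # 1000 bytes
head -c 2000 /dev/zero > submission/img2.bin   # 2000 bytes

TOTAL=0
for f in submission/*; do
  [ -f "$f" ] || continue
  SIZE=$(wc -c < "$f")       # size of this encoded file in bytes
  TOTAL=$(( TOTAL + SIZE ))
done

if [ "$TOTAL" -le "$LIMIT" ]; then
  echo "OK: $TOTAL bytes"
else
  echo "OVER BUDGET: $TOTAL bytes"
fi
```

For the test phase, the same check applies with the 22,540,456-byte limit.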


The CLIC Mobile datasets are released under the CC-0 License. The CLIC Professional datasets are released under the Unsplash license.

P-frame compression

The P-frame challenge will require entrants to compress a video frame conditioned on the previous (uncompressed) frame. Instead of splitting the dataset into training and test sets, in this track the entire dataset is released before the test phase. To discourage overfitting, the model size is added to the compressed dataset size, and the sum must stay within a budget corresponding to the target bit rate. That is, participants should try to minimize both the dataset size and the model size. The winner will be determined based on MS-SSIM.


We release a dataset of 739 videos from the UGC Dataset, with a total of ~466,684 frames. Each video is released as a zip file containing PNGs representing the frames of the video. Each frame is represented by 3 PNGs, one for each channel of a YUV encoding. This format was chosen because the Y channel has twice the resolution of the other two channels.
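Given the naming pattern used for the frame PNGs (see the ffmpeg commands under "Data generation"), the U and V plane files for a frame can be derived from the Y plane filename. The video name below is made up for illustration:

```shell
# Derive the U and V plane filenames from a Y plane filename,
# following the ${VIDEO_NAME}/${VIDEO_NAME}_%05d_<plane>.png pattern.
Y_FILE=video123/video123_00001_y.png   # hypothetical example frame
U_FILE=${Y_FILE%_y.png}_u.png          # strip the _y.png suffix, append _u.png
V_FILE=${Y_FILE%_y.png}_v.png
echo "$U_FILE"
echo "$V_FILE"
```

Note that the Y plane is 1280×720 while the U and V planes are 640×360.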

To download the data to <OUTDIR>, you can use the download.sh script from the devkit (the full dataset is ~250GB):

bash download.sh <OUTDIR>

You can download a subset of the data by running

bash download.sh <OUTDIR> --max_videos 10


The CLIC P-Frame dataset is released under the CC0 License.

Validation and test phases

In the validation and test phases, we will evaluate your submissions on a randomly chosen subset of the data presented above. The combined size of the model and the compressed data will be estimated as model_size + 100 * data_size. This total should not exceed 3,900,000,000 bytes (about 0.075 bpp).
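The budget check above can be sketched as follows; the model and data sizes used here are made-up placeholders:

```shell
# Combined-size check for the P-frame track:
#   model_size + 100 * data_size <= 3,900,000,000 bytes
BUDGET=3900000000
MODEL_SIZE=200000000   # hypothetical: a 200 MB model
DATA_SIZE=30000000     # hypothetical: 30 MB of compressed evaluation data

COMBINED=$(( MODEL_SIZE + 100 * DATA_SIZE ))
if [ "$COMBINED" -le "$BUDGET" ]; then
  echo "within budget: $COMBINED bytes"
else
  echo "over budget: $COMBINED bytes"
fi
```

The factor of 100 on the data size reflects that only a subset of the data is evaluated, so the compressed data dominates the budget.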

If you want to download only the validation set, you can use this direct link:

If you already have the training set, you may use the following .txt files to filter for the subset:

The first file contains the names of the target frames evaluated during the validation phase; decoders should reproduce these files. The second file contains the names of the corresponding input files. The input files will be available in the working directory of the decoder on the evaluation server and do not need to be uploaded with the decoder.

Data generation

The P-frame dataset is based on the UGC Dataset. We filtered out all videos smaller than 720p, and also removed HDR, vertical, and interlaced videos. The frames were generated with the following ffmpeg commands:

ffmpeg -i ${VIDEO_NAME}.mkv -vf "extractplanes=y,scale=1280:720" ${VIDEO_NAME}/${VIDEO_NAME}_%05d_y.png
ffmpeg -i ${VIDEO_NAME}.mkv -vf "extractplanes=u,scale=640:360" ${VIDEO_NAME}/${VIDEO_NAME}_%05d_u.png
ffmpeg -i ${VIDEO_NAME}.mkv -vf "extractplanes=v,scale=640:360" ${VIDEO_NAME}/${VIDEO_NAME}_%05d_v.png