Kaggle Birdclef22 competition

These are notes on what I tried, and what other teams found useful, during the Kaggle Birdclef22 competition. We were given audio recordings of bird calls and asked to build a classification model. The main difficulty was that some species had very few training files. Audio was provided for 152 bird species, although only 21 of them were in the test set.

Things I did

Started with a public baseline based on EfficientNet
Ensembled several models
Varied thresholds per species: the more training samples a bird had, the higher its threshold
We were given many species of geese, but only one of them was scored. For training, I merged all geese into a single class. Similarly for other families of birds.
For species with very few samples, I hand-picked the calls.
Computing mel spectrograms is faster with torchaudio
Used a linear learning-rate schedule with warmup
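
A minimal sketch of a linear schedule with warmup, assuming it applies to the learning rate and decays to zero by the end of training. The function name, step counts and base rate are illustrative and not taken from my training code.

```python
def linear_warmup_lr(step, total_steps, warmup_steps, base_lr):
    """Learning rate at a given step: linear ramp up, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp linearly from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr (at the end of warmup) to 0 (at total_steps).
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical toy run: 10 steps in total, the first 2 are warmup.
schedule = [
    linear_warmup_lr(s, total_steps=10, warmup_steps=2, base_lr=1e-3)
    for s in range(11)
]
```

The rate starts at zero, peaks at base_lr when warmup ends, and reaches zero again at the last step. In practice this kind of function is usually plugged into the optimizer's scheduler rather than called by hand.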

Learned from reading other people's solutions:

having good CV is important
precompute mel spectrograms for additional speedup
use external data (for example previous competitions)
add a human in the loop: train a model, have it make predictions, and check the top 2000 predictions by hand to get "clean" data
similarly, use pseudo-labelling without a human in the loop
other models people tried: dm_nfnet_f0, eca_nfnet_l0, eca_nfnet_l1, tf_efficientnetv2_m_in21k, seresnext50_32x4d, resnest50d_4s2x40d, convnext_tiny, resnet34, tf_efficientnetv2_s_in21k, seresnext26t_32x4d
use the prediction on time interval [t, t+5] together with those on [t-1, t+4] and [t+1, t+6] (potentially with unequal weights)
if bird is predicted anywhere in the audio file, lower threshold for the file
add random sounds and have a new class "nocall"
pretrain the models on 2021 data
manually drop segments without bird sounds; split the data into smaller chunks
first train on all birds, then fine tune on the scored birds only
mask time / frequency bands in the mel spectrogram
the BirdNET pretrained model is very good
Constant Q-transform (CQT1992v2 from nnAudio.Spectrogram) might be better than mel spectrogram
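
Two of the post-processing ideas above, blending a window's prediction with its shifted neighbours and lowering the threshold when the bird shows up anywhere in the file, can be sketched in a few lines. This is only an illustration: it assumes probs[i] is the model score for the 5-second window starting at second i (so the neighbours at i-1 and i+1 are the windows shifted by one second), and the weights, thresholds and function names are made up, not values any team reported.

```python
def smooth_with_neighbours(probs, weights=(0.25, 0.5, 0.25)):
    """Blend each window's score with the scores of the windows shifted by -1 s and +1 s.

    probs: per-window scores for one species, in time order (assumed 1 s hop).
    weights: (previous, current, next); hypothetical, possibly unequal.
    """
    w_prev, w_cur, w_next = weights
    smoothed = []
    for i, p in enumerate(probs):
        # At the file edges, reuse the current window for the missing neighbour.
        prev_p = probs[i - 1] if i > 0 else p
        next_p = probs[i + 1] if i < len(probs) - 1 else p
        smoothed.append(w_prev * prev_p + w_cur * p + w_next * next_p)
    return smoothed


def detect(probs, threshold=0.5, lowered_threshold=0.3, file_trigger=0.8):
    """Per-window decisions, with a relaxed threshold if the file clearly has the bird."""
    smoothed = smooth_with_neighbours(probs)
    # A single confident window is taken as evidence that the bird is in the file.
    active = lowered_threshold if max(probs) >= file_trigger else threshold
    return [p >= active for p in smoothed]


confident_file = detect([0.2, 0.9, 0.4, 0.1])
uncertain_file = detect([0.2, 0.6, 0.4, 0.1])
```

In the first call one window scores 0.9, so the lower threshold is used and the neighbouring windows are also flagged. In the second call no window is confident enough, the regular threshold applies, and nothing is flagged.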

31 May 2022