S3T: Self-supervised pre-training with Swin Transformer for music classification

S3T is a self-supervised pre-training method for music classification built on the Swin Transformer. It leverages large amounts of unlabeled music data to improve classification performance and to reduce the dependence on large labeled music datasets.

BibTeX: