Chg2Cap: A Novel Attentive Network for Remote Sensing Change Captioning

The proposed Chg2Cap method is constructed by a CNN-based feature extractor that uses the pre-trained ResNet-101 as the backbone, an attentive encoder that consists of a hierarchical self-attention block and a ResBlock, and a caption decoder.

BibTex: