LV-BERT: Exploiting Layer Variety for BERT

Modern pre-trained language models are mostly built on backbones that stack self-attention and feed-forward layers in an interleaved order. This paper aims to improve pre-trained models by exploiting layer variety in two aspects: the layer type set and the layer order. A minimal sketch of the idea is given below.
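The sketch below is not the authors' implementation; it is a minimal illustration, under assumed names and hyperparameters, of how an encoder can be built from an explicit layer-order string over a small layer type set. The letters 's' (self-attention), 'f' (feed-forward), and 'c' (a depthwise convolution, shown only as one possible extra layer type) and the class VariableOrderEncoder are illustrative assumptions, not code from this repository.

```python
# Minimal sketch (not the LV-BERT code): build an encoder from a layer-order
# string over a layer type set, so that layer types and their order can vary.
import torch
import torch.nn as nn


class SelfAttention(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        out, _ = self.attn(x, x, x)
        return self.norm(x + out)  # residual connection + post-norm


class FeedForward(nn.Module):
    def __init__(self, dim, hidden=None):
        super().__init__()
        hidden = hidden or 4 * dim
        self.ff = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        return self.norm(x + self.ff(x))


class DepthwiseConv(nn.Module):
    # One possible additional layer type; the kernel size is an arbitrary choice.
    def __init__(self, dim, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(dim, dim, kernel_size, padding=kernel_size // 2, groups=dim)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        out = self.conv(x.transpose(1, 2)).transpose(1, 2)  # conv over the sequence axis
        return self.norm(x + out)


# The layer type set: mapping from a one-letter code to a layer constructor.
LAYER_TYPES = {"s": SelfAttention, "f": FeedForward, "c": DepthwiseConv}


class VariableOrderEncoder(nn.Module):
    """Encoder whose layer order is given by a string, e.g. 'sfsfsf' for the
    standard interleaved backbone or an arbitrary string for a searched order."""

    def __init__(self, dim, layer_order):
        super().__init__()
        self.layers = nn.ModuleList([LAYER_TYPES[t](dim) for t in layer_order])

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


if __name__ == "__main__":
    x = torch.randn(2, 16, 64)                      # (batch, seq_len, dim)
    baseline = VariableOrderEncoder(64, "sfsfsf")   # interleaved, BERT-style order
    variant = VariableOrderEncoder(64, "sscffc")    # a different type set / order
    print(baseline(x).shape, variant(x).shape)      # both: torch.Size([2, 16, 64])
```

With such a parameterization, the standard interleaved backbone is just one point in the space of layer orders, and alternative type sets and orders can be compared under the same interface.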

BibTeX: