HSViT: Horizontally Scalable Vision Transformer

This paper introduces a horizontally scalable vision transformer (HSViT) scheme with a novel image-level feature embedding. The HSViT design preserves the inductive bias of convolutional layers while reducing the number of layers and parameters in the model.

Cite this as

Chenhao Xu, Chang-Tsun Li, Chee Peng Lim, Douglas Creighton (2024). Dataset: HSViT: Horizontally Scalable Vision Transformer. https://doi.org/10.57702/q8ln2hj8

DOI retrieved: December 2, 2024

Additional Info

Field          Value
Created        December 2, 2024
Last update    December 2, 2024
Author         Chenhao Xu
More Authors   Chang-Tsun Li, Chee Peng Lim, Douglas Creighton
Homepage       https://github.com/xuchenhao001/HSViT