Abstract: Unsupervised Domain Adaptation aims to leverage a source domain with ample labeled data to tackle tasks on an unlabeled target domain. This remains a significant challenge, particularly in scenarios with large disparities between the two domains. Prior methods often fall short in such challenging settings due to noise from incorrect pseudo-labels and the limitations of handcrafted domain-alignment rules. In this paper, we propose a novel method called DCST (Dual Cross-Supervision Transformer), which improves upon existing methods in two key aspects. First, a vision transformer is combined with a dual cross-supervision learning strategy to enforce consistency learning across domains. The network accomplishes domain-specific self-training and cross-domain f