... Other studies incorporate a character-level CNN (Ma and Hovy, 2016), global contexts, or language models (Peters et al., 2017, 2018; Devlin et al., 2018) to improve name tagging. In addition, several approaches (Zhang et al., 2016a, 2017a; Al-Badrashiny et al., 2017) incorporate hand-crafted linguistic features into a Bi-LSTM-CRF to improve low-resource name tagging performance. Recent attempts at cross-lingual transfer for name tagging can be divided into two categories: the first projects annotations from a source language to a target language via parallel corpora (Yarowsky et al., 2001; Zhang et al., 2016b; Fang and Cohn, 2016; Ehrmann et al., 2011; Enghoff et al., 2018; Ni et al., 2017), a bilingual gazetteer (Feng et al., 2017; Zirikly and Hagiwara, 2015), Wikipedia anchor links (Kim et al., 2012; Nothman et al., 2013; Tsai et al., 2016), and language-universal representations, including Unicode bytes (Gillick et al., 2016) and cross-lingual word embeddings (Fang and Cohn, 2017; Wang et al., 2017; Xie et al., 2018). ...