Emerging Cross-lingual Structure in Pretrained Language Models

2023-03-20 22:45:17

Similarity of BERT Models: train two monolingual BERT models and study how similar their learned representations are across languages
- Aligning Monolingual BERTs
  
  Method for measuring similarity: Procrustes
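As a rough illustration of the Procrustes step, here is a minimal NumPy sketch. It assumes paired source and target vectors are already stacked row by row in two matrices; the function name `procrustes_map` and the synthetic data are made up for this example and are not from the paper.

```python
import numpy as np

def procrustes_map(X, Y):
    """Return the orthogonal matrix W that minimizes ||X @ W - Y||_F.

    X, Y: arrays of shape (n_pairs, dim); row i of X is paired with row i of Y.
    """
    # Closed-form solution: take the SVD of X^T Y and drop the singular values.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Synthetic check (assumed data): Y is an exact rotation of X.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))  # a random orthogonal matrix
Y = X @ Q
W = procrustes_map(X, Y)
```

On this synthetic data the recovered `W` matches the rotation `Q`. With real embeddings the fit is only approximate, and the remaining error is one way to read how similar the two spaces are.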


Word-level alignment (monolingual word alignment)

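Each subword is fed in as an independent input, and the resulting embeddings are summed and averaged, giving one embedding per layer. The mapping is learned with supervision from the MUSE dictionary. The resulting alignment is better than fastText, and higher-layer representations align better than lower-layer ones.
Contextual word-level alignment (bilingual word alignment)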
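The subword averaging described above can be sketched as follows. This is only an illustration: it assumes the per-layer hidden states for one word's subwords have already been extracted into an array of shape (num_layers, num_subwords, hidden_dim), and the function name is hypothetical.

```python
import numpy as np

def word_embedding_per_layer(subword_states):
    """Average subword vectors into one word vector per layer.

    subword_states: array of shape (num_layers, num_subwords, hidden_dim)
    returns: array of shape (num_layers, hidden_dim)
    """
    return subword_states.mean(axis=1)

# Toy input (assumed): 2 layers, a word split into 3 subwords, hidden size 4.
states = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
vecs = word_embedding_per_layer(states)
```

Each row of `vecs` is the word vector for one layer, so the layer-wise comparison in the notes amounts to fitting one mapping per row index.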

we can align contextual representations of monolingual BERT models with a simple linear mapping and use this approach for crosslingual transfer.

Middle-layer representations align better than higher-layer ones.
Sentence-level alignment

Sentence representations are obtained by pooling the subword representations of each sentence at each layer of monolingual BERT.
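A minimal sketch of sentence-level pooling for one layer, assuming mean pooling and a padding mask. The notes only say "pooling", so the choice of mean pooling, the array shapes, and the function name are assumptions made for illustration.

```python
import numpy as np

def mean_pool(hidden, mask):
    """Mean-pool subword vectors into one sentence vector, ignoring padding.

    hidden: array of shape (batch, seq_len, hidden_dim), one layer's outputs
    mask:   array of shape (batch, seq_len), 1 for real subwords, 0 for padding
    returns: array of shape (batch, hidden_dim)
    """
    mask = mask[..., None].astype(hidden.dtype)    # (batch, seq_len, 1)
    summed = (hidden * mask).sum(axis=1)           # sum over real subwords only
    counts = np.clip(mask.sum(axis=1), 1.0, None)  # avoid division by zero
    return summed / counts

# Toy batch (assumed): the padded position holds a large value that must be ignored.
hidden = np.ones((2, 3, 4))
hidden[0, 2] = 100.0
mask = np.array([[1, 1, 0], [1, 1, 1]])
pooled = mean_pool(hidden, mask)
```

Running this once per layer gives one sentence vector per layer, which can then be aligned in the same way as the word vectors.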
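Conclusion

BERT representations can be aligned at both the word level and the sentence level with a simple orthogonal mapping.

As with word embeddings, BERT models are similar across languages, which is why sharing weights alone is enough.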
Neural network similarity (only skimmed)
CKA: centered kernel alignment; CCA: canonical correlation analysis
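For reference, a minimal sketch of linear CKA between two sets of representations computed on the same inputs. This is the standard linear-kernel form of CKA, not necessarily the exact variant used in the paper, and the synthetic data is assumed.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between two representation matrices.

    X: (n_examples, dim_x), Y: (n_examples, dim_y); rows must correspond.
    Returns a similarity in [0, 1].
    """
    # Center each feature so the score does not depend on the mean activation.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (norm_x * norm_y)

# Synthetic check (assumed data): a rotated and rescaled copy scores close to 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
same = linear_cka(X, 2.0 * X @ Q)
other = linear_cka(X, rng.normal(size=(100, 8)))
```

Linear CKA is unchanged by orthogonal transformations and uniform rescaling, so `same` comes out close to 1 while `other`, computed against unrelated noise, is lower.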
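- Conclusion
When the languages are close, using the same model increases the similarity of the representations.

Conversely, when the languages are not close, using the same model does not increase representation similarity.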
