Hi @zedian, I see that the Sinkhorn loss is commented out in model_utils.py (L285-287). Isn't this the loss used for training in the paper?
# alignment_loss = self.earth_mover_loss(
#     self.get_weight(attention_mask), hidden_mat, self.get_weight(original_mask.reshape(attention_mask.shape)), original_embedding
# )