Domain Robustness In Multi-Modality Learning And Visual Question Answering