Towards Augmenting And Evaluating Large Language Models