The Vision, Language, and Learning Group (VL2G) at the Indian Institute of Technology Jodhpur is a group of researchers and students led by Anand Mishra. The group addresses core and applied vision and language tasks by developing AI models that can acquire world and commonsense knowledge and use it to reason about the visual world.