In imitation learning for robotics, cotraining with demonstration data generated both in simulation and on real hardware has emerged as a powerful recipe for overcoming the sim2real gap and scaling up data collection. I will present a set of thorough, focused experiments that elucidate basic principles of sim-and-real cotraining to inform simulation design, sim-and-real dataset creation, and policy training. These experiments confirm that cotraining with simulated data can dramatically improve real-world performance, especially when real data is limited. I will also discuss how different distribution shifts between the real and synthetic datasets affect policy performance and inform simulator design for data generation. Perhaps surprisingly, some visual domain gap actually helps the cotrained policy. I will conclude by discussing this nuance and other mechanisms that facilitate positive transfer between sim and real.