Machine-learning on dirty data in Python: a tutorial