Version control is standard practice in software development, but it’s much less common in machine learning.
This causes all sorts of problems: people are manually keeping track of things in spreadsheets, model weights are scattered on S3, and results can’t be reproduced. Somebody who wrote a model has left the team? Bad luck – nothing’s written down and you’ve probably got to start from scratch.
So why isn’t everyone using Git? Git doesn’t work well with machine learning. It can’t handle large files, it can’t handle key/value metadata like metrics, and it can’t record information automatically from inside a training script. There are some solutions for these things, but they feel like band-aids.
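To make those three gaps concrete, here is a minimal sketch of what recording from inside a training script could look like, using only the Python standard library. Everything in it is an assumption made for illustration: the helper names (`file_sha256`, `record_checkpoint`, `demo`), the `experiment.jsonl` log file, and the fake weights are hypothetical, and none of it is the API of any existing tool. The idea is that each checkpoint gets its parameters and metrics stored as key/value data, while the large weights file is referenced by a content hash instead of being stored in the log itself.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file in chunks so large weight files never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_checkpoint(log_path: Path, params: dict, metrics: dict, weights_path: Path) -> dict:
    """Append one JSON line describing a checkpoint (hypothetical helper)."""
    entry = {
        "time": time.time(),
        "params": params,    # key/value hyperparameters
        "metrics": metrics,  # key/value results for this checkpoint
        "weights_file": weights_path.name,
        "weights_sha256": file_sha256(weights_path),  # identifies the large file by content
    }
    with log_path.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def demo() -> list:
    """Simulate a two-step training loop and return the recorded entries."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        log_path = workdir / "experiment.jsonl"
        params = {"learning_rate": 0.01}
        for step, loss in enumerate([0.9, 0.5]):
            weights = workdir / f"weights_{step}.bin"
            weights.write_bytes(bytes([step]) * 1024)  # stand-in for real model weights
            record_checkpoint(log_path, params, {"step": step, "loss": loss}, weights)
        return [json.loads(line) for line in log_path.read_text().splitlines()]


if __name__ == "__main__":
    for entry in demo():
        print(entry["metrics"], entry["weights_sha256"][:12])
```

A sketch like this only covers the recording side. It doesn't store the weights anywhere durable, compare experiments, or check out an old checkpoint, which is where a purpose-built system would come in.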
We spent a year talking to people in the ML community about this, and this is what we found out:
- We need a native version control system for ML. ML is sufficiently different from normal software that we can’t just put band-aids on existing systems.
- It needs to be small, easy to use, and extensible. We found people struggling to migrate to “AI Platforms”. We believe tools should do one thing well and combine with other tools to produce the system you need.
- It needs to be open source. There are a number of proprietary solutions, but something so foundational needs to be built by and for the ML community.
We need your help to make this a reality. If you’ve built this for yourself, or are just interested in this problem, join us to help build a better system for everyone.
Join our Discord chat or get involved on GitHub.