We work with high sampling resolutions and hundreds of thousands of small files. This creates two problems: saving the database becomes very slow, as I have asked about in the past (a partial workaround was modifying the save function to save only manually and intermittently), and organizing the workflow is difficult when we have multiple research questions on the same dataset. We have experimented with making copies of the DB and deleting the parts not needed for each analysis, but that leaves us with multiple copies of some of the data, and since our DBs are in the 4 TB range that's a problem.
Ideally we'd have something like versions of a database that link back to or share a common set of files, and only save/display the files relevant to a given analysis, but I suspect that is not possible. Do you have any suggestions for how to better manage this?
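To illustrate the kind of sharing I mean, here is a minimal sketch of per-analysis "views" built with hard links instead of full copies (assuming the data files sit under one directory on a single filesystem, and using GNU `cp -al`; the file and directory names here are made up):

```shell
# Hypothetical layout: one master store plus a hard-linked "view" per analysis.
mkdir -p master_store
printf 'audio data' > master_store/rec001.wav
printf 'audio data' > master_store/rec002.wav

# cp -al copies the directory tree but hard-links every file instead of
# duplicating it, so the new tree costs almost no extra disk space.
cp -al master_store analysis_A

# Pruning a view removes only the link, not the shared bytes on disk.
rm analysis_A/rec002.wav   # this analysis only needs rec001

# Both names point at the same inode: same data, stored once.
ls -i master_store/rec001.wav analysis_A/rec001.wav
```

This only helps if the analyses can work from such pruned directory trees; I don't know whether the database itself could be pointed at a view like this.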