Ever need to free up room in your GitHub private repository quota? (Yes.) Want an automated method to clear out the projects you’re least likely to need archived? (Double yes.)
GitHub is great for large teams like ours. It provides a collaboration space that includes project issue tracking, automatic pull request merges with the click of a button, and project-specific wikis.
We open source a lot of stuff, and GitHub is great because it allows individuals and organizations unlimited public repositories. They do this to foster open source software whenever possible. You only have to start paying for your account when you want private repositories. Atlassian’s Bitbucket service takes a different approach: they offer unlimited public and private repositories, and you start paying based on your team size. But I prefer GitHub’s user interface and development community.
GitHub’s pricing plans are awesome for a company that builds a specific product. But what about those of us building multiple products and writing custom software for others? We need source control on 100% of our projects, and we have multiple projects going on at once. We also have fairly high project turnover (many flowing in and out of the shop at one time) and lots of small, short-term projects. Once we’re finished with an engagement with a client, we move on to the next project on the list. Our business makes for an interesting environment as far as learning and adapting to new projects, scenarios, clients, and products, but it also means that something has to be done when project repositories grow stale after the project is done.
We can’t just delete the source code (and its accompanying history). We have to do our due diligence to protect and archive client work, even if we believe we might not work on that project again in the future. That’s where my git archive script comes into play. Whenever we need space on our GitHub account because we’ve reached our private repository limit, I run this script and remove a few old projects.
Here’s how it works:
You run the command from a terminal, passing a git remote URL (it doesn’t have to be GitHub; any address that you can clone will work): $ ./gitarchive.sh firstname.lastname@example.org:Skookum/base12.git
Then the script goes to work. It starts off by creating a local clone of the repository, including all submodules.
It then tracks all remote branches and pulls all remote tags.
After it gets everything on your local machine, it runs git gc for good measure to make sure we have an optimized repository.
Finally, it creates a gzipped tar file named in the format [REPOS_NAME].gitarchive.YYYYMMDD.tgz and deletes the folder that held the clone.
After you’ve secured this git archive on a backup server somewhere, you can then go into the remote repository’s administration area and delete the project, thus making room for new ones!
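To make those steps concrete, here is a rough sketch of how they could be strung together in shell. This is my own illustration rather than the script itself: the function name archive_repo is hypothetical, it assumes the remote is called origin (the default after a clone), and the quiet flags are only there to keep the output tidy.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the steps described above; not the original gitarchive.sh.
set -eu

archive_repo() {
  local url="$1"
  local name archive
  # Derive the repository name from the URL, e.g. ".../base12.git" -> "base12".
  name="$(basename "$url" .git)"
  archive="${name}.gitarchive.$(date +%Y%m%d).tgz"

  # 1. Clone the repository, including all submodules.
  git clone --quiet --recurse-submodules "$url" "$name"

  (
    cd "$name"
    # 2. Create a local tracking branch for every remote branch
    #    (assumes the remote is named "origin").
    git for-each-ref --format='%(refname)' refs/remotes/origin |
      while read -r ref; do
        branch="${ref#refs/remotes/origin/}"
        if [ "$branch" = "HEAD" ]; then continue; fi
        git show-ref --verify --quiet "refs/heads/$branch" ||
          git branch --quiet --track "$branch" "origin/$branch"
      done
    # 3. Pull down all remote tags.
    git fetch --quiet --tags
    # 4. Repack and clean up so the archive is as small as possible.
    git gc --quiet
  )

  # 5. Tar up the clone, then delete the working folder.
  tar -czf "$archive" "$name"
  rm -rf "$name"
  echo "$archive"
}
```

Calling archive_repo with a clone URL should leave a single .tgz in the current directory and print its name, which makes it easy to hand off to whatever copies files to your backup server.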
A few things to note:
Always verify that you can untar the file before deleting the remote repository.
Always check with other team members before deleting what you think may be an unused repository.
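For the first of those notes, a quick check along these lines might help. It is only a sketch: verify_archive is a hypothetical helper, not part of the script, and running git fsck on the extracted copy is an extra sanity check I am assuming you would want on top of the plain untar test.

```shell
# Hypothetical helper: extract an archive into a scratch folder and check it.
# Usage: verify_archive base12.gitarchive.YYYYMMDD.tgz base12
verify_archive() {
  archive="$1"   # path to the .tgz file
  name="$2"      # name of the repository folder inside the archive
  scratch="$(mktemp -d)" || return 1
  status=0
  # The archive passes only if it untars cleanly AND git finds no corruption.
  if tar -xzf "$archive" -C "$scratch" &&
     git -C "$scratch/$name" fsck --full >/dev/null 2>&1; then
    echo "OK: $archive"
  else
    echo "FAILED: $archive" >&2
    status=1
  fi
  rm -rf "$scratch"
  return "$status"
}
```

The function returns a non-zero status on failure, so it can gate the delete step in a larger cleanup routine.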
Here’s the script in all its glory. Feel free to fork it, improve on it, and please let me know in the comments if you use the script!