We generate an enormous amount of data every day while working with virtual machines. Some files are so important that one can't afford to lose them. But what if one day you realize that your virtual machine has crashed and all your data is gone forever? It sounds like a nightmare that no one wants to encounter.
Most of us back up files to GitHub, but doing it manually every time is cumbersome.
In this post, I am going to show you how to back up your important folders to git using cron jobs in Linux. The cron job can be configured to run every hour (or more often) so that all your important work is safely backed up.
Let’s get started!!
Create a new repository
Create a new repository on GitHub (it can be public or private).
The next page will ask you to do a quick setup. Switch the tab to the SSH URL.
Go to your Linux machine and install git using:
# apt install git
Now go to the parent folder whose backup has to be taken and run the following set of commands.
# git init
# git add README.md
# git commit -m "first commit"
# git remote add origin git@github.com:shreya2201/my-repo.git
# git push -u origin master
This creates a .git folder inside that folder and makes our first commit. When the git push command is run, the user may be asked for the credentials of the GitHub account. When automating, we need to get rid of that prompt. This is where SSH keys come in handy.
Configure ssh keys
Since we are automating the backup process, we need SSH keys to log in to our GitHub account. As a first step, the keys have to be generated. This can be done using ssh-keygen:
# ssh-keygen -t rsa -b 4096 -C "your mail id attached on github"
ssh-keygen asks where to save the key; for root the default is /root/.ssh/id_rsa. In this post the keys were saved inside /root/.ssh with the names id_rsa_git and id_rsa_git.pub.
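If you would rather not answer the prompts by hand, the key can be generated in one step. This is a minimal sketch, assuming you want the file name id_rsa_git used in this post and an empty passphrase (unattended cron runs need that, but it also means anyone who can read the file can use the key). The function name generate_git_key is made up for illustration.

```shell
# Hypothetical helper: create an RSA key pair named id_rsa_git in the given directory.
# $1 = target directory (this post uses /root/.ssh)
# $2 = the email address attached to your GitHub account
generate_git_key() {
    mkdir -p "$1" && chmod 700 "$1" || return 1
    # -f sets the output file, -N "" sets an empty passphrase, -q keeps the output quiet
    ssh-keygen -q -t rsa -b 4096 -C "$2" -f "$1/id_rsa_git" -N ""
}
```

It would be called as generate_git_key /root/.ssh "you@example.com". If id_rsa_git already exists in that directory, ssh-keygen asks before overwriting it.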
- cat the contents of id_rsa_git.pub and copy them to the clipboard.
- Go to github.com and log in with your credentials.
- Go to your account settings and click on SSH and GPG keys.
Paste the public key (copied earlier) there and give your key a name. SSH keys can also be configured for individual repos, but a key added here works with any repo your account can access.
SSH configuration
Now get back to your Linux machine and set up the SSH configuration file.
Open the /root/.ssh/config file (create config inside .ssh if it is not present) and add an entry for GitHub.
In that entry, the host is github.com and the path to the private key has to be given.
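Here is a minimal sketch of such an entry, assuming the private key is the id_rsa_git file generated earlier. The helper name write_github_ssh_config is hypothetical, and IdentitiesOnly is an optional extra of mine that tells ssh to offer only this key.

```shell
# Hypothetical helper: append a github.com entry to the SSH client config.
# $1 = the .ssh directory (this post uses /root/.ssh)
write_github_ssh_config() {
    mkdir -p "$1" || return 1
    cat >> "$1/config" <<EOF
Host github.com
    HostName github.com
    User git
    IdentityFile $1/id_rsa_git
    IdentitiesOnly yes
EOF
    # ssh rejects a config file that other users can write to
    chmod 600 "$1/config"
}
```

Calling write_github_ssh_config /root/.ssh appends the entry to /root/.ssh/config. Run it only once, since every call appends another copy.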
To test whether the SSH keys are working, try connecting to GitHub over SSH.
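A common way to run this check is ssh -T git@github.com, which GitHub answers with a greeting instead of a shell. The sketch below wraps that command so it can also be used from a script. GitHub does not give you a shell, so ssh typically exits with a non-zero status even on success, which is why the helper looks at the greeting text instead. Both function names are hypothetical.

```shell
# Succeeds when the given ssh output contains GitHub's success greeting.
is_github_auth_ok() {
    printf '%s\n' "$1" | grep -q 'successfully authenticated'
}

# Hypothetical helper: connect to github.com and report whether the key was accepted.
# BatchMode=yes makes ssh fail instead of prompting, which is what a cron job needs.
check_github_ssh() {
    out=$(ssh -T -o BatchMode=yes git@github.com 2>&1)
    if is_github_auth_ok "$out"; then
        echo "SSH authentication works"
    else
        echo "SSH authentication failed: $out" >&2
        return 1
    fi
}
```

Run check_github_ssh after the key has been added to your GitHub account.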
Cool! I am now able to authenticate. Next, we need to set the remote URL to the SSH form, git@github.com:username/my-repo.git (the default URL is set to https). Run the git push command in the folder to check that everything is working fine!
⚡ root@kali ~/Desktop/htb > git remote set-url origin git@github.com:username/my-repo.git
⚡ root@kali ~/Desktop/htb > git push origin master
I am able to push to git without requiring any password, so the cron job can be set up now.
Create a script commit.sh inside /root and add the following contents
#!/bin/sh
# Git: add and commit changes
cd /root/Desktop/htb && /usr/bin/git add --all && /usr/bin/git commit -m "daily crontab backup $(date)"
# send data to Git server
cd /root/Desktop/htb && /usr/bin/git push origin master
On the terminal run
# chmod +x /root/commit.sh
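One thing to know about the script above: git commit prints a message and exits with a non-zero status when there is nothing to commit, so cron sees a failure on every run where nothing changed. Here is a slightly more defensive sketch of commit.sh. It is an assumed variant, not the script used in this post: the function name commit_changes and the REPO_DIR and BRANCH variables are my own additions, while the folder and branch are the ones from above.

```shell
#!/bin/sh
# Sketch of a more defensive commit.sh (assumed variant, not the original script).
REPO_DIR="${REPO_DIR:-/root/Desktop/htb}"
BRANCH="${BRANCH:-master}"

# Stage everything in repo $1 and commit only if something is actually staged.
# Returns 1 when there was nothing to commit and 2 if staging failed.
commit_changes() {
    git -C "$1" add --all || return 2
    if git -C "$1" diff --cached --quiet; then
        return 1
    fi
    git -C "$1" commit -q -m "crontab backup $(date)"
}

# Run only when the backup folder is really a git repository.
if [ -d "$REPO_DIR/.git" ]; then
    commit_changes "$REPO_DIR"
    # Push even if nothing new was committed, in case an earlier push failed.
    git -C "$REPO_DIR" push -q origin "$BRANCH"
fi
```

The push stays unconditional on purpose, so commits left over from a run without network access still reach GitHub on the next run.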
To create a cron job, run crontab -e and add the following line to the file.
# crontab -e
*/30 * * * * /root/commit.sh
In a nutshell, this cron job will run every 30 minutes, indefinitely.
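If you want to add the entry without opening an editor, and keep a log of what each run printed, something like the following can work. It is only a sketch: the helper name add_cron_entry and the log file /root/commit.log are assumptions of mine, not part of the setup above.

```shell
# Hypothetical helper: print the existing crontab text plus the new entry,
# leaving the text unchanged if the exact entry is already present.
# $1 = existing crontab text, $2 = entry to add
add_cron_entry() {
    if printf '%s\n' "$1" | grep -qxF -- "$2"; then
        printf '%s\n' "$1"
    else
        [ -n "$1" ] && printf '%s\n' "$1"
        printf '%s\n' "$2"
    fi
}

# Same schedule as above, with output and errors appended to an assumed log file.
ENTRY='*/30 * * * * /root/commit.sh >> /root/commit.log 2>&1'
```

To install it, run add_cron_entry "$(crontab -l 2>/dev/null)" "$ENTRY" | crontab - and then check the result with crontab -l.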
Make sure that the cron service is running. If not, run
# systemctl start cron
# systemctl enable cron
If file size is more than 100 MB
There might be instances when a file inside your folder is larger than 100 MB. The presence of such files leads to a failed git push.
The error will look something like this:
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: error: Trace: 204cde5b555213065bf8a915ca39bcda
remote: error: See http://git.io/iEPt8g for more information.
remote: error: File node/hydra.restore is 147.14 MB; this exceeds GitHub's file size limit of 100.00 MB
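Here, hydra.restore is a file inside the folder node that is causing the error. Here's a trick you can use to remove the file from your commits and do a successful git push. Note that it rewrites the history of the current branch; the file itself stays on disk but is no longer tracked.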
# git filter-branch -f --index-filter 'git rm --cached --ignore-unmatch node/hydra.restore'
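Since commit.sh runs git add --all, an oversized file would be staged again on the next run unless it is ignored. A small sketch for spotting such files before they get committed is shown below. The helper name list_large_files is hypothetical, and the size argument uses the suffixes understood by find (for example 100M).

```shell
# Hypothetical helper: list files under $1 that are larger than $2 (default 100M),
# skipping git's own data in the .git folder.
list_large_files() {
    find "$1" -path "$1/.git" -prune -o -type f -size "+${2:-100M}" -print
}
```

For example, list_large_files /root/Desktop/htb prints every file above the limit. Adding those paths to .gitignore keeps git add --all from staging them again.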
That's all for this blog post! I hope you got what you needed!