This is Part 2 of a multi-part series about how to use git for version control. Part 1 described how to get started with git for small projects. This post details how to set up and work with remote repositories. Additional posts may follow as needed.
Remote repositories can really speed up your workflow. Think about every time you’ve wanted to transfer files to a remote computer. My previous methods to accomplish this usually involved either using scp in the terminal, or using a Windows GUI to accomplish the same. In either case, you have to identify which files you want to transfer. This can become a burden when you’re making edits to your code. How do you know if you’re working on the local or remote file? How do you know if the directories are synced? Git solves all of these problems.
To set up remote repositories, there are two steps, one on the local system and one on the remote system. Let’s say you’re working with the same repository from Part 1, which just contained a one-line text file.
You can do these two steps in either order, but let’s start by creating a bare repository on the remote system. Log in over ssh as usual and navigate to the directory where you want to host your repository. From that directory, run the following:
mkdir .git
cd .git
git init --bare
This creates a “bare” git repository in the .git directory. A bare repository is not meant to be worked in. It is only intended to receive updates from other repositories. Thus, we will be working on our local copies and pushing them to the remote repo.
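To confirm that the repository really is bare, you can ask git directly. The following is only a sketch: it uses a temporary directory (the hypothetical variable site) as a stand-in for the directory on the remote system, so it can be tried on any machine that has git installed.

```shell
#!/bin/sh
set -e
# Assumption: "$site" is a throwaway directory standing in for the remote directory.
site=$(mktemp -d)

# The same three steps as above.
mkdir "$site/.git"
cd "$site/.git"
git init --bare --quiet

# A bare repository reports "true" here; it has no working tree of its own.
is_bare=$(git rev-parse --is-bare-repository)
echo "bare: $is_bare"
```

If this prints anything other than "bare: true", the directory was probably initialized without the --bare flag.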
Before you log off the remote system, there is one more step, which is confusing but necessary. By default, you can push updates to this bare repo, but the code won’t be copied to any place where you can actually use it. To fix this, stay in your .git directory and create the file hooks/post-receive. As you might have guessed, this file contains instructions for git to run after it receives updates over the network. Put the following in this file:
#!/bin/sh
GIT_WORK_TREE=./../ git checkout -f
(After you save the file, run chmod +x hooks/post-receive to make sure it has execute permissions).
Basically, we are telling git to take the files it just received and check them out to the directory specified by GIT_WORK_TREE, which in this case is just one level up from here. Bare repositories by definition do not have a working tree, so you will need to preface most git commands by setting the work tree as shown here. For example, this applies if you want to switch branches in the remote repository.
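To watch the hook do its job without touching a real server, here is a rough end-to-end sketch. Everything in it is a stand-in: site and work are temporary directories playing the remote directory and the local repository, notes.txt is a made-up file, and the push goes to a local path instead of over ssh. It also assumes the branch is called master on both sides, which newer versions of git may not do by default, hence the init.defaultBranch setting.

```shell
#!/bin/sh
set -e
# Assumptions: "$site" plays the remote directory, "$work" the local repository.
site=$(mktemp -d)
work=$(mktemp -d)

# Remote side: bare repository in $site/.git plus the post-receive hook.
mkdir "$site/.git"
git -c init.defaultBranch=master init --bare --quiet "$site/.git"
mkdir -p "$site/.git/hooks"
cat > "$site/.git/hooks/post-receive" <<'EOF'
#!/bin/sh
GIT_WORK_TREE=./../ git checkout -f
EOF
chmod +x "$site/.git/hooks/post-receive"

# Local side: one commit on master, then a push to the stand-in remote.
cd "$work"
git -c init.defaultBranch=master init --quiet
echo "hello" > notes.txt
git add notes.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Add notes"
git push --quiet "$site/.git" master

# The hook has checked the file out one level above the bare repository.
cat "$site/notes.txt"

# To switch branches on the remote later, run this from inside its .git directory:
#   GIT_WORK_TREE=./../ git checkout -f <branch-name>
```

The file shows up in the parent of the bare repository because the hook runs from inside the .git directory, so ./../ resolves to the directory that contains it.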
That’s all you need to do to set up the remote repository. Now back on your local system, navigate to the directory containing your repository and run:
git remote
The remote command will list all remotes associated with this repository. In this case we don’t have any yet, so let’s add the one we just created.
git remote add <repo-name> <user>@<host>:<path/to/repo/.git>
Notice a few things about this command. First, repo-name can be called whatever you want, but you should probably name it based on the location of your remote repository. For example if your remote is located on the PSU clusters, you might call it psu. The second part of the command tells git to transfer files over ssh, using the standard user@host format. The path specified after the : symbol tells git where to find the repository you just created on the remote system.
This last part is important. Git will let you “add” remote repositories without checking whether they exist or not. This check will be performed, however, when you try to push updates. It’s up to you to make sure your path is specified correctly, and that the repository is set up already. That’s why we did that step first.
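You can see this behavior for yourself with a deliberately wrong path. In this sketch the remote name nowhere and the path are made up, and git ls-remote is used as one possible way to test the connection without pushing anything; treat it as an illustration, not a required step.

```shell
#!/bin/sh
set -e
# Assumption: "$work" is a throwaway local repository.
work=$(mktemp -d)
cd "$work"
git init --quiet

# "nowhere" is a hypothetical remote name; the path deliberately does not exist.
git remote add nowhere "$work/no/such/repo/.git"
git remote -v   # the remote is listed even though nothing is there

# Contacting the remote is what exposes the mistake.
if git ls-remote nowhere >/dev/null 2>&1; then
    status="reachable"
else
    status="unreachable"
fi
echo "nowhere is $status"
```

Running the same check against a remote you have just added is a cheap way to catch a mistyped path before your first push.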
To see a usable example of adding a remote repo, I would do something like this:
git remote add psu jdh33@cyberstar.psu.edu:~/work/myDirectory/.git
The nice thing about the PSU clusters is that the distributed filesystem will still be available regardless of which cluster (Cyberstar, Lion-XO, etc.) you listed in your git remote add command. Now that you’ve added the remote, you should be able to run:
git remote -v
and see your remote listed. (The -v flag enables verbose output, which helpfully includes the host/path of the repository.) At this point, you’re ready to push your code to the remote system:
git push <repo-name> <branch-name> (optional)
The optional branch-name parameter is used if you only want to push one branch at a time. I have found that for the initial push, I’ve needed to modify the command as follows (example):
git push psu +master:refs/heads/master
which spells out the full name of the destination branch on the remote (the leading + tells git to force the update). You may need to do this every time you push a new branch for the first time. But after you do this the first time, you should be able to just run:
git push psu
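to push to the remote system. Which branches this sends depends on your version of git: before 2.0, the default is to push every local branch that already has a same-named branch on the remote, while since 2.0 the default (push.default=simple) covers only the current branch, so you may need to name branches explicitly. Since pushing is set up to work via ssh, you may be prompted for your remote password when you run these commands unless you have ssh keys enabled (maybe a topic for another post). By the way, git push is another great thing to alias: “gp”, maybe?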
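If the +master:refs/heads/master argument from the initial push looks cryptic, it may help to read it piece by piece. The sketch below does that in comments and then checks the result. As before, it is only an illustration: a local bare repository in a temporary directory stands in for the psu remote, file.txt is a made-up file, and the branch is assumed to be called master.

```shell
#!/bin/sh
set -e
# Assumptions: "$remote_dir" plays the remote system, "$work" the local repository.
remote_dir=$(mktemp -d)
work=$(mktemp -d)
git -c init.defaultBranch=master init --bare --quiet "$remote_dir"

cd "$work"
git -c init.defaultBranch=master init --quiet
echo "one line" > file.txt
git add file.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "First commit"
git remote add psu "$remote_dir"

# +                  allow the remote branch to be overwritten (forced update)
# master             the local branch to send
# refs/heads/master  the full name of the branch to create or update on the remote
git push --quiet psu +master:refs/heads/master

# Both sides should now point at the same commit.
local_tip=$(git rev-parse master)
remote_tip=$(git --git-dir="$remote_dir" rev-parse refs/heads/master)
echo "local:  $local_tip"
echo "remote: $remote_tip"
```

Because the + forces the update, it can overwrite commits on the remote branch, so it is probably best kept for that first push.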
Now if you go check your remote repository, all of your files should be there! The cool thing about this is that git does not resend everything on each push: it sends only the objects the remote doesn’t already have, compressed into a pack. (In git lingo, the compressed differences are called “deltas”.) This results in fast, efficient transfer, without having to figure out which files changed, which need to be updated, etc., since git handles all of this for you.
A last note: you can have as many remotes as you want! Just add new ones using the git remote add command discussed above. This is especially helpful if you need to push changes to multiple remote systems, one after the other. Hopefully you can see the possibilities here; I’m just discovering them myself. Your mileage may vary, but this has really increased my efficiency in using the clusters. You can do all of your editing locally, and then push the changes wherever you need to very quickly.
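Pushing to several remotes one after the other is easy to script. Here is a minimal sketch, assuming two made-up remotes named cluster-a and cluster-b; both are local bare repositories in a temporary directory so the loop can be tried without any real servers, and the branch is assumed to be called master.

```shell
#!/bin/sh
set -e
# Assumptions: two local bare repositories play the roles of two remote systems.
base=$(mktemp -d)
git -c init.defaultBranch=master init --bare --quiet "$base/cluster-a.git"
git -c init.defaultBranch=master init --bare --quiet "$base/cluster-b.git"

mkdir "$base/work"
cd "$base/work"
git -c init.defaultBranch=master init --quiet
echo "one line" > file.txt
git add file.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "First commit"

git remote add cluster-a "$base/cluster-a.git"
git remote add cluster-b "$base/cluster-b.git"

# "git remote" prints one remote name per line, so it can drive a loop.
pushed=0
for name in $(git remote); do
    git push --quiet "$name" master
    pushed=$((pushed + 1))
done
echo "pushed master to $pushed remotes"
```

With real remotes over ssh, each push in the loop may prompt for a password unless ssh keys are set up.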
This post has only covered one-way pushing to remote repositories. Of course you can pull from them, too! This back-and-forth sharing is the heart of standard git usage (distributed development), which would be a good topic for a future post.