Joe Maller.com

A web-focused Git workflow

After months of looking, struggling through Git-SVN glitches and letting things roll around in my head, I’ve finally arrived at a web-focused Git workflow that’s simple, flexible and easy to use.

Some key advantages:

  • Pushing remote changes automatically updates the live site
  • Server-based site edits won’t break history
  • Simple, no special commit rules or requirements
  • Works with existing sites, no need to redeploy or move files

Overview

The key idea in this system is that the web site exists on the server as a pair of repositories: a bare repository alongside a conventional repository containing the live site. Two simple Git hooks link the pair, automatically pushing and pulling changes between them.

The two repositories:

  • Hub is a bare repository. All other repositories will be cloned from this.
  • Prime is a standard repository; the live web site is served from its working directory.

Using the pair of repositories is simple and flexible. Remote clones with SSH access can update the live site with a simple git push to Hub. Any files edited directly on the server are instantly mirrored into Hub upon commit. The whole thing pretty much just works, whichever way it’s used.

Getting ready

Obviously, Git is required on the server and any local machines. My shared web host doesn’t offer Git, but it’s easy enough to install Git yourself.

If this is the first time running Git on your webserver, remember to set up your global configuration info. I set a different Git user.name to help distinguish server-based changes in project history.

$ git config --global user.name "Joe, working on the server"

Getting started

The first step is to initialize a new Git repository in the live web site directory on the server, then to add and commit all the site’s files. This is the Prime repository and working copy. Even if history exists in other places, the contents of the live site will be the baseline onto which all other work is merged.

$ cd ~/www
$ git init
$ git add .
$ git commit -m"initial import of pre-existing web files"

Initializing in place also means there is no downtime or need to re-deploy the site; Git just builds a repository around everything that’s already there.

With the live site now safely in Git, create a bare repository outside the web directory. This is Hub.

$ cd; mkdir site_hub.git; cd site_hub.git
$ git --bare init
Initialized empty Git repository in /home/joe/site_hub.git

Then, from inside Prime’s working directory, add Hub as a remote and push Prime’s master branch:

$ cd ~/www
$ git remote add hub ~/site_hub.git
$ git remote show hub
* remote hub
  URL: /home/joe/site_hub.git
$ git push hub master
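
The same steps can be rehearsed in a scratch directory before touching a real site. The sketch below is only a dry run under stated assumptions: the `SANDBOX` location, the placeholder `index.html` and the throwaway identity are made up for illustration, and it pushes whatever the current branch is called, because newer Git versions may not name the default branch `master`.

```shell
#!/bin/sh
# Dry run of the Prime/Hub setup in a throwaway directory (hypothetical paths).
set -e

SANDBOX=$(mktemp -d)
PRIME="$SANDBOX/www"
HUB="$SANDBOX/site_hub.git"

# Stand-in for the existing live site.
mkdir "$PRIME"
echo "<h1>hello</h1>" > "$PRIME/index.html"

# Prime: build a repository around the files that are already there.
cd "$PRIME"
git init -q
git config user.name "Sandbox user"
git config user.email "sandbox@example.invalid"
git add .
git commit -q -m "initial import of pre-existing web files"

# Hub: a bare repository outside the web directory.
git init -q --bare "$HUB"

# Link the pair and push Prime's current branch to Hub.
git remote add hub "$HUB"
BRANCH=$(git symbolic-ref --short HEAD)
git push -q hub "$BRANCH"

# Both repositories should now point at the same commit.
PRIME_HEAD=$(git rev-parse HEAD)
HUB_HEAD=$(git --git-dir="$HUB" rev-parse "refs/heads/$BRANCH")
echo "prime: $PRIME_HEAD"
echo "hub:   $HUB_HEAD"
```

If the two printed hashes match, the pair is linked and the sandbox can simply be deleted.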

Hooks

Two simple Git hook scripts keep Hub and Prime linked together.

An oft-repeated rule of Git is to never push into a repository that has a work tree attached to it. I tried it, and things do get weird fast. The hub repository exists for this reason. Instead of pushing changes to Prime from Hub, which wouldn’t affect the working copy anyway, Hub uses a hook script which tells Prime to pull changes from Hub.

post-update – Hub repository

This hook is called when Hub receives an update. The script changes directories to the Prime repository working copy, then pulls from Hub. Pushing changes doesn’t update a repository’s working copy, so it’s necessary to execute this from inside the working copy itself.

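#!/bin/sh

echo
echo "**** Pulling changes into Prime [Hub's post-update hook]"
echo

# Move into Prime's working copy; give up if it is missing.
cd "$HOME/www" || exit
# The hook runs with GIT_DIR set for Hub; clear it so Git finds Prime's repository.
unset GIT_DIR
git pull hub master

# Update the auxiliary info files used by dumb transports such as plain HTTP.
exec git update-server-info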
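
Git passes the names of the updated refs to post-update as arguments, so the hook could be narrowed to react only when master changes. The following is a sketch of that variation, not the hook used above: `updates_master` is a hypothetical helper name, the `$HOME/www` path is carried over from the article, and the final update-server-info step is left out for brevity.

```shell
#!/bin/sh
# Hypothetical variant of Hub's post-update hook: only pull into Prime when
# refs/heads/master is among the updated refs passed as arguments.

updates_master() {
    for ref in "$@"; do
        [ "$ref" = "refs/heads/master" ] && return 0
    done
    return 1
}

if updates_master "$@"; then
    # Same body as the original hook: pull from inside Prime's working copy.
    cd "$HOME/www" || exit
    unset GIT_DIR
    git pull hub master
fi
```

With this in place, a push that only touches another branch would leave the live site alone.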

post-commit – Prime repository

This hook is called after every commit to send the newly committed changes back up to Hub. Ideally, changes are rarely made live on the server, but automating this makes sure site history won’t diverge and create conflicts.

#!/bin/sh

echo
echo "**** pushing changes to Hub [Prime's post-commit hook]"
echo

git push hub

With this hook in place, all changes made to Prime’s master branch are immediately available from Hub. Other branches will also be cloned, but won’t affect the site. Because all remote repository access is via SSH URLs, only users with shell access to the web server will be able to push and trigger a site update.
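
Hook scripts only run if they sit in the repository's hooks directory under the exact hook name and are executable. For a bare repository like Hub that directory is `hooks/` at the top level; for Prime it is `.git/hooks/`. A small helper along these lines could handle the installation. `install_hook` is a hypothetical name, not a Git command, and it reads the script body from standard input.

```shell
#!/bin/sh
# install_hook GIT_DIR NAME: write the script on stdin to GIT_DIR/hooks/NAME
# and mark it executable. (Hypothetical helper, not part of Git.)
install_hook() {
    hooks_dir="$1/hooks"
    mkdir -p "$hooks_dir" || return 1
    cat > "$hooks_dir/$2" || return 1
    chmod +x "$hooks_dir/$2"
}
```

Assuming the two scripts were saved as `post-update.sh` and `post-commit.sh` (file names chosen here for illustration), usage might look like `install_hook ~/site_hub.git post-update < post-update.sh` for Hub and `install_hook ~/www/.git post-commit < post-commit.sh` for Prime.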

Conflicts

This repository-hook arrangement makes it very difficult to accidentally break the live site. Since every commit to Prime is automatically pushed to Hub, all conflicts will be immediately visible to the clones when pushing an update.

However, there are a few situations where Prime can diverge from Hub, which will require additional steps to fix. If an uncommitted edit leaves Prime in a dirty state, Hub’s post-update pull will fail with an “Entry ‘foo’ not uptodate. Cannot merge.” warning. Committing changes will clean up Prime’s working directory, and the post-update hook will then merge the un-pulled changes.
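
Whether Prime is in that dirty state can be checked before pushing. The helper below is a sketch with an assumed name, `is_dirty`; it relies on `git status --porcelain` printing nothing when tracked files are unchanged, and on `git -C`, which needs a reasonably recent Git.

```shell
#!/bin/sh
# is_dirty DIR: succeed if the work tree at DIR has uncommitted changes
# to tracked files. (Hypothetical helper name.)
is_dirty() {
    [ -n "$(git -C "$1" status --porcelain --untracked-files=no)" ]
}
```

On the server this could be used as `is_dirty ~/www && echo "Prime has uncommitted edits"`, for instance at the top of the post-update hook to print a clearer message before the pull is attempted.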

If a conflict occurs where changes to Prime can’t be merged with Hub, I’ve found the best solution is to push the current state of Prime to a new branch on Hub. The following command, issued from inside Prime, will create a remote “fixme” branch based on the current contents of Prime:

$ git push hub master:refs/heads/fixme

Once that’s in Hub, any remote clone can pull down the new branch and resolve the merge. Trying to resolve a conflict on the server would almost certainly break the site due to Git’s conflict markers.
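
From a remote clone, picking up the rescued branch might start like the sketch below. It assumes the clone's remote for Hub is named `origin` and that the branch was pushed as `fixme`; `start_fixme_merge` is a hypothetical helper name.

```shell
#!/bin/sh
# start_fixme_merge CLONE: in the clone at CLONE, fetch Hub and try to merge
# the rescued "fixme" branch into master. Returns non-zero when Git stops
# for manual conflict resolution. Assumes the remote is named "origin".
start_fixme_merge() {
    git -C "$1" fetch -q origin || return 1
    git -C "$1" checkout -q master || return 1
    git -C "$1" merge origin/fixme
}
```

If the merge stops on conflicts, they can be resolved and committed in the clone, away from the live site. A following `git push origin master` then sends the merged history to Hub, and since Prime's commits are already part of that history, the post-update pull should be able to fast-forward.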

Housekeeping

Prime’s .git folder is at the root level of the web site, and is probably publicly accessible. To protect the folder and prevent unwanted clones of the repository, add the following to your top-level .htaccess file to forbid web access:

# deny access to the top-level git repository:
RewriteEngine On
RewriteRule ^\.git - [F,L]

Troubleshooting

If you’re seeing this error when trying to push to a server repository:

git-receive-pack: command not found
fatal: The remote end hung up unexpectedly

Add export PATH=${PATH}:~/bin to your .bashrc file on the server. Thanks to Robert for finding and posting the fix.

45 Responses to “A web-focused Git workflow”

  • Why not use some git web interfaces, like gitweb or cgit?

  • @Jakub Unless I’m mistaken, Gitweb, cgit and similar projects are for hosting repositories while making source code visible on the web. I’m using Git to publish and manage web sites where the codebase isn’t and often shouldn’t be publicly visible. The Git web interfaces provide access to files and history in a repository, this workflow focuses instead on the contents of files in a repository.

  • This is very interesting. I manage a lot of sites (on shared hosting), I’ve been wanting to start using some kind of version control, and I’ve been looking at Git. But since I’m new to both version control in general and Git, there’s a lot to learn. This seems like a good starting point to experiment with.

  • Great post. I’m about to set up something similar using Mercurial, but most of the concepts with different DVCS are the same.

  • This is exactly what I needed, a reasonable way to keep git synced up to the website and get changes back from the website into git, thank you very much.

  • If you didn’t want to use a rewrite rule, you could also just deny access to .git folder.

    Order deny,allow
    deny from all

  • Very useful, I currently use something like this but with SVN. I’m switching to Git now so I’ll give your tutorial a try. Thanks!

  • Interesting, I’ve used a similar setup with SVN but as my project got bigger it was too slow to be practical.

  • Very Interesting. I have a similar workflow for Oddmuse wiki engine to/from git : http://www.foo.be/cgi-bin/wiki.pl/OddmuseGit .

  • Interesting article. Although it’s certainly easy to look it up ourselves, it could have been handy to include directions on how to set up hooks and to explain a bit what git-update-server-info does.

  • Just thought I’d mention that you can do a similar thing to this using GitHub as the “hub” element above, except that the hook is an HTTP callback (http://github.com/guides/post-receive-hooks).

  • Great write-up. Here’s my method of accomplishing the same thing: http://dmiessler.com/blog/using-git-to-maintain-your-website

  • Thanks Joe. I’m going to try this out as well. This example should get me started on getting a deploy strategy that will work for my setup.

  • Kickass, this works great. Been looking for this for a while…

  • Hey this is great.

    Has anyone considered committing the server DB to another repo & using hooks to version the DB in sync with the commits on the Prime repo?

  • @Drew: I don’t think versioning the DB file(s) is a good idea. There are tools (migrations) to accomplish this (at least for the DB schema), and they, on the other hand, are files you can version.

  • I finally got this working, thanks joe! Also I should note that I had to chmod post-commit and post-update to executable

    chmod +x /path_to/post-commit
    chmod +x /path_to/post-update

    • @Michael, the activation method for hooks changed with the Git 1.6.0 release. If you’re using an older version, you won’t have to rename the script, but as you point out, you will need to make it executable.

  • Hi, Thanks for this great article. Our websites are often talking to different systems such as DBs, LDAP etc. we cannot sync back directly to our PRIME from HUB and rather auto push changes to our TEST environment first then manually clone WWW-TEST to WWW-PROD. Even though your code might seem to be OK in HUB it doesn’t guarantee it’ll work with other systems (e.g. you forgot to reflect code changes on those systems). I stick to the golden rule “Always test before sending to production”.

    test        prod
    SYSTEMS     SYSTEMS
       |           |
    WWW-TEST -> WWW-PROD
       \
       HUB
      / | \
     dev team
    Cheers, D

  • Great tutorial. I’m quite a newbie on all of this. I’m up to the Hooks section. I don’t know how to create the shell scripts and get them to automatically execute? Could anyone point me in the right direction pls?
    Cheers :)

    • ps: I’m using Dreamhost shared hosting (if that helps).

  • @Fabian: I had the same question, and discovered that you put the hooks files in the .git/hooks directory of the repo you want it to run on.

    • haha yea, I managed to find it on the git manual page *chuckle*

  • Nice tutorial. It would be great if you added one more section: how to clone the remote repo on your local machine (git clone ssh://yoursite.com/~/site_hub.git). I was getting the “fatal: The remote end hung up unexpectedly” error when trying to execute the clone command, until I realized I was not specifying the full path on the remote server filesystem (the tilde referring to my home directory was crucial).

  • Great tutorial – I’ve been using your method for a while and am very happy with it. I have a question though about a new way I’m trying to use it which is causing me headaches (it’s probably because I don’t quite get git completely yet). Basically I’m working on a branch locally to implement some new functionality. And I’d like for this branch to be checked out to a different directory on my server (which is served up as a separate subdomain). How would I go about implementing this?

    Thanks!

    • @Kelvin,

      What you describe sounds a bit SVN-ish. A more Git-style solution, as opposed to just flipping branches, might be a local clone of your repository where you could create your alternate branch.

  • I’m in the process of setting up something along those lines, but I was wondering what the point of the “hub” is exactly… Couldn’t I just clone the live site directly? Is there any problem doing that? Thanks!

  • nevermind… should have read the whole post first ;) thanks!

  • Thanks a ton!! This really helped me out a lot :).

  • Is there any magical way to make any change to files auto-commit? I have some artists who use FTP to upload content to the site and I was curious if there was some nifty way to have Git auto-commit on any change in Prime.

    • The watch command can keep track of directory changes, or else just use cron to add and commit the directory regularly, if there are no changes there won’t be a commit.

  • Does this script work if the Prime repository is owned by a different user than the one that owns Hub?

  • If “git push” would work properly for non-bare destinations (as it is the case with e.g. Mercurial and also Darcs), the design could be simplified by merging “hub” and “prime”.

  • Nice post.
    I do something similar for work in progress previews.
    The difference is I push to a preview branch and the post update hook does:

    echo "PostUpdate Hook"
    echo $1 $2 $3

    case " $* " in
        *' refs/heads/preview '*)
            /home/client/bin/update-repo.sh
            ;;
    esac

    exec git-update-server-info

    The update-repo.sh is similar to your hook’s body.

    • Somewhat embarrassingly, I had completely missed the various hook parameters until reading your comment, I’m definitely going to be experimenting with these. Thanks!

      For reference, post-update hook documentation: “[Post-update] takes a variable number of parameters, each of which is the name of ref that was actually updated.”

      • Glad to be of help :)

        I find that pushing to a specific branch lets me use Hub to share with other devs and the design team without triggering deploys or setting yet another bare repo.
        I will incorporate triggering a deploy to production soon.

        Your trick to resolve conflicts in the Prime repo through a branch is neat, will be using that, too.

  • This page doesn’t seem to print very well. I have only tried in Firefox and Opera. I think it is down to the following in style.css:

    #main {
        overflow: hidden;
    }

    It seems to print okay when I disable this using Firebug or Dragonfly.

  • Hey Great article!

    I have an issue that when I pull to my webserver I can no longer edit/delete those files via FTP.

    Does Git lock them somehow?

    Thanks

    Paul

  • How are you testing if the changes you make are working and the site works the way you want it until pushing to Prime? Is your local working copy in a folder on a test server that you can run the site locally to test?

    • @darrell Yes, my local clone lives in a functional working environment which closely mirrors the live server’s configuration. I’ve also used a variation on this to have a parallel dev server for testing where my local branches push to different remotes. It’s very, very flexible and extremely forgiving.

  • If I wanted to have a staging version of the site, could I just add a “cd ~/stage/” and “git pull hub staging_branch” to the post-update hook? As well as a “cd ~/live/” & “git pull hub production_branch” for the live site?

    I think that would enable me to have a staging branch and a production branch both updated with the required changes.

  • I’m having a problem with this. The prime repo (the working tree) is getting the wrong permissions, it isn’t group writeable despite being set as a shared repository. Any ideas?

    • I haven’t run into that problem (yet?). Have you tried tweaking the core.sharedRepository setting in Git Config?

  • Great post. Exactly what I needed. Thank you!

  • Thank you for this tutorial! It is going to save me a lot of time and headaches. I followed the tutorial and have everything working.

    Like some others have mentioned in previous comments, our organization also likes to develop locally then push to a test or staging repo.

    When I start to think about how to integrate this functionality, my head starts to spin. Any chance someone could break it down for me?

    Thanks again!
