Joe Maller.com

A web-focused Git workflow

After months of looking, struggling through Git-SVN glitches and letting things roll around in my head, I’ve finally arrived at a web-focused Git workflow that’s simple, flexible and easy to use.

Some key advantages:

  • Pushing remote changes automatically updates the live site
  • Server-based site edits won’t break history
  • Simple, no special commit rules or requirements
  • Works with existing sites, no need to redeploy or move files

Overview

The key idea in this system is that the web site exists on the server as a pair of repositories: a bare repository alongside a conventional repository containing the live site. Two simple Git hooks link the pair, automatically pushing and pulling changes between them.

The two repositories:

  • Hub is a bare repository. All other repositories will be cloned from this.
  • Prime is a standard repository. The live web site is served from its working directory.

Using the pair of repositories is simple and flexible. Remote clones with SSH access can update the live site with a simple git push to Hub. Any files edited directly on the server are instantly mirrored into Hub upon commit. The whole thing pretty much just works, whichever way it’s used.

Getting ready

Obviously Git is required on the server and any local machines. My shared web host doesn’t offer Git, but it’s easy enough to install Git yourself.

If this is the first time running Git on your webserver, remember to set up your global configuration info. I set a different Git user.name to help distinguish server-based changes in project history.

$ git config --global user.name "Joe, working on the server"

Getting started

The first step is to initialize a new Git repository in the live web site directory on the server, then to add and commit all the site’s files. This is the Prime repository and working copy. Even if history exists in other places, the contents of the live site will be the baseline onto which all other work is merged.

$ cd ~/www
$ git init
$ git add .
$ git commit -m"initial import of pre-existing web files"

Initializing in place also means there is no downtime or need to re-deploy the site. Git just builds a repository around everything that’s already there.

With the live site now safely in Git, create a bare repository outside the web directory. This is Hub.

$ cd; mkdir site_hub.git; cd site_hub.git
$ git --bare init
Initialized empty Git repository in /home/joe/site_hub.git

Then, from inside Prime’s working directory, add Hub as a remote and push Prime’s master branch:

$ cd ~/www
$ git remote add hub ~/site_hub.git
$ git remote show hub
* remote hub
  URL: /home/joe/site_hub.git
$ git push hub master
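
One way to confirm that the push landed is to compare commit IDs on both sides. This is a minimal sketch: the function name in_sync is made up for illustration, and it assumes it is run from inside Prime with the hub remote added as above.

```shell
# Returns 0 when branch $2 on remote $1 points at the same commit as the
# local branch of the same name; returns 2 if the local branch is missing.
in_sync() {
    local_sha=$(git rev-parse -q --verify "refs/heads/$2") || return 2
    # ls-remote prints "<sha><TAB><ref>"; keep only the sha.
    remote_sha=$(git ls-remote "$1" "refs/heads/$2" | cut -f1)
    [ "$local_sha" = "$remote_sha" ]
}
```

From inside ~/www, something like `in_sync hub master && echo "Hub matches Prime"` would report whether the two agree.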

Hooks

Two simple Git hook scripts keep Hub and Prime linked together.

An oft-repeated rule of Git is to never push into a repository that has a work tree attached to it. I tried it, and things do get weird fast. The hub repository exists for this reason. Instead of pushing changes to Prime from Hub, which wouldn’t affect the working copy anyway, Hub uses a hook script which tells Prime to pull changes from Hub.

post-update – Hub repository

This hook is called when Hub receives an update. The script changes directories to the Prime repository working copy, then runs a pull from Hub. Pushing changes doesn’t update a repository’s working copy, so it’s necessary to execute this from inside the working copy itself.

#!/bin/sh

echo
echo "**** Pulling changes into Prime [Hub's post-update hook]"
echo

cd "$HOME/www" || exit
unset GIT_DIR
git pull hub master

exec git update-server-info

post-commit – Prime repository

This hook is called after every commit to send the newly committed changes back up to Hub. Ideally, changes are rarely made live on the server, but automating this makes sure site history won’t diverge and create conflicts.

#!/bin/sh

echo
echo "**** pushing changes to Hub [Prime's post-commit hook]"
echo

git push hub

With this hook in place, all changes made to Prime’s master branch are immediately available from Hub. Other branches will also be cloned, but won’t affect the site. Because all remote repository access is via SSH URLs, only users with shell access to the web server will be able to push and trigger a site update.
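
Before wiring hooks into a live site, the whole arrangement can be rehearsed in a scratch directory. The sketch below is only that, a rehearsal under stated assumptions: the temporary paths, the "laptop" clone and the commit identity are placeholders, and the hooks are trimmed versions of the two scripts above. Note the chmod lines, since Git only runs hooks that are executable.

```shell
#!/bin/sh
# Rehearsal of the Hub/Prime pair in a scratch directory (placeholder paths).
SCRATCH=$(mktemp -d) || exit 1
HUB="$SCRATCH/site_hub.git"
PRIME="$SCRATCH/www"
CLONE="$SCRATCH/laptop"

# Fixed identity so commits work without a global Git config.
GIT_AUTHOR_NAME="Rehearsal"; GIT_AUTHOR_EMAIL="rehearsal@example.com"
GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"; GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

# Prime: a normal repository built around the "live" files.
mkdir "$PRIME" && cd "$PRIME" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from the article
echo "hello" > index.html
git add . && git commit -q -m "initial import"

# Hub: a bare repository next to it, seeded from Prime.
git init -q --bare "$HUB"
git --git-dir="$HUB" symbolic-ref HEAD refs/heads/master
git remote add hub "$HUB"
git push -q hub master

# Hub's post-update hook: tell Prime to pull.
mkdir -p "$HUB/hooks"
cat > "$HUB/hooks/post-update" <<EOF
#!/bin/sh
cd "$PRIME" || exit
unset GIT_DIR
git pull -q hub master
EOF
chmod +x "$HUB/hooks/post-update"

# Prime's post-commit hook: push server-side commits back to Hub.
mkdir -p "$PRIME/.git/hooks"
cat > "$PRIME/.git/hooks/post-commit" <<'EOF'
#!/bin/sh
git push -q hub master
EOF
chmod +x "$PRIME/.git/hooks/post-commit"

# A "remote" clone edits the page and pushes to Hub.
git clone -q "$HUB" "$CLONE"
cd "$CLONE" || exit 1
echo "hello from the clone" > index.html
git commit -q -am "edit from clone"
git push -q origin master

# If the hooks fired, Prime's working copy now holds the clone's edit.
cat "$PRIME/index.html"
```

On a real setup the clone would come over SSH instead of a local path, for example from a URL like joe@webserver:site_hub.git (an assumption based on the names used in this post).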

Conflicts

This repository-hook arrangement makes it very difficult to accidentally break the live site. Since every commit to Prime is automatically pushed to Hub, all conflicts will be immediately visible to the clones when pushing an update.

However, there are a few situations where Prime can diverge from Hub which will require additional steps to fix. If an uncommitted edit leaves Prime in a dirty state, Hub’s post-update pull will fail with an “Entry ‘foo’ not uptodate. Cannot merge.” warning. Committing changes will clean up Prime’s working directory, and the post-update hook will then merge the un-pulled changes.
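
A dirty Prime can be spotted before it causes a failed pull. The following is a rough sketch, not part of the hooks above: the function name work_tree_is_clean is hypothetical, and it only looks at tracked files, since those are what an uncommitted edit would touch.

```shell
# Returns 0 when the work tree at $1 has no uncommitted changes to tracked
# files, 1 when it does, and 2 when $1 is not a usable Git work tree.
work_tree_is_clean() {
    changes=$(git -C "$1" status --porcelain --untracked-files=no) || return 2
    [ -z "$changes" ]
}
```

Run on the server as `work_tree_is_clean "$HOME/www" || echo "Prime has uncommitted edits"`, it gives a quick answer before anyone pushes to Hub.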

If a conflict occurs where changes to Prime can’t be merged with Hub, I’ve found the best solution is to push the current state of Prime to a new branch on Hub. The following command, issued from inside Prime, will create a remote “fixme” branch based on the current contents of Prime:

$ git push hub master:refs/heads/fixme

Once that’s in Hub, any remote clone can pull down the new branch and resolve the merge. Trying to resolve a conflict on the server would almost certainly break the site due to Git’s conflict markers.
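
On the clone's side, picking up the fixme branch might look roughly like this. It's a sketch under assumptions: the clone's remote is named origin (the default for a clone of Hub), merge_fixme is a made-up helper name, and a merge that actually conflicts still has to be resolved by hand.

```shell
# Run inside a clone of Hub. Merges Hub's "fixme" branch into master,
# pushes the result and deletes the temporary branch when the merge is clean.
merge_fixme() {
    git fetch -q origin || return
    git checkout -q master || return
    git merge -q --ff-only origin/master || return   # catch up with Hub first
    if git merge -q --no-edit origin/fixme; then
        git push -q origin master && git push -q origin :fixme
    else
        echo "Conflicts: fix the marked files, git add, git commit, then push." >&2
        return 1
    fi
}
```

After a conflicted merge is resolved and committed locally, the same two pushes at the end (master, then the deletion of fixme) finish the job.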

Housekeeping

Prime’s .git folder is at the root level of the web site, and is probably publicly accessible. To protect the folder and prevent unwanted clones of the repository, add the following to your top-level .htaccess file to forbid web access:

# deny access to the top-level git repository:
RewriteEngine On
RewriteRule \.git - [F,L]
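
Because the pattern \.git is unanchored, the rule refuses any requested path that contains ".git", not only the top-level folder. The sketch below approximates that matching with grep -E to show which paths would be caught. It is an emulation for illustration only (would_forbid is a made-up name), not a test against Apache itself.

```shell
# Prints "forbidden" when path $1 contains ".git" anywhere, as the
# unanchored pattern \.git would match it; prints "allowed" otherwise.
would_forbid() {
    if printf '%s\n' "$1" | grep -Eq '\.git'; then
        echo "forbidden"
    else
        echo "allowed"
    fi
}
```

So .git/config and .gitignore are both refused, and so is a page that merely has ".git" inside its name, which is worth knowing before naming files.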

Troubleshooting

If you’re seeing this error when trying to push to a server repository:

git-receive-pack: command not found
fatal: The remote end hung up unexpectedly

Add export PATH=${PATH}:~/bin to your .bashrc file on the server. Thanks to Robert for finding and posting the fix.


How to install Git on a shared host

(regularly updated)

Installing Git on a shared hosting account is simple. The installation is fast and, like most things Git, it just works.

This is a basic install without documentation. My main goal is to be able to push changes from remote repositories into the hosted repository, which also serves as the source directory of the live website, as in the web-focused Git workflow.

Prerequisites

The only two things you absolutely must have are shell access to the account and permission to use GCC on the server. Check both with the following command:

$ ssh joe@webserver 'gcc --version'
gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50)
[...]

If GCC replies with a version number, you should be able to install Git. SSH into your server and let’s get started!

If you see something like /usr/bin/gcc: Permission denied, you don’t have access to the GCC compiler and will not be able to build the Git binaries from source. Find another hosting company.
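
The same check can be widened to the other commands used later on this page. This is a small sketch with a hypothetical helper name, missing_tools. It only tests that each command is found on $PATH, so a build could still fail for other reasons, such as a missing library.

```shell
# Prints each named command that cannot be found on $PATH;
# prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

On the server, `missing_tools gcc make curl tar` covers the commands this install relies on.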

Update your $PATH

None of this will work if you don’t update the $PATH environment variable. In most cases, this is set in .bashrc. Using .bashrc instead of .bash_profile updates $PATH for both interactive and non-interactive sessions, which is necessary for remote Git commands. Edit .bashrc and add the following line:

export PATH=$HOME/bin:$PATH

Be sure ‘~/bin’ is at the beginning since $PATH is searched from left to right; to execute local binaries first, their location has to appear first. Depending on your server’s configuration there could be a lot of other stuff in there, including duplicates.

Double-check this by sourcing the file and echoing $PATH:

$ source ~/.bashrc
$ echo $PATH
/home/joe/bin:/usr/local/bin:/bin:/usr/bin

Verify that the remote path was updated by sending a remote command like this (from another connection):

$ ssh joe@webserver 'echo $PATH'
/home/joe/bin:/usr/local/bin:/bin:/usr/bin
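
Since .bashrc can be sourced more than once, the plain export line can stack up duplicate entries. One possible guard is sketched below. The helper name path_prepend is invented for this example, and it works on any colon-separated list passed as its second argument.

```shell
# Prints the PATH-style list $2 with directory $1 moved to the front,
# dropping any copies of $1 that were already in the list.
path_prepend() {
    dir=$1
    rest=$(printf '%s' "$2" | tr ':' '\n' | grep -Fxv "$dir" | tr '\n' ':')
    rest=${rest%:}
    if [ -n "$rest" ]; then
        printf '%s\n' "$dir:$rest"
    else
        printf '%s\n' "$dir"
    fi
}
```

In .bashrc this could replace the export line with `export PATH=$(path_prepend "$HOME/bin" "$PATH")`, which keeps ~/bin first no matter how often the file is sourced.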

Note: Previous iterations of this page installed into the ~/opt directory. Following current Git conventions, I’m now installing into the default ~/bin.

Installing Git

SSH into your webserver. I created a source directory to hold the files and make cleanup easier:

$ cd 
$ mkdir src
$ cd src

Grab the most recent source tarball from GitHub. When this post was updated, the current Git release was version 1.7.10.1:

$ curl -LO https://github.com/git/git/tarball/v1.7.10.1

Untar the archive and cd into the new directory:

$ tar -xzvf v1.7.10.1
$ cd git-git-9dfad1a
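
The directory name inside the tarball changes with every release, so it may be easier to ask the archive than to guess. A minimal sketch, with tarball_topdir as a made-up helper name, assuming the archive unpacks into a single top-level directory:

```shell
# Prints the name of the top-level directory inside a gzipped tarball.
tarball_topdir() {
    # List the archive and keep the first path component of the first entry.
    tar -tzf "$1" | sed -n '1s|/.*||p'
}
```

With that defined, `cd "$(tarball_topdir v1.7.10.1)"` would stand in for typing the directory name by hand.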

By default, Git installs into ~/bin, which is perfect for shared hosting. Earlier versions required adding a prefix to the configure script, but none of that is necessary anymore. If you do need to change the install location of Git, just specify a prefix to the make command as described in Git’s INSTALL file.

With all that taken care of, installation is simple:

$ make
$ make install
[lots of words...]

That should be it. Check your installed version like this:

$ git --version
git version 1.7.10.1

It’s now safe to delete the src folder containing the downloaded tarball and source files.

My preferred shared hosting providers are A2 Hosting and WebFaction.


iTransmogrify update

The main iTransmogrify! script has been updated with a bunch of new functionality:

  • YouTube.com pages are now supported (see notes)
  • Daily Motion videos are supported for new-style urls (see notes)
  • Kink.fm player and listings page are now supported
  • Sideload.com play links are now supported
  • WordPress Blogs using Viper Video QuickTags are supported for YouTube
  • All media links now open into new windows, so you won’t have to re-transmogrify a page with several media files after playing one. (Note that this is dependent on the iPhone; sometimes it will blank other windows.)
  • Some content in iframes will now be converted.
  • MotionBox, Viddler and Vimeo embedded videos, while not supporting iPod/iPhone alternate content, now link to their respective detail pages.

The main bookmarklet code was updated. This was necessary to work around a frustrating oversight with Google Code hosting. Everyone will need to update their bookmarklet; in the future, all updates will be automatic.

This has turned out to be far bigger than I ever imagined. Thank you to everyone for the links, feedback, compliments and ideas.

Known issues

LiveJournal pages redefine a bunch of core JavaScript functionality, breaking all kinds of stuff including jQuery. Additionally, they’re serving media in an iframe from a different domain, meaning JavaScript couldn’t access the frame even if they hadn’t broken it.

Notes

YouTube Internal pages
Because of a strange iPhone quirk, these links all need to go through the Google redirector, otherwise they bounce back to uk.youtube.com instead of playing.

DailyMotion
DailyMotion videos using new-style URLs, which are usually about six digits long, work correctly. Videos using the old-style alphanumeric ID do not work yet. I’m probably just going to resort to building a simple web service to grab those. Additionally, there is no way to programmatically access the mp4 alternate content URL, so I just linked to their iPhone pages. I’d prefer embedding QuickTime directly, but it’s just not possible yet.


