Joe Maller.com

A more Git-friendly WordPress

So a few months ago I mentioned wanting to get away from WordPress and PHP. It’s not going very well.

WordPress keeps sucking me back in. A favor here, a quick job there. Next thing I know, I swear I can’t remember how to iterate an array in Ruby or Python.

While sitting down to work on a WordPress project still fills me with dread, I did recently discover a few things which slightly alleviate my misery.

My favorite, as described here by David Winter, is the ability to move the wp-content directory out of the standard WordPress hierarchy. Aside from the database, wp-content holds basically everything which makes a site unique: themes, plugins, uploads, etc. With those out of the way, all of the core WordPress application code can be removed from the site’s Git repo and stored as a submodule (pulling from the WordPress GitHub mirror), making version control a lot cleaner and easier and giving me one less thing to think about.

This directory layout should really be the default. The WordPress folder ought to be a sacrosanct library, only changing when the application is upgraded. The ability to move wp-content was added back in version 2.6, released in July 2008. I wish I’d learned about this sooner.

I’m also doing something inspired by Mark Jaquith’s WordPress local dev tips, which allows me to keep my wp-config.php file versioned and outside of the wordpress directory.

Because it’s a really bad idea to keep password files in version control, I created a wp-config-db-sample.php file containing placeholders for the database login information:

That file gets copied to wp-config-db.php, populated with the appropriate settings (and added to .gitignore), then included by changing the top of wp-config.php like this:
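A rough shell sketch of the copy-and-ignore step follows. The placeholder values in the sample file are assumptions rather than the original file's contents, and everything runs in a scratch directory so no real site is touched.

```shell
# Sketch only: work in a scratch directory instead of a real site.
site_dir=$(mktemp -d)
cd "$site_dir" || exit 1

# Hypothetical sample file with placeholder credentials (safe to commit).
cat > wp-config-db-sample.php <<'EOF'
<?php
define('DB_NAME',     'database_name_here');
define('DB_USER',     'username_here');
define('DB_PASSWORD', 'password_here');
define('DB_HOST',     'localhost');
EOF

# Copy the sample to the real file, which is then filled in by hand.
cp wp-config-db-sample.php wp-config-db.php

# Keep the populated copy out of version control.
echo 'wp-config-db.php' >> .gitignore
```

Only the sample file is ever committed; each server or developer machine gets its own private wp-config-db.php.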


Faster and easier Gitweb installation

The idea of using make to build Gitweb isn’t just excessively complex, it’s also mostly unnecessary. Building gitweb.cgi from gitweb.perl only changes 19 of the source file’s 6734 lines (about 0.3%).

Fact is, to get Gitweb working only one line needs changing. After the following edit, all local configuration values can be loaded from a simple config file.

On line 546, insert the name of your config file:

-our $GITWEB_CONFIG = $ENV{'GITWEB_CONFIG'} || "++GITWEB_CONFIG++"; 
+our $GITWEB_CONFIG = $ENV{'GITWEB_CONFIG'} || "gitweb_config.perl";
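The same one-line change could also be scripted with sed instead of edited by hand. This is a minimal sketch under the assumption that the source contains the placeholder line exactly as shown in the diff; it works on a scratch file holding just that line, and the output name gitweb.cgi is only an example.

```shell
# Scratch copy containing only the relevant line from gitweb.perl.
work=$(mktemp -d)
cat > "$work/gitweb.perl" <<'EOF'
our $GITWEB_CONFIG = $ENV{'GITWEB_CONFIG'} || "++GITWEB_CONFIG++";
EOF

# Swap the build-time placeholder for the config file name and
# write the result out as gitweb.cgi.
sed 's/"++GITWEB_CONFIG++"/"gitweb_config.perl"/' "$work/gitweb.perl" > "$work/gitweb.cgi"

cat "$work/gitweb.cgi"
```

Matching on the placeholder text rather than on line 546 keeps the edit working if the line number shifts between Gitweb versions.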

A set of fully documented configuration files is available in the Simple Gitweb Config project on GitHub, to help get things up and running quickly.



Convert Git branches to remote tracking branches

Update: As of Git 1.7.0, converting existing branches to tracking branches got a whole lot easier. git push now has a -u flag which will set up tracking based on a successful push.

$ git push -u hub master
Branch master set up to track remote branch master from hub.

For reference, here’s the original post:

There are two ways to convert an existing branch to a remote tracking branch: using git config or directly editing the .git/config file.

In both of these examples, the local and remote branches are named “master”. The remote repository is “hub”.

git config commands

$ git config branch.master.remote hub
$ git config branch.master.merge refs/heads/master

editing .git/config

All the git config commands do is add the following to .git/config; editing the file manually has the same result.

[branch "master"]
    remote = hub
    merge = refs/heads/master

What would be nice is an additional config command, branch.<name>.track, which would split a full refspec, sending the relevant parts to the remote and merge commands.
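Until something like that exists, a small shell function can stand in for it. This is a hypothetical helper, not a Git command: the name git_track and its argument order are made up for illustration, and it simply wraps the two git config calls shown above. The demonstration runs in a throwaway repository.

```shell
# Hypothetical helper (not part of Git): make an existing local branch
# track a branch on a remote.
# Usage: git_track <local-branch> <remote> [<remote-branch>]
git_track() {
    local_branch=$1
    remote=$2
    remote_branch=${3:-$1}   # default: same branch name on the remote
    git config "branch.$local_branch.remote" "$remote" &&
    git config "branch.$local_branch.merge" "refs/heads/$remote_branch"
}

# Demonstration in a scratch repository.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
git_track master hub

# Show what was written to .git/config.
git config --get branch.master.remote
git config --get branch.master.merge
```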

link: Dec 10, 2009 1:19 pm
posted in: misc.

Git error: index file is too small

This error popped up recently while trying to mirror a git repository onto another server. Attempting to clone the repository yielded hundreds of errors like these two:

./objects/pack/._pack-de7d2e641423ddac38ff369dae6afad9f02d4397.idx is too small
error: index file /home/joe/site/.git/objects/pack/._pack-de7d2e641423ddac38ff369dae6afad9f02d4397.idx is too small

Not a lot has been written about this error, and I don’t make any claims to understanding Git’s internals enough to know whether or not that was a very bad thing or just cosmetic. But playing it safe, I assumed the clone had failed and the repository was compromised.

On my machines, I’m running up-to-date 1.6.x Git binaries, but the server throwing these errors is running 1.5.4.1. I suspected a version incompatibility, but googling for “git” and any variant of “version” is epic futility (hint: google “backwards compatible” instead). Here’s what I found:

Sometime around version 1.5.0, Git’s repository format changed. While the notes indicated the server version of Git should have supported this, a Git development patch and the Git 1.6.0 release notes convinced me to try:

By default, packfiles created with this version uses delta-base-offset encoding introduced in v1.4.4. Pack idx files are using version 2 that allows larger packs and added robustness thanks to its CRC checking, introduced in v1.5.2 and v1.4.4.5. If you want to keep your repositories backwards compatible past these versions, set repack.useDeltaBaseOffset to false or pack.indexVersion to 1, respectively.

In the local repository’s config, I set repack.usedeltabaseoffset to false and then repacked the repository:

git config repack.usedeltabaseoffset false
git repack -a -d

This appears to have fixed the problem. Cloning the repository worked perfectly and everything seems to be working smoothly now.
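The quoted release notes name a second setting, pack.indexVersion, which applies to the .idx files themselves. The following is a sketch under the assumption that both settings are wanted; it runs in an empty scratch repository, so the repack has nothing to do there, and I have not verified that the second setting is needed for this particular error.

```shell
# Scratch repository so no real project is modified.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Both backwards-compatibility settings named in the Git 1.6.0 release notes.
git config repack.useDeltaBaseOffset false
git config pack.indexVersion 1

# Rewrite all packs with the settings above
# (-a: repack everything, -d: delete the now-redundant packs).
git repack -a -d -q

# Confirm what was stored in the repository's config.
git config --get repack.useDeltaBaseOffset
git config --get pack.indexVersion
```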

link: Mar 08, 2009 2:17 am
posted in: misc.

A web-focused Git workflow

After months of looking, struggling through Git-SVN glitches and letting things roll around in my head, I’ve finally arrived at a web-focused Git workflow that’s simple, flexible and easy to use.

Some key advantages:

  • Pushing remote changes automatically updates the live site
  • Server-based site edits won’t break history
  • Simple, no special commit rules or requirements
  • Works with existing sites, no need to redeploy or move files

Overview

The key idea in this system is that the web site exists on the server as a pair of repositories: a bare repository alongside a conventional repository containing the live site. Two simple Git hooks link the pair, automatically pushing and pulling changes between them.

The two repositories:

  • Hub is a bare repository. All other repositories will be cloned from this.
  • Prime is a standard repository; the live web site is served from its working directory.

Using the pair of repositories is simple and flexible. Remote clones with SSH access can update the live site with a simple git push to Hub. Any files edited directly on the server are instantly mirrored into Hub upon commit. The whole thing pretty much just works, whichever way it’s used.

Getting ready

Obviously Git is required on the server and any local machines. My shared web host doesn’t offer Git, but it’s easy enough to install Git yourself.

If this is the first time running Git on your webserver, remember to set up your global configuration info. I set a different Git user.name to help distinguish server-based changes in project history.

$ git config --global user.name "Joe, working on the server"

Getting started

The first step is to initialize a new Git repository in the live web site directory on the server, then to add and commit all the site’s files. This is the Prime repository and working copy. Even if history exists in other places, the contents of the live site will be the baseline onto which all other work is merged.

$ cd ~/www
$ git init
$ git add .
$ git commit -m"initial import of pre-existing web files"

Initializing in place also means there is no downtime or need to re-deploy the site; Git just builds a repository around everything that’s already there.

With the live site now safely in Git, create a bare repository outside the web directory. This is Hub.

$ cd; mkdir site_hub.git; cd site_hub.git
$ git --bare init
Initialized empty Git repository in /home/joe/site_hub.git

Then, from inside Prime’s working directory, add Hub as a remote and push Prime’s master branch:

$ cd ~/www
$ git remote add hub ~/site_hub.git
$ git remote show hub
* remote hub
  URL: /home/joe/site_hub.git
$ git push hub master

Hooks

Two simple Git hook scripts keep Hub and Prime linked together.

An oft-repeated rule of Git is to never push into a repository that has a work tree attached to it. I tried it, and things do get weird fast. The hub repository exists for this reason. Instead of pushing changes to Prime from Hub, which wouldn’t affect the working copy anyway, Hub uses a hook script which tells Prime to pull changes from Hub.

post-update – Hub repository

This hook is called when Hub receives an update. The script changes directories to the Prime repository working copy, then runs a pull from Hub. Pushing changes doesn’t update a repository’s working copy, so it’s necessary to execute this from inside the working copy itself.

#!/bin/sh

echo
echo "**** Pulling changes into Prime [Hub's post-update hook]"
echo

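# Quote the path so it survives spaces; stop if Prime is missing.
cd "$HOME/www" || exit
# GIT_DIR still points at Hub inside this hook; clear it so Git uses Prime.
unset GIT_DIR
git pull hub master

# Dashed commands like git-update-server-info are no longer on the default PATH in newer Git.
exec git update-server-info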

post-commit – Prime repository

This hook is called after every commit to send the newly committed changes back up to Hub. Ideally, it’s not common to make changes live on the server, but automating this makes sure site history won’t diverge and create conflicts.

#!/bin/sh

echo
echo "**** pushing changes to Hub [Prime's post-commit hook]"
echo

git push hub

With this hook in place, all changes made to Prime’s master branch are immediately available from Hub. Other branches will also be cloned, but won’t affect the site. Because all remote repository access is via SSH urls, only users with shell access to the web server will be able to push and trigger a site update.
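One detail worth spelling out: Git only runs hook scripts that are marked executable. Here is a minimal sketch of installing Hub's hook, based on the script shown above. The scratch path stands in for ~/site_hub.git and is an assumption for demonstration purposes; the same chmod step would apply to Prime's post-commit hook in ~/www/.git/hooks.

```shell
# Scratch stand-in for Hub; on the server this would be ~/site_hub.git.
hub=$(mktemp -d)/site_hub.git
git init -q --bare "$hub"
mkdir -p "$hub/hooks"   # usually created by git init already

# Write Hub's post-update hook.
cat > "$hub/hooks/post-update" <<'EOF'
#!/bin/sh
cd "$HOME/www" || exit
unset GIT_DIR
git pull hub master
exec git update-server-info
EOF

# Git silently skips hooks that are not executable.
chmod +x "$hub/hooks/post-update"
ls -l "$hub/hooks/post-update"
```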

Conflicts

This repository-hook arrangement makes it very difficult to accidentally break the live site. Since every commit to Prime is automatically pushed to Hub, all conflicts will be immediately visible to the clones when pushing an update.

However, there are a few situations where Prime can diverge from Hub which will require additional steps to fix. If an uncommitted edit leaves Prime in a dirty state, Hub’s post-update pull will fail with an “Entry ‘foo’ not uptodate. Cannot merge.” warning. Committing changes will clean up Prime’s working directory, and the post-update hook will then merge the un-pulled changes.

If a conflict occurs where changes to Prime can’t be merged with Hub, I’ve found the best solution is to push the current state of Prime to a new branch on Hub. The following command, issued from inside Prime, will create a remote “fixme” branch based on the current contents of Prime:

$ git push hub master:refs/heads/fixme

Once that’s in Hub, any remote clone can pull down the new branch and resolve the merge. Trying to resolve a conflict on the server would almost certainly break the site due to Git’s conflict markers.

Housekeeping

Prime’s .git folder is at the root level of the web site, and is probably publicly accessible. To protect the folder and prevent unwanted clones of the repository, add the following to your top-level .htaccess file to forbid web access:

# deny access to the top-level git repository:
RewriteEngine On
RewriteRule \.git - [F,L]

Troubleshooting

If you’re seeing this error when trying to push to a server repository:

git-receive-pack: command not found
fatal: The remote end hung up unexpectedly

Add export PATH=${PATH}:~/bin to your .bashrc file on the server. Thanks to Robert for finding and posting the fix.



How to install Git on a shared host

(regularly updated)

Installing Git on a shared hosting account is simple. The installation is fast and, like most things Git, it just works.

This is a basic install without documentation. My main goal is to be able to push changes from remote repositories into the hosted repository, which also serves as the source directory of the live website. Like this.

Prerequisites

The only two things you absolutely must have are shell access to the account and permission to use GCC on the server. Check both with the following command:

$ ssh joe@webserver 'gcc --version'
gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50)
[...]

If GCC replies with a version number, you should be able to install Git. SSH into your server and let’s get started!

If you see something like /usr/bin/gcc: Permission denied you don’t have access to the GCC compiler and will not be able to build the Git binaries from source. Find another hosting company.

Update your $PATH

None of this will work if you don’t update the $PATH environment variable. In most cases, this is set in .bashrc. Using .bashrc instead of .bash_profile updates $PATH for interactive and non-interactive sessions, which is necessary for remote Git commands. Edit .bashrc and add the following line:

export PATH=$HOME/bin:$PATH

Be sure ‘~/bin’ is at the beginning since $PATH is searched from left to right; to execute local binaries first, their location has to appear first. Depending on your server’s configuration there could be a lot of other stuff in there, including duplicates.
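If duplicates are a concern, a guarded version of the same line could go in .bashrc instead. This is a sketch, not a required step: the function name prepend_path is made up for illustration, and it only prepends a directory when that directory is not already listed in $PATH.

```shell
# Hypothetical helper: put a directory at the front of PATH,
# but only if it is not already listed somewhere in PATH.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, leave PATH alone
        *) PATH="$1:$PATH" ;;        # otherwise put it first
    esac
}

prepend_path "$HOME/bin"
export PATH
echo "$PATH"
```

Note that a directory already listed later in $PATH is left where it is, so this avoids duplicates but does not move an existing entry to the front.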

Double-check this by sourcing the file and echoing $PATH:

$ source ~/.bashrc
$ echo $PATH
/home/joe/bin:/usr/local/bin:/bin:/usr/bin

Verify that the remote path was updated by sending a remote command like this (from another connection):

$ ssh joe@webserver 'echo $PATH'
/home/joe/bin:/usr/local/bin:/bin:/usr/bin

Note: Previous iterations of this page installed into the ~/opt directory. Following current Git conventions, I’m now installing into the default ~/bin.

Installing Git

SSH into your webserver. I created a source directory to hold the files and make cleanup easier:

$ cd 
$ mkdir src
$ cd src

Grab the most recent source tarball from GitHub. When this post was updated, the current Git release was version 1.7.10.1:

$ curl -LO https://github.com/git/git/tarball/v1.7.10.1

Untar the archive and cd into the new directory:

$ tar -xzvf v1.7.10.1
$ cd git-git-9dfad1a

By default, Git installs into ~/bin which is perfect for shared hosting. Earlier versions required adding a prefix to the configure script (like this), but none of that is necessary anymore. If you do need to change the install location of Git, just specify a prefix to the Make command as described in Git’s INSTALL file.

With all that taken care of, installation is simple:

$ make
$ make install
[lots of words...]

That should be it. Check your installed version like this:

$ git --version
git version 1.7.10.1

It’s now safe to delete the src folder containing the downloaded tarball and source files.

My preferred shared hosting providers are A2 Hosting and WebFaction.


Switching to Git

I resisted Git for a long time. It seemed a cultish thing with devout vocal followers and frequent mentions on Digg. I figured it would probably be a passing fad. I was wrong.

I drank the Kool-Aid and damn, it was tasty.

The first important thing to realize, especially when starting out and considering switching, is that Git works with existing Subversion repositories. That means the transition can be as simple as learning a few new commands; your repositories and servers don’t need to change. Other team members don’t need to switch. If I had any other team members, this would be huge.

Of course, once you start using Git and see its potential, you’ll probably switch everything, convert your repositories and feel superior whenever you have to work with a Subversion hosted project.

Ultimately, Git is a revision control system and is fundamentally about preserving and protecting your working history. So feel free to experiment; it’s really hard to mess up.

It took me an afternoon to grasp the basic concepts of Git, or at least enough to use it in dumb-SVN mode. After about a week, I started to feel fairly comfortable.

I read a small mountain of weblog posts, documentation and the like; here are the ones I found most valuable:

And definitely watch Linus Torvalds’ Google talk about Git. It won’t tell you how to use Git, but he’s funny and the talk is worth seeing.

Getting Git

While Git itself is easy to compile from source, getting the documentation manpages built on Mac OS X Leopard is a pain in the butt. I futzed around with it a little, but gave up halfway through installing a seemingly endless line of dependencies. The Mac binary installer works perfectly well and is kept up to date. I’d rather spend my time using Git than installing it.

Using Git

I was blown away by some of the stuff Git can do.

In the Google talk Linus pointed out how all SCM packages could do branching and branching wasn’t a problem. Merging was. Merging branches with Git is leprechauns and unicorns. It’s almost too easy. After a few days I found myself branching all over the place, constantly creating new branches to test any idea I was having. Folding those branches back in was almost always quick and painless.

Interactive line-by-line commits mean it’s possible to commit single lines from a file instead of the whole thing. This is great if you happen to go off on a bender and change a zillion different things between commits.

Cherry-picking is also great, and saved my butt when I created a pretzel of branch history. Using git-cherry-pick I was able to straighten out my history completely within Git using Git’s tools with no loss of work.

Conclusion

I started writing this post at the end of May; it’s now early July and I’ve been using Git for a few months with no regrets. The only time I hold my breath is when I’m pushing changes back to SVN, though it hasn’t glitched on me since early June. The speed and flexibility of Git are a constant pleasure to work with. The pain of merging branches, which used to take entire afternoons with SVN, has almost become an afterthought.