Integrating Gitweb Output Into A WordPress Page

One of the known stumbling blocks to converting my site to WordPress was how to incorporate gitweb so that its output appears in the middle of a page, formatted to at least look like a normal page. The problem occurs because gitweb takes over managing the output and writes a complete HTML page, with its own header and footer. You are allowed to provide your own site_header.html and site_footer.html files, but these are inserted just after the opening <body> tag and just before the closing </body> tag. There is no way you can add to the header.

One approach to this, and the one I took on the previous version of this site, was to embed gitweb within an <iframe>, which effectively sandboxes gitweb’s output within the <iframe> context. Unfortunately, the iframe needs to change size whenever the quantity of output changes, and despite some JavaScript trying to detect this, it didn’t seem to resize automatically. I often found the output was chopped off (with a scroll bar provided) even when my full page was nowhere near full screen.

I have taken a different approach this time. In outline, I have created a holding page with the permalink “/software/” to hold the repository code, and then created a special template for my site theme which, when called, embeds output from gitweb into the page. There are a few tweaks to ensure that the header is moved to the correct place and that links within the output are correct. Down to the detail.

The first step is to set up Apache to serve gitweb at a known location. I do this with the following inside the Apache virtual host directive:

ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
Alias /gitweb/ /usr/share/gitweb/

<Directory /usr/share/gitweb/>
  Order allow,deny
  allow from all
</Directory>

<Directory /usr/lib/cgi-bin>
  <Files gitweb.cgi>
    Options ExecCGI FollowSymLinks
    Order allow,deny
    Allow from all
    SetEnv GITWEB_CONFIG /etc/gitweb.conf
  </Files>
</Directory>

The next step is to adjust gitweb.conf (as shown above, this is in /etc) to display the correct links. This can be done as follows:

$my_uri = "/software/";
$my_url = $my_uri;
# target of the home link on top of all pages
$home_link = $my_uri || "/";

The important element is the reference to the “/software/” uri.

We now need to prepare WordPress to accept this. First, create a page with a permalink of “/software/”. This can start with whatever content you want, but between its contents and any comments we will be inserting gitweb output. We prepare for that by creating a custom page template in the theme. I have called mine git.php, but the name is not relevant. The important thing is to turn it into a custom template by including:

<?php
/**
 * Template Name: Git Repository
 *
 * This is a custom page that tries to add a git repository by default
 */

at the head of the page.

It is necessary to provide the correct output in the page header. My approach was to create the function inside my git.php file and test for its existence in the functions.php function that outputs header info (so that we only output the additional stylesheet requests on the correct page). This routine was fairly simple:

function czen_gitweb_header() {
  /* This routine outputs the headers that gitweb normally would, but
   * isn't, because we are stripping them off
   */
  ?><meta name="robots" content="index, nofollow"/>
  <link rel="stylesheet" type="text/css" href="/gitweb/gitweb.css"/>
  <link rel="alternate" title="Git projects list" href="/cgi-bin/gitweb.cgi?a=project_index" type="text/plain; charset=utf-8" />
  <link rel="alternate" title="Git projects feeds" href="/cgi-bin/gitweb.cgi?a=opml" type="text/x-opml" />

  <?php
}
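
The existence test on the functions.php side is not shown above, so here is a minimal sketch of how it might look. The name czen_extra_head is hypothetical, and I am assuming the function is attached to WordPress's wp_head action; the guard around add_action() is only there so the snippet also loads outside WordPress.

```php
<?php
// Hypothetical functions.php helper; the name czen_extra_head is an assumption.
function czen_extra_head() {
    // czen_gitweb_header() only exists when the Git Repository template
    // (git.php) has been loaded, so every other page skips the extra tags.
    if (function_exists('czen_gitweb_header')) {
        czen_gitweb_header();
    }
}

// Inside WordPress, run the helper while the <head> section is being written.
if (function_exists('add_action')) {
    add_action('wp_head', 'czen_extra_head');
}
```

This relies on czen_gitweb_header() being declared at the top level of git.php, so that it is already defined by the time the template calls get_header().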

For the rest of the page, I used page.php from the Twenty Ten theme as a guide. Just before it calls comments_template() I added the following:

<div id="gitweb-output">
<?php
  // Retrieve output from gitweb, passing the current query string through
  $repository = file_get_contents(site_url().'/cgi-bin/gitweb.cgi?'.$_SERVER['QUERY_STRING']);
  /*
  * We are trying to extract the text between the body tags from this html page, but we also have to switch the links for
  * rss feeds to represent the raw gitweb output. This adds some complication
  */
  if(preg_match('#<body>(.*)<div class="page_footer">(.*?)\/software\/(.*?)\/software\/(.*?)<\/body>#sm',$repository,$match)) {
    echo $match[1],'<div class="page_footer">',$match[2],'/cgi-bin/gitweb.cgi',$match[3],'/cgi-bin/gitweb.cgi',$match[4];
    unset($match);
  }
  unset($repository);
?></div>

As you can see, I am fetching ALL of gitweb’s output using file_get_contents() and then passing it through a regular expression. This does two things. It strips off everything before the opening <body> tag and after the closing </body> tag (including the tags themselves), and it also separates out the two links to “/software/” that appear in gitweb’s page footer and replaces them with “/cgi-bin/gitweb.cgi”. These (and the equivalents in the header) define feeds which don’t need the WordPress infrastructure around them.
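
To see what the regular expression does without a live gitweb install, here is a small standalone sketch that runs the same pattern against a made-up page. The function name czen_extract_gitweb_body and the sample HTML are assumptions for illustration only; real gitweb output is much longer.

```php
<?php
// Hypothetical helper wrapping the same pattern used in the template.
function czen_extract_gitweb_body($html) {
    $pattern = '#<body>(.*)<div class="page_footer">(.*?)/software/(.*?)/software/(.*?)</body>#s';
    if (!preg_match($pattern, $html, $match)) {
        return null; // no <body> or no footer with two /software/ links
    }
    // Rebuild the body, pointing the two footer links at raw gitweb.
    return $match[1] . '<div class="page_footer">' . $match[2] . '/cgi-bin/gitweb.cgi'
         . $match[3] . '/cgi-bin/gitweb.cgi' . $match[4];
}

// Made-up stand-in for gitweb output: one body link and two footer feed links.
$sample = '<html><head><title>Git</title></head><body>'
        . '<a href="/software/?p=demo.git">demo</a>'
        . '<div class="page_footer"><a href="/software/?a=opml">OPML</a> '
        . '<a href="/software/?a=project_index">TXT</a></div></body></html>';

echo czen_extract_gitweb_body($sample), "\n";
```

The link in the main body keeps its “/software/” address, so browsing stays inside the WordPress page, while only the two footer links are switched to “/cgi-bin/gitweb.cgi”. If gitweb's footer ever contained fewer than two such links the pattern would not match at all, which is why the sketch returns null in that case.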

For now it appears to be working well.

I am new to this, so if there are better ways of achieving it I would be interested to hear. Please leave a comment.