Project Home

Nabu: Usage Intructions

Abstract

Instructions for Nabu users. Intended audience: people writing content to be published. This document does not contain information about setting up the server nor about the various kinds of entries.

Contents

What is Nabu?

Nabu is a framework for the publication of documents and automated extraction of semantically meaningful parts of documents to populate a database on a server. It is intended for the users to only have to write simple text files on their local computer disk, and to have a minimum of software dependencies to rely on to be able to publish those incrementally to a server. It provides an easy way to maintain and create a collection of documents.

The extracted data be presented on the server in various ways, by any available web framework of your preference. The goal of this project limits itself to the publishing mechanism, the extraction of meaningful document parts, and to the storage of this data in a generic way.

Creating Files

Create text files as you normally would, with your favourite text editor (we recommend emacs or vi). The location or directory organization of those text files is not very important; make it the most convenient for you, the writer. The server may offer features to let you organize and categorize files in a way that lets it present them coherently (this depends on the code that will present the information and is outside the scope of Nabu).

RestructuredText

The files should be in valid reStructuredText format. You can find out more about RestructuredText at http://docutils.sourceforge.net/rst.html .

Files that do not validate the rst format may not be able to fully populate all the data contained in them to the database. See the section on Dealing with Errors for more details.

Marking Files for Publication

Files that will be published need to be marked with a unique string identifier, so that the source document can be matched against the data entries in the database. This marker will be automatically removed by the publisher before the document is sent for processing to the server.

The default scheme is to add a marker like this, anywhere in the first 2048 characters of the document:

:Id: ffb7d2cd-efce-43da-8aa0-fdb1a104dde6

or:

:Id: branching-instructions

You can also put the string in a comment along with other things, for example, as the first line of a file:

.. -*- coding: utf-8 -*-   :Id: branching-instructions

This string depends on the publisher and you can customize it. There is a regular expression in the publisher options that can be changed to recognize a different pattern for unique id.

Important

The identifiers that you use should be unique in the entire set of files that will be published in the database;

The author likes to use universally unique identifiers (UUID) for that purpose. Those have unique guarantees.

Entry Types

One of the powerful ideas behind this publishing system, is that the publisher handler will automatically recognize certain patterns in the source files and populate a database with those.

The kinds of entries recognized depend on the server configuration. Consult the person who is responsible for that server to find out which text patterns are recognized (there should be an available document for these).

Version Control

If you would like to put the files under version control, you can do that, using whatever system you like, but you really do not have to. Nabu does not force you to use any particular system for safeguarding your files.

Note that if you are putting your files under version control, one convenient arrangement is to allow a commit/checkin server trigger to do the publishing. This way, every time you commit new changes the new extracted items or documents will automatically appear in the database. Also, this has the advantage to work for all the users committing to a repository. See section on Repository Triggers to find out how to set it up.

Publishing Files

Getting the Publisher Client

The Nabu Publisher Client is used to find files that have changed in your local copy and push those files out to the server for processing and extraction. It is meant to be as smart and automatic as possible, to work incrementally, and does not store any files locally.

Furthermore, it is a small program that fits in a single Python file, so it should be easy to install anywhere and should run on any platform that can have a Python interpreter on it. Its only dependencies are the Python interpreter (version >=2.3).

You can fetch the Nabu Publisher Client and save it under the name nabu.

Pointing to a Server

You publish files to a Nabu publishing handler URL, you need to tell nabu what are the connection parameters, the server, and the user and password. This can be setup in your publisher/client configuration file. The publisher will not run without this information.

This file is found under your home directory as ~/.naburc. You need to create it to setup your connection parameters [1].

A typical ~/.naburc file will look something like this:

user = '<username>'
password = '<password>'
server_url = 'http://furius.ca/nabu/cgi-bin/nabu-publish-handler.cgi'
exclude = ['.svn', 'CVS', '*~']
#verbose = 1
[1]Alternatively, you can set the NABURC environment variable to tell the client where to find the configuration file.

Dealing with Errors

Even though the reStructuredText syntax is very simple, the nature of simple text files makes it so that it is possible that the documents that you create contain structural errors, because they contain text that is not valid reStructuredText syntax. When parsing the source documents with docutils, some errors can be output by the parser.

If you are processing the files on the server (the default), and if the server is setup to process the files asynchronously, you will not see those errors immediately. There is a command in the publisher that will fetch the errors from the server and display them. Use that after a while to find out what the processing errors were.

Note

For emacs users: the errors are formatted in a way that it is possible to use the next-error feature for emacs to quickly find them out. You can just run nabu from within emacs if you like.

Triggers

When should you run the publisher?

  • whenever you want to publish, from the command-line;
  • as a hook on a repository server. The repository server could update a checkout on a commit trigger and then publish to the Nabu server from that;
  • you could run it from emacs;
  • you could even bind it to an emacs save hook;

Repository Triggers

It can be nice to setup the publisher to run automatically everytime files are committed to a source code management repository. This can be most easily setup by adding a trigger script on the repository host, by keeping a checkedout copy updated and running the publisher client on that copy [2]. The rest of this section contains recipes to accomplish this using various SCM systems.

[2]Note that the Nabu server does not have to be the same host as the repository server.

Server Triggers using CVS

To implement triggers using CVS, edit the CVSROOT/loginfo file and add a line like this:

^project1  (date; cat; (cd /some/dir/checkouts/project1;
            cvs -q update -d; nabu; echo DONE) &)
            >> $CVSROOT/CVSROOT/nabu.log 2>&1

Where you would replace project1 by your own directory. Make sure that the repository will run with the appropriate Nabu configuration.

Server Triggers using Subversion

To implement triggers using Subversion, create a post-commit script which updates an existing checkout of your body of text files and then simply runs the nabu publisher client on it, something like this:

REPOS="$1"
REV="$2"
DIRS=$(/usr/bin/svnlook dirs-changed "$REPOS")
LOGFILE="/home/blais/tmp/copies/log"

for d in $DIRS; do
    if [ "$d" = "notes/" ]; then
        cd /home/blais/tmp/copies/$d
        svn update >>$LOGFILE 2>&1
        /usr/bin/nabu -v >>$LOGFILE 2>&1 &
    fi
done

Users Policy

Right now, the way the Nabu servers are implemented, you can see, dump and overwrite the contents of other users [3].

[3]It would be possible to implement per-user document sets by combining the unique id with the username to form the new unique id. This requires some changes on the server.