Rhizome Manual
- Overview
- Requirements
- Installation
- Running
- Using Rhizome
- Access Control
- Security
- Rhizome Customization
- Importing content and schema migration
- Exporting and static websites
Overview
Rhizome is a Wiki-like content management and delivery system that exposes the entire site -- content, structure, and metadata -- as editable RDF. This means that instead of just creating a site with URLs that correspond to a page of HTML, with Rhizome you can create URLs that represent just about anything, such as:
- structural components of content (such as a bullet point or a definition)
- abstract entities that can be presented in different ways depending on the context
- relationships between entities or content, such as annotations or categories
Rhizome is designed to enable non-technical users to create these representations in an easy, ad-hoc manner. To this end, it includes a text formatting language which is similar to a Wiki's but lets you author arbitrary XML content and RDF metadata. For developers, this allows both content and structure to be easily repurposed and complex web applications to be rapidly developed.
The long-term vision is that each Rhizome site will intertwine together, forming an emergent fuzzy taxonomy over a peer-to-peer network.
The nearer-term goals of Rhizome are:
- To allow (relatively) non-technical folk to create "Semantic web"-enabled web sites
- To provide a platform for the rapid development of web applications
- To provide a test-bed for experimenting with new forms of collaborative knowledge production and communication
- To serve as a showcase and test-bed for its underlying technologies: Rx4RDF, ZML, and Raccoon
Wiki ease
- All the functionality of a Wiki: public and automatic creation of pages and links using simple text formatting rules
- But the same formatting rules can be used to author arbitrary HTML, XML, and RDF
- And you can create and edit not just content but also the metadata, site structure and appearance, even application behavior
- Including dynamic pages: supports XSLT, RxSLT, Python, RxUpdate
- Thus enabling dynamic, rule, and context based presentation of content
Advanced Content Management functionality
- pages can consist of any content: xml, html, binary, etc.
- content, metadata, and site structure are stored as RDF: enabling them to be repurposable, human editable, application agnostic, etc.
- flexible authorization and security model
- staging/release workflow
- native versioning of content and metadata, conflict detection
- supports local file system or browser-based development
- can generate static websites
- import/export of content and metadata
- flexible backend: supports multiple RDF engines (Redland, 4Suite) with multiple datastores: file based, SQL databases, embeddable databases (Sleepycat, Metakit)
Key terms and concepts
Below are some terms (in alphabetical order) that are used throughout this manual.
- item
- Somewhat informally, a resource that is content, specifically that of type NamedContent.
- model
- The collection of resources and their properties that make up an instance of Rhizome. Generally refers to its logical representation, while store refers to its physical location.
- object
- In RDF, the value of a property
- predicate
- In RDF, the name of a property
- property
- A property consists of a name (a URI) and a value. Resources have properties.
- resource
- A resource is an abstract name for something (anything). Rhizome treats everything as a resource with a set of properties and every resource is named with a URI.
- Raccoon
- The application server that Rhizome runs on.
- RDF
- Resource Description Framework -- the "native format" of Rhizome
- RxML
- An alternative syntax for RDF which is designed to be usable even by those unfamiliar with RDF.
- store
- The physical location of the model. By default this will be a text file, but you can configure Rhizome to use a variety of databases.
- subject
- In RDF, the resource that a property applies to.
- URI
- Uniform Resource Identifier, aka URL (except it doesn't have to work in your browser).
- WikiName
- A name of a resource that is unique to the site and so can be used to link to that resource.
- ZML
- An alternative syntax for XML that is designed to be relatively painless to author; similar to Wiki text formatting rules.
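As a rough illustration of how the RDF terms above fit together, a statement can be pictured as a (subject, predicate, object) tuple, and a model as a collection of such statements. The sketch below is plain Python, not Rhizome's API: the resource URI, the property values, and the helper name properties_of are made up for illustration.

```python
# Illustrative only: a tiny in-memory "model" held as a list of RDF statements.
WIKI = 'http://rx4rdf.sf.net/ns/wiki#'
PAGE = 'http://www.example.com/HomePage'  # hypothetical resource URI

model = [
    # (subject, predicate,      object)
    (PAGE, WIKI + 'name', 'HomePage'),
    (PAGE, WIKI + 'alias', 'index'),
]


def properties_of(model, subject):
    """Return the (predicate, object) pairs that apply to one resource."""
    return [(p, o) for (s, p, o) in model if s == subject]
```

Here the resource (the subject) has two properties; each property's name is the predicate and its value is the object.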
Requirements
Rhizome requires Python 2.2 or later (2.4 recommended) and the 4Suite XML and RDF libraries (at http://4Suite.org, version 1.0a1 or later).
Rhizome should work on any platform that supports Python and 4Suite. See Download for details on supported platforms.
Optional Packages:
If Lupy is installed (http://www.divmod.org/Home/Projects/Lupy), Rhizome will perform full-text indexing of content (requires Python 2.3).
Redland RDF or RDFLib data stores can be used if Redland (http://www.redland.opensource.ac.uk) or RDFLib (http://rdflib.net) are installed. See the Raccoon manual for more information.
Installation
This is a standard Python source distribution. To install:
- Unzip rx4rdf.zip or rx4rdf.tar.gz
- Run python <unzip dir>/setup.py install
This installs:
- a package named "rx" in the Python site-packages directory
- shell scripts or .bat files for running Raccoon and ZML in the Python scripts directory
- a directory named "rx4rdf" containing documentation, Rhizome pages, and other ancillary files in the Python share directory.
Running
Rhizome is an application that runs on Raccoon, a simple application server. Rhizome consists entirely of a Raccoon config file and a bunch of web pages. To run Rhizome, run Raccoon specifying your Rhizome config file using the -a option, e.g.:
- cd <your application's home dir>
- <python script dir>/run-raccoon -a <python share dir>/rx4rdf/rhizome/rhizome-config.py
This will launch Raccoon's built-in http server, which runs on port 8000 by default. You can change this by editing server.cfg (see the Raccoon Manual for more information).
That's it! You now have a new instance of Rhizome running. By default, the site will be stored in the current directory in a file called "wikistore.nt" and content will be stored in a subdirectory called "content". Most of Rhizome's behavior is defined in its pages, so you can start customizing your Rhizome instance by editing its pages and metadata from within Rhizome.
Initial Configuration
However, there are a few additional things you might want to configure first if you're deploying a live website. First, you should create your own config file to separate your config settings from the stock Rhizome config settings. The easiest way to do this is to copy the files in the blank directory to the directory where you want your site to live and then edit your copy of blank-config.py. Look at the two most important settings in that file:
BASE_MODEL_URI='http://www.example.com/'
__include__('../rhizome/rhizome-config.py')
The first line sets the base URI that will be used for creating RDF resources. Note that this does not have to correspond to any resolvable URL. The second includes the Rhizome config file.
Next, you probably want to set the administrator password and the secure hash seed. Rhizome creates a default administrator super-user that has full access to the system, and its login name and password both default to 'admin'. To set your own password, in your config file set ADMIN_PASSWORD before including the Rhizome config. If you don't want to store the password in cleartext in your config file, you can set ADMIN_PASSWORD_HASH instead, which requires a SHA1 digest of the concatenation of the password and the secure hash seed. In Python, you can calculate it like this: "import sha; sha.sha( 'yourpassword' + 'your secure hash seed').hexdigest()"
As illustrated above, the secure hash seed is used to generate a digest that can be publicly displayed. To set your own private seed, set the SECURE_HASH_SEED setting in your config file before including the Rhizome config. If you don't set your own private seed, or if it is compromised, it will be much easier to mount a dictionary attack on the password hashes. Note that if you change its value, all previously generated password hashes stored in your model will no longer work.
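The sha module used in the one-liner above exists only in older Python 2 releases. On a newer interpreter the same digest can be computed with the standard hashlib module. This is a minimal sketch, not part of Rhizome: the function name password_hash is hypothetical, and encoding the strings as UTF-8 is an assumption (for plain ASCII passwords and seeds it yields the same bytes as the original one-liner).

```python
import hashlib


def password_hash(password, secure_hash_seed):
    """SHA1 hex digest of the password concatenated with the secure hash seed."""
    # Assumption: both strings are encoded as UTF-8 before hashing.
    data = (password + secure_hash_seed).encode('utf-8')
    return hashlib.sha1(data).hexdigest()


# Prints a 40-character hex string that could be assigned to ADMIN_PASSWORD_HASH.
print(password_hash('yourpassword', 'your secure hash seed'))
```

Because the seed is mixed into every digest, the value must be recomputed whenever SECURE_HASH_SEED changes.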
Text Indexing
Rhizome has preliminary support for text indexing. If Lupy is installed Rhizome will index content when pages are saved (currently, only the contents and the title). On start-up, Rhizome checks if the directory specified by the INDEX_DIR config setting exists (the default is 'contentindex'); if it doesn't then all the pages will be indexed. Thus, to regenerate the index delete the directory and restart Rhizome.
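The start-up check described above amounts to testing whether the index directory exists. The following sketch only mirrors that logic and is not Rhizome's own code; the function names are hypothetical, and 'contentindex' is the default INDEX_DIR value mentioned above.

```python
import os
import shutil


def needs_full_reindex(index_dir='contentindex'):
    """True when the index directory is missing, so every page must be indexed."""
    return not os.path.isdir(index_dir)


def force_reindex(index_dir='contentindex'):
    """Delete the index directory so the next start-up rebuilds the index."""
    shutil.rmtree(index_dir, ignore_errors=True)
```

Calling force_reindex and then restarting corresponds to the "delete the directory and restart Rhizome" procedure.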
Development Configuration
While developing an application on top of Rhizome it is often useful to run it with a more developer-friendly configuration -- for example, to enable Rhizome to immediately pick up changes to files modified by external programs (such as a text editor). This is illustrated in the configuration file debug-confg.py included in the distribution:
__include__('site-config.py')
#have Raccoon also check if the underlying files have changed
LIVE_ENVIRONMENT=1
#disable Python content authorization
authorizeContentProcessors[
'http://rx4rdf.sf.net/ns/wiki#item-format-python'] = lambda *args: 1
More Configuration
For information on configuring the Raccoon application server, for example, to run it behind an Apache web server, see the Raccoon Manual. For complete documentation on all the settings available in your config file see this sample config file. Other common configuration tasks include choosing a model store, setting the PATH to search for files, and setting the application's base URL.
Using Rhizome
Creating and Editing Resources
You can create and edit any type of resource in Rhizome, but it is primarily focused on content item resources (aka plain ol' pages), which are internally called NamedContent. If you click on the "New" or (most of the time) the "Edit" link on the default template you will see the edit page specific to this type of resource. To edit or create other types of resources, go to the administration page (the "Admin" link on the default template). There you will be able to view or manage any type of resource in the system.
Editing Content Items
Below is a guide to the form controls found on the edit and new pages:
- Name
- When creating content for the first time the Name edit box will be shown. Enter the name of the content here -- it should be a name unique to the site. If the name contains slashes (e.g. 'foo/bar') the content will be placed in a folder based on the name, creating the folder structure if necessary (e.g. the content will be put in the folder named 'foo' and given the name 'bar'). If the "Anonymous" option is selected you do not need to give the content a name (internally it will be given a name based on the current time and date). If "Anonymous" is selected and there is a name entered in the "Name" edit box, then the name will be treated as the folder path in which to place the anonymous content (creating the folder structure if necessary).
- Title
- Enter the title for the content (optional). Unlike the name, it can be changed at any time and can contain any type of character.
- Content
- You create the item's content either by entering it in the editbox or by uploading a file.
- Source Format
- This dropdown specifies the format of the content you are entering and drives how Rhizome will process the content. The following list describes how Rhizome will handle the different choices:
- ZML
- ZML (the default choice) is a text format similar to the ones found in Wikis except that you can use it to author arbitrary HTML or XML.
- Binary
- Binary content is content such as an image or multimedia file. It is never transformed or manipulated by Rhizome. Generally you'll want to set the Item Type (see below) to "Page".
- HTML/XML
- HTML and XML text will have any link that starts with 'site:' replaced with a live URL depending on the request context. Using 'site:' links allows links on a page to work even if the page moves to another directory or is transformed in other contexts (such as being statically exported). The 'site:' URL scheme is similar to the 'file' scheme: URLs that start with 'site:///' will be relative to the root of where the Rhizome site is running from, while URLs that start only with 'site:' will be relative to the current location of the page.
- Python
- The content is treated as Python code that is executed when the page is requested. Anything the code writes to stdout is captured and sent as the response. By default, the administrator must explicitly authorize each page of Python code or an Unauthorized error will occur when attempting to view the page.
- RxSLT
- RxSLT is an XSLT stylesheet that can create HTML or XML output using the site's model as the stylesheet's source.
- RxUpdate
- RxUpdate is an (enhanced) XUpdate document that is used to update the site's model. RxUpdate pages are typically used as form handlers (for example, the 'save' page) and generally you'll want to set the Item Type to 'Handler'.
- Text
- Plain text. When included in HTML (for example, by the "Entry" template), it will be escaped.
- XSLT
- An XSLT stylesheet. It uses the current content of the request as the document source, so you'll need to set this page up as a template by adding a "wiki:handles-doctype" property. You'll also probably want to set the Item Type to "Template".
- NTriples
- NTriples is a plain text (as opposed to XML) based format for RDF.
- Turtle
- Turtle (Terse RDF Triple Language) (only if the Redland package is installed) is another plain text format for RDF.
- Minor Edit
- When you check this option you mark this revision of the content as being a minor edit. If you also edited the last revision of this item, then the current revision will replace the previous one. (This prevents cluttering the site up with revisions made just to fix typos, etc.)
- Keywords
- A comma- or space-separated list of keywords. Keywords provide a simple way to associate arbitrary metadata with the content item. Each keyword will be associated with the content item via the wiki:about property. You can browse pages in the system grouped by keywords by choosing Browse in the page footer.
- Item Type
- The Item Type (internally called the Item Disposition) specifies how this content item should be used. It is used to select a template stylesheet that will be applied to the content. The following are included by default:
- Page
- This means that content is a stand-alone page and no template should be applied to it. For example, you probably want to set this for images.
- Entry
- (the default). This frames the content in the default site-template.
- Handler
- The content is used to update the site or handle a POST from a form page. Typically used for RxUpdate or Python pages that might not create any response output and for pages that handle write actions for other types of resources. The default handler template just prints the message "Completed <action name> of <resource name>!"
- RxML Template
- Indicates the content is text containing RxML that when invoked will bring up an edit metadata page containing the RxML. When saved, Rhizome will treat all the properties as new and replace any anonymous (aka blank) resources (those whose name starts with 'bnode:') with new anonymous resources. Using RxML templates provides a quick and dirty way of adding new types of resources to the system without having to build a form UI for creating and editing them.
- Template
- The content will be used as a template (typically for XSLT content). Currently, this just has the effect of stopping any more templates from being applied to the content.
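The 'site:' link handling described under HTML/XML above can be sketched roughly as follows. This is not Rhizome's implementation: the function name resolve_site_link and its parameters are hypothetical, the example URLs are made up, and the sketch assumes the site root URL ends with a slash.

```python
from urllib.parse import urljoin


def resolve_site_link(link, site_root, page_url):
    """Turn a 'site:' link into a live URL; other links are returned unchanged."""
    if not link.startswith('site:'):
        return link
    if link.startswith('site:///'):
        # Relative to the root of where the site is running from.
        return urljoin(site_root, link[len('site:///'):])
    # Relative to the current location of the page.
    return urljoin(page_url, link[len('site:'):])


root = 'http://www.example.com/wiki/'
page = 'http://www.example.com/wiki/docs/intro'
print(resolve_site_link('site:///images/logo.png', root, page))
print(resolve_site_link('site:setup', root, page))
```

Because the prefix is resolved at request time, the same stored content yields working links whether it is served from a different directory or exported statically.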
Advanced Options. Pressing the "More" button will reveal the following options:
- Sharing
- This dropdown lets you choose how the content can be accessed. The choices depend on the rights the user has. The Guest user (the default if you are not logged in) only has one choice: "Public", which means anyone can view, edit, or delete the content. If you sign in as a user, you'll see choices that allow you to make content only readable and writable by you (and administrators), or publicly readable but only privately modifiable. You can also set the content to be readable and/or writable only by members of the user groups you belong to.
- Label
- This dropdown shows all the labels in the system that you can choose to apply to this revision of the content. The default configuration of Rhizome has two labels, "Draft" and "Released". To have no label associated, choose the blank option. When viewing content, the last revision with the "Released" label will be shown. If there is no revision labeled "Released" the last revision not labeled "Draft" is chosen. This behavior can be overridden by adding a label or revision parameter to the URL. By default, only administrators can set the "Released" label on a revision. Setting this enables a review process since all future revisions will need to be marked "Released" by an administrator before they can be viewed. Non-administrators can prevent content they are working on from being displayed by marking it "Draft".
- Shred
- This checkbox (default: on) indicates whether or not Rhizome should try to extract metadata from the content when saving it.
- Change Comment
- An optional comment to be associated with your revision.
Assigning metadata and properties to the content
After you save your changes, you can add or modify arbitrary properties and metadata for the content item and its latest revision. To do so, after saving choose the metadata link, which will display the metadata view of the content. Then choose the "Edit Metadata" link at the top of the metadata view page.
The edit metadata page presents the content's resources as editable RxML. You will also see links that let you edit the metadata in two other RDF formats, RDF/XML and NTriples. For an example, click here to edit the metadata of this page.
In addition to adding arbitrary metadata there are a few properties that affect Rhizome's behavior that are not managed by the edit page. Some of these include:
- wiki:alias
- This property can be assigned to any resource that uses the wiki:name property and lets you give a resource an alternative name. Its value is identical to the value of a wiki:name: it may be a hierarchical path that corresponds to its folder structure and the name should be unique across the site.
- wiki:handles-disposition
- When creating a template, use this property to indicate that content items that have the specified item disposition should have this template applied to them. Its value is an Item Disposition resource.
- wiki:handles-doctype
- When creating a template, use this property to indicate that content items that have the specified document type should have this template applied to them. Its value is a DocType resource.
- wiki:handles-action
- This property, along with wiki:action-for-type, indicates that this resource should be used to handle the specified action.
- wiki:action-for-type
- Used alongside wiki:handles-action to specify the types of resources this action handler should apply to. Its value should be a resource type or rdfs:Resource (action handlers with the latter have lower priority than ones that specify a specific type).
Creating and Editing Other Resources
You can browse and edit all the types of resources in the system from the administration page (click on the "Admin" link). Unless a resource type has an edit handler associated with it the default edit handler will be used, which lets you directly edit the resource as RxML (and other RDF formats). You can designate a resource as an edit handler for a given type of resource by assigning it the wiki:handles-action and wiki:action-for-type properties, as described above.
There are a few useful properties that can be assigned to any resource:
- wiki:name
- Any resource that has this property can be addressed with a URL, just like the item content name described above. Names should be unique across the site, with the exception that you can also have a content item with the same name (in which case the content item will be invoked instead).
- wiki:alias
- See above.
- rdfs:label
- Many of Rhizome's pages check for this property when trying to display a human-readable name for a resource.
When you request a URL from Rhizome, it translates the URL to an internal resource, applies an action to the resource, and returns the result of that action to your browser. The default action is 'view' but you can specify other actions by adding an 'action' parameter to a URL (or, when creating an HTML form, by adding an 'action' form variable) -- for example, look at the links such as 'Edit' or 'Revisions' at the bottom of the default template.
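To make the 'action' parameter concrete, here is a small sketch of how an action name could be read from a request URL, falling back to 'view'. It only illustrates the convention described above using Python's standard library; the function name action_for and the example URLs are hypothetical and this is not how Raccoon itself parses requests.

```python
from urllib.parse import parse_qs, urlsplit


def action_for(url, default='view'):
    """Return the value of the URL's 'action' parameter, or the default action."""
    query = parse_qs(urlsplit(url).query)
    return query.get('action', [default])[0]


print(action_for('http://www.example.com/HomePage'))
print(action_for('http://www.example.com/HomePage?action=edit'))
```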
URLs are mapped to resources through a series of steps:
- Find the source resource. Rhizome applies the following queries, stopping at the first match:
- If the URL has an 'about' parameter, find the resource whose URI matches the parameter's value.
- Look for a resource that has a wiki:name property that matches the URL's path.
- Look for a resource that has a wiki:alias property that matches the URL's path.
- Look in the directories in Rhizome's PATH setting for a file that matches the URL's path. If one is found, none of the following steps apply and the contents of the file are returned immediately.
- Look for a resource that has a wiki:name property named '_not_found'. By default, this will be a page that prompts you to create a new page using the missing name.
- See if the user is authorized to perform the action associated with this URL on the selected resource. If not, set the resource to auth:Unauthorized.
- Look for a resource that can handle the action associated with this URL when applied to the selected resource.
- If the action is 'view' and resource is of type 'NamedContent', choose the same resource.
- Search for a resource with a wiki:handles-action property equal to the action and a wiki:action-for-type property equal to the type of the current resource or 'rdfs:Resource'.
- Look for a resource that has a wiki:name property named 'default-resource-viewer'. The default view displays the resource as RDF (in RxML or other RDF formats).
- At this point the current resource should always be of type NamedContent. Now find a revision associated with that resource (through the wiki:revisions property) that contains the resource's content:
- If the URL had a revision parameter, choose the revision that matches that number
- If the URL had a label parameter, choose the last revision that has a matching label
- Choose the last revision that has a label with the wiki:is-released property
- Choose the last revision
- Now we retrieve the content associated with the revision (which can be either stored in external files or within the model, depending on how you configured Rhizome) and process it based on the content transform metadata associated with it (this is set by the 'Source Format' dropdown on the edit page).
- At this point we may be ready to return the contents to the browser or we might want to apply a template to the results. To find out we look for a template resource that first matches the following queries:
- Search for a resource with a wiki:handles-doctype property equal to the URL's '_doctype' parameter, if present, or the revision's wiki:doctype property. This allows for stylesheets that can convert content from one doctype to another.
- Search for a resource with a wiki:handles-disposition property equal to the URL's '_disposition' parameter, if present, or the revision's wiki:item-disposition property. Item dispositions correspond to the "Item Type" drop-down on the Edit page and allow template stylesheets to control how content is rendered.
- If we do find a template resource, we stuff the content of the current request into an XPath variable called '$_contents' and then, starting with the fourth step, invoke the above steps again using the template resource. Note that this means that another template can be invoked on the results of this template resource, and so on.
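The revision-selection rules in the list above (revision parameter, then label parameter, then the last released revision, then simply the last revision) can be sketched as a chain of fallbacks. This is an illustration only: revisions are modeled as plain dictionaries ordered oldest to newest, labels carrying wiki:is-released are modeled as a tuple of label names, and falling through to the next rule when a requested revision or label is not found is an assumption, not documented behavior.

```python
def choose_revision(revisions, revision=None, label=None,
                    released_labels=('Released',)):
    """Pick a revision dict from a list ordered oldest to newest."""
    rules = []
    if revision is not None:
        # The URL had a revision parameter: match on the revision number.
        rules.append(lambda r: r['number'] == revision)
    if label is not None:
        # The URL had a label parameter: match on the label.
        rules.append(lambda r: r['label'] == label)
    # Otherwise prefer revisions whose label is marked as released ...
    rules.append(lambda r: r['label'] in released_labels)
    # ... and finally accept any revision.
    rules.append(lambda r: True)

    for rule in rules:
        matches = [r for r in revisions if rule(r)]
        if matches:
            return matches[-1]  # the last (newest) match wins
    return None


revisions = [
    {'number': 1, 'label': 'Released'},
    {'number': 2, 'label': None},
    {'number': 3, 'label': 'Draft'},
]
print(choose_revision(revisions)['number'])
```

With the sample data, an unqualified request selects revision 1 because it is the newest revision carrying a released label, even though two later revisions exist.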
Folders
Hierarchical folder structures can be created in Rhizome simply by using names that contain "/"s to delimit the path. When a resource with such a name is saved, the resource will be associated with the parent folder resource using the wiki:has-child property, and if the folder resource doesn't exist it will be created. The default view handler for a folder resource lists all the child resources of that folder.
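As an illustration of how a slash-delimited name maps onto folders, the sketch below splits a name into the folder paths that would need to exist and the final name, and lists the parent/child pairs that wiki:has-child links would connect. The function names are hypothetical, and whether Rhizome also links top-level folders to a root folder is not covered here.

```python
def split_name(name):
    """Split 'a/b/c' into (['a', 'a/b'], 'c'): folder paths and the final name."""
    parts = [part for part in name.split('/') if part]
    folders = ['/'.join(parts[:i]) for i in range(1, len(parts))]
    return folders, parts[-1]


def has_child_pairs(name):
    """Return (parent path, child path) pairs along the name's folder chain."""
    folders, _ = split_name(name)
    full_path = '/'.join(part for part in name.split('/') if part)
    paths = folders + [full_path]
    return list(zip(paths, paths[1:]))


print(split_name('foo/bar'))
print(has_child_pairs('a/b/c'))
```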
RSS feeds
You can create an RSS feed (currently only RSS 2.0) on nearly anything in the system by doing a search with the RSS view option. The URL for the search results can be used as the RSS URL. For example, for an RSS feed of the most recently changed pages, click on the "Recent" link, change the view to RSS, and then press the search button. Identical results need not be repeatedly sent, as Raccoon always generates an ETag for tools that support If-None-Match HTTP headers.
Extracting metadata (shredding and GRDDL)
When you save content Rhizome analyzes the content for RDF metadata and saves it in its store. This is referred to as "shredding" the content. Rhizome includes the shredders discussed below, and more shredders can be added by setting the shredders config variable to associate new shredding ContentProcessors with different content types. Rhizome keeps track of the relationship between shredded metadata and its content source -- and when content is deleted or changed, the old extracted metadata is removed from the store. The default shredders behave as follows:
- If the content is RDF (supported formats include RDF/XML, RxML, Turtle, and NTriples) the RDF is parsed and added to the store.
- Otherwise, if the content is ZML, XML or XHTML Rhizome tries to shred the content based on the XML elements it encounters using the xml-shred.xsl stylesheet. It currently does the following:
- Supports GRDDL (Gleaning Resource Descriptions from Dialects of Languages): If XML contains dataview:transformation attributes or XHTML contains a GRDDL profile in its head element, those GRDDL stylesheets will be invoked and their results added to the store. In addition, Rhizome will search the store for any GRDDL stylesheets associated with a document namespace via the dataview:namespaceTransformation property, and invoke that stylesheet.
- For XML that doesn't use namespaces, it will try to figure out which XML vocabulary is used and associate the content with a wiki:DocType resource. A DocType specifies what type of XML the document is. It is used to select a stylesheet for transforming the XML to HTML. By default, Rhizome supports DocBook and several Apache Forrest schemas. If a DocType is found, the shredder will also check if there is a GRDDL stylesheet associated with the DocType via the (non-standard) dataview:doctypeTransformation property. (Currently Rhizome only provides a shredder for the Apache Forrest FAQ DocType.)
- Searches for html:a (and similar) elements to enable Rhizome to keep track of a page's outgoing links (including whether or not the target of an internal link exists).
Access Control
Access control (or authorization) provides control over which users can access or modify particular resources and functionality. This is particularly important with Rhizome because most of its structure and behavior is editable by users; indeed, the user can write pages in Python and execute arbitrary code. However, in many applications you don't want to have to think about access control, so Rhizome provides an access control model that protects the core structure by default but doesn't require extra work if you're not interested in extra control.
The basic notion is an Access Token that can be attached to (guard) any resource. Access tokens have one or more permissions associated with them, each of which grant the right to perform a specific action. A user that has rights to a particular Access Token can perform the granted actions on all the resources that the Access Token guards. In addition, Access Tokens have a priority property, enabling more important Access Tokens to override less important ones.
One benefit of this approach is how well it scales with the complexity of your authorization scheme. If you don't care about authorization at all you can just ignore it -- it only has an effect when you attach access tokens to guard resources. But you can create complicated authorization schemes by specifying your own authorization RxPath expressions by setting the authorization expression variables found in the config file, enabling you to create complex path traversals (e.g. inheritance, hierarchical groups, etc.) by redefining how a resource finds the Access Token that guards it or which Access Token a user has rights to.
With Rhizome's default authorization expression, users have rights to a token either directly via the auth:has-rights-to property or as a member of a Role (via auth:has-role) which in turn has rights to tokens via auth:has-rights-to. Similarly, resources are guarded by access tokens either directly via the auth:guarded-by property or indirectly by the access tokens that guard their class resources.
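The default lookup described above can be pictured with plain dictionaries standing in for the auth:has-rights-to, auth:has-role, and auth:guarded-by properties. This is a deliberately simplified sketch, not Rhizome's RxPath expressions: it ignores token priority and the super-user role, and the rule that an action is only restricted when some guarding token carries a permission for it is an assumption. All names and sample data are hypothetical.

```python
def tokens_held(account, model):
    """Tokens an account has rights to, directly or through its roles."""
    tokens = set(model['has_rights_to'].get(account, ()))
    for role in model['has_role'].get(account, ()):
        tokens |= set(model['has_rights_to'].get(role, ()))
    return tokens


def tokens_guarding(resource, model):
    """Tokens guarding a resource, directly or through its classes."""
    tokens = set(model['guarded_by'].get(resource, ()))
    for cls in model['classes_of'].get(resource, ()):
        tokens |= set(model['guarded_by'].get(cls, ()))
    return tokens


def is_authorized(account, action, resource, model):
    """Simplified check: ignores token priority and the super-user role."""
    relevant = {token for token in tokens_guarding(resource, model)
                if action in model['permissions'].get(token, ())}
    if not relevant:
        return True  # nothing guards this action on this resource
    return bool(relevant & tokens_held(account, model))


model = {
    'has_rights_to': {'alice': {'alice-private'}, 'editors': {'staff-token'}},
    'has_role': {'alice': {'editors'}},
    'guarded_by': {'page1': {'alice-private'}, 'HandlerClass': {'staff-token'}},
    'classes_of': {'page2': {'HandlerClass'}},
    'permissions': {'alice-private': {'edit'}, 'staff-token': {'edit', 'view'}},
}
print(is_authorized('alice', 'edit', 'page1', model))
print(is_authorized('guest', 'edit', 'page1', model))
```

In the sample data, 'page2' is guarded only through its class, and 'alice' gains access to it only through her role, which exercises both indirect paths.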
The basic subject (or principal) of access control in Rhizome is represented by the foaf:OnlineAccount class; accounts can be members of a Role, which gives the account rights to all the AccessTokens that the role has rights to. Accounts are almost always associated with a user (via the foaf:holdsAccount property); and users are represented by the foaf:Person class. However, Rhizome has two built-in accounts that are not associated with any particular user: The 'admin' account is assigned the 'super-user' role, which is a special role that has full access to the system regardless of access tokens. The 'guest' account is a member of the 'guest' role; anyone not logged in is treated as a 'guest'. By default, the 'guest' account can only perform actions that don't require any access token.
The Create/Edit User page allows you to choose which roles the user belongs to, but the UI for this is only displayed when the current user (the one viewing the page) has the right to assign roles (this is determined by the auth:can-assign-role property), and only those roles that user can assign will be displayed.
When a new user is created, several new access tokens are created for the user, allowing the user to set different levels of privacy for the content she edits (to learn more, take a look at signup-handler.xml). When you create or edit a page you'll notice a dropdown box labeled "Sharing". This drop-down contains all the access tokens the user has rights to (if you're not logged in, you'll just see the default "Public Read/Write" option because the guest user doesn't have rights to any access tokens). When you save the page it will now require your chosen access token.
Notice that two of the "Sharing" options are labeled "Group" and "Group Write/Public Read" -- choosing one of these access tokens limits access to only those users who share membership in any of the User Groups you are a member of. A user group is a subclass of role which has the user's group access tokens added to the role when the user joins. This way any other user that shares membership with that user's user groups can access resources that are guarded by one of that user's group access tokens.
Fine-grained Authorization
In addition to permissions that correspond to the type of action associated with a request ("view", "edit", "save", etc.), access tokens can have permissions that exactly specify which properties can be added or removed from a given resource. The permissions auth:permission-add-statement and auth:permission-remove-statement control whether a resource can have properties added or removed. If they are modified by the presence of an auth:with-property property then they only apply to the properties that are the value of the auth:with-property property. You can further refine this with the addition of the auth:with-value property, which limits the access control to only those properties with that specific value. For example, the default authorization schema (found in the rhizome config variable authStructure) creates an access token to control who can add an item format with the Python source format.
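The narrowing effect of auth:with-property and auth:with-value can be sketched as a matching function over a proposed statement change. This is an illustrative model only: permissions are plain dictionaries, the function name permission_applies is hypothetical, and the property URI used for the item format is a made-up placeholder (only the Python item-format URI appears elsewhere in this manual).

```python
WIKI = 'http://rx4rdf.sf.net/ns/wiki#'
FORMAT_PROPERTY = WIKI + 'item-format'        # hypothetical property name
PYTHON_FORMAT = WIKI + 'item-format-python'


def permission_applies(permission, operation, predicate, value):
    """Does this add/remove-statement permission cover the proposed change?"""
    if permission['kind'] != operation:  # 'add' or 'remove'
        return False
    properties = permission.get('with_property')
    if properties is not None and predicate not in properties:
        return False  # restricted to other properties
    if 'with_value' in permission and permission['with_value'] != value:
        return False  # restricted to a different value
    return True


# Modeled on the example above: controls adding the Python source format.
python_format_permission = {
    'kind': 'add',
    'with_property': {FORMAT_PROPERTY},
    'with_value': PYTHON_FORMAT,
}
print(permission_applies(python_format_permission, 'add',
                         FORMAT_PROPERTY, PYTHON_FORMAT))
```

A permission without with_property or with_value applies to every statement of its kind, which matches the unmodified behavior described above.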
In addition, Rhizome provides a simple access control scheme for modifying request metadata via the assign-metadata and remove-metadata XPath extension functions: any variable whose name starts with two leading underscores ("__") is considered read-only and cannot be assigned or removed. Similarly, external requests (e.g. form variables) cannot contain variables that start with two underscores (this is to prevent external variables from overriding the protected variables).
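The double-underscore rule amounts to a simple name check. The sketch below is a Python analogue, not the XPath extension functions themselves; the function names and the "__user" variable are made up, and raising an error for offending external variables is an assumption (the manual does not say whether they are rejected or silently dropped).

```python
PROTECTED_PREFIX = "__"


def is_protected(name):
    return name.startswith(PROTECTED_PREFIX)


def assign_metadata(metadata, name, value):
    # Protected variables are read-only.
    if is_protected(name):
        raise PermissionError("%r is read-only" % name)
    metadata[name] = value


def remove_metadata(metadata, name):
    if is_protected(name):
        raise PermissionError("%r cannot be removed" % name)
    metadata.pop(name, None)


def check_external_vars(form_vars):
    # External input may not carry protected names at all.
    bad = sorted(n for n in form_vars if is_protected(n))
    if bad:
        raise ValueError("protected names in request: " + ", ".join(bad))
    return dict(form_vars)


metadata = {"__user": "guest"}             # hypothetical protected variable
assign_metadata(metadata, "title", "Home")
print(metadata)
```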
Security
Besides the access control functionality described above, Rhizome has the following security features:
Spam control
If a user hasn't been granted access to the create-nospam-token access token, Rhizome will subject the user to the following two spam control checks:
- Before saving content (e.g. pages or comments) Rhizome will check the Akismet service to see if the content is spam. This functionality is enabled when you set the akismetKey and akismetUrl config variables (where akismetKey can be a free WordPress API key).
- When generating HTML from content created by users without the create-nospam-token access token, Rhizome will add the attribute "rel=nofollow" to all links (see http://microformats.org/wiki/rel-nofollow).
create-nospam-token is granted to the default user role, so only the Guest user will be subjected to these spam checks. For more stringent spam control, remove this token from the default user role.
Markup sanitization
Rhizome will sanitize HTML and XML generated by content created by any user that hasn't been granted access to the create-unsanitary-content-token Access Token. By default, only the administrator role is granted this token. Rhizome sanitizes markup using a black list as opposed to a white list -- this approach is necessary given its goal to allow users to create arbitrary markup. The default black list is designed to strip out any potentially unsafe markup for all mainstream browsers but the list can be modified through the blacklistedElements, blacklistedAttributes, and blacklistedContent config variables.
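As a rough illustration of black-list sanitization, the sketch below strips listed elements and attributes from well-formed markup using Python's standard library. It is not Rhizome's sanitizer: the two lists are tiny invented samples rather than the default blacklistedElements and blacklistedAttributes values, and it ignores the content black list entirely.

```python
import xml.etree.ElementTree as ET

# Invented sample lists, far shorter than a real black list would be.
BLACKLISTED_ELEMENTS = {"script", "iframe"}
BLACKLISTED_ATTRIBUTES = {"onclick", "onload"}


def sanitize(markup):
    """Drop black-listed elements and attributes from well-formed markup.

    Limitations of this sketch: the root element itself is never removed,
    and text that directly follows a removed element is lost with it.
    """
    root = ET.fromstring(markup)
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag.lower() in BLACKLISTED_ELEMENTS:
                parent.remove(child)
        for name in list(parent.attrib):
            if name.lower() in BLACKLISTED_ATTRIBUTES:
                del parent.attrib[name]
    return ET.tostring(root, encoding="unicode")


print(sanitize('<div onclick="x()"><script>alert(1)</script><p>ok</p></div>'))
```

Everything not on a list passes through untouched, which is the trade-off the black-list approach accepts in order to allow arbitrary markup.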
Rhizome also provides the following security features that may require configuration. For more information on the settings mentioned below, see the config file documentation.
- Python pages will only be executed if the SHA1 digest of the page is listed in the authorizationDigests config variable.
- By default, URL resolution is limited to only access local file system paths specified by the PATH config variable. These limitations can be relaxed in various ways through the following config variables: SECURE_FILE_ACCESS, DEFAULT_URI_SCHEMES, uriResolveWhitelist, uriResolveBlacklist.
- The authorization and validation of XPath extension function calls can be controlled by the authorizedExtFunctions config variable.
- Rhizome utilizes other security features provided by Raccoon which can be customized by overriding various configuration settings (for example, the authorization of Content Processors and request metadata). See rhizome-config.py and the Raccoon manual for more information.
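The digest check for Python pages can be sketched as follows. The manual only says that the page's SHA1 digest must be listed in authorizationDigests; representing that setting as a set of hex strings is an assumption made here for illustration, so consult the config file documentation for the actual structure and encoding.

```python
import hashlib


def page_digest(source: bytes) -> str:
    # Hex encoding is assumed; the real setting may use another encoding.
    return hashlib.sha1(source).hexdigest()


def may_execute(source: bytes, authorized_digests) -> bool:
    """Allow execution only if the page's digest has been listed."""
    return page_digest(source) in authorized_digests


trusted_page = b"print('hello')\n"
authorized = {page_digest(trusted_page)}   # stand-in for authorizationDigests

print(may_execute(trusted_page, authorized))
print(may_execute(trusted_page + b"import os\n", authorized))
```

Because the digest covers the whole page, any edit to an authorized page changes its digest and the page stops being executable until the new digest is listed.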
Rhizome Customization
Simple Customization of Presentation
Here are some files useful for the basic customization of Rhizome:
- rhizome/basestyles.css
- CSS formatting rules shared by all themes are found here.
- rhizome/sidebar.txt
- This ZML document is displayed as the navbar in the left-hand column by the site-template.
- rhizome/site-template.xsl
- The XSLT stylesheet that controls Rhizome's layout by invoking the current theme.
There are three ways you could modify these files:
- Just edit them in Rhizome (e.g. click on the links above)
- Place a modified copy of the file in a directory that appears in the Raccoon PATH before the 'rhizome' directory. By default, Rhizome places a directory named 'content' (relative to the current working directory) before the rhizome directory. (See RaccoonConfig for more on the PATH setting).
- Modify the files directly using any tool.
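The second option works because files are looked up along the PATH in order, so the first match wins. A minimal sketch of that lookup, assuming a plain first-match search (the resolve function is hypothetical and stands in for whatever Raccoon actually does):

```python
import os
import tempfile


def resolve(name, search_path):
    """Return the first existing file called `name` on `search_path`, else None."""
    for directory in search_path:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


def demo():
    with tempfile.TemporaryDirectory() as root:
        content = os.path.join(root, "content")
        rhizome = os.path.join(root, "rhizome")
        for directory in (content, rhizome):
            os.mkdir(directory)
        # Both directories hold a sidebar.txt; only rhizome holds basestyles.css.
        for directory, name in ((content, "sidebar.txt"),
                                (rhizome, "sidebar.txt"),
                                (rhizome, "basestyles.css")):
            with open(os.path.join(directory, name), "w") as f:
                f.write("placeholder")
        path = [content, rhizome]          # 'content' comes first
        return [os.path.basename(os.path.dirname(resolve(n, path)))
                for n in ("sidebar.txt", "basestyles.css")]


print(demo())
```

The modified sidebar.txt in 'content' shadows the stock copy, while basestyles.css still comes from the 'rhizome' directory.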
Themes
Rhizome supports themes, a simple mechanism for configuring the look and feel of a site. The theme renders the final layout of a page, for example, placing the content of a page within a column of a two or three column layout. A theme must consist of at least the two files that are referred to by site-template.xsl: theme.xsl, which is responsible for creating the body of the page, and theme.css, which is linked to the HTML generated by the site-template. See the comments in rhizome/site-template.xsl for information on creating the theme XSLT stylesheet.
You can create a new theme by creating a new wiki:SiteTheme resource. This resource must refer to the theme's xsl and css pages (see the create SiteTheme template). You can set which theme Rhizome uses by setting the wiki:uses-theme property associated with the site-template. If a theme refers to external files, it may be convenient to place those files in a separate theme directory and place that directory on the Raccoon PATH. If you haven't explicitly set the PATH, you can use the THEME_DIR config setting to include it in the PATH.
Rhizome ships with a few themes; see the rhizome/themes directory for the ones currently included.
Skins
In addition to themes, the site-template can be associated with a CSS file by setting the wiki:uses-skin property. This "skin" CSS file is designed to be applied to any theme (as long as the theme uses the appropriate CSS classes) and is used to create the site's color scheme and other simple visual effects (e.g. font size). A list of skin files can be found here.
Understanding the Schema
Rhizome uses the following RDF schemas:
- http://rx4rdf.sf.net/ns/archive# ("a" prefix)
- This OWL schema represents content and URLs that refer to content. It provides a way to unambiguously describe the content a URL refers to, or to unambiguously describe the ambiguity. Related to this goal, it also provides a way to describe deterministic transformations to content.
- http://rx4rdf.sf.net/ns/auth# ("auth" prefix)
- This schema represents the access control model described above. It is not yet formally documented as an ontology but see the default Rhizome authorization scheme in rhizome-config.py for more information.
- http://rx4rdf.sf.net/ns/wiki# ("wiki" prefix)
- This vocabulary contains a grab-bag of properties and classes created ad-hoc for Rhizome. There are no plans to formalize this into an ontology.
- http://xmlns.com/foaf/0.1/ ("foaf" prefix)
- FOAF is used to represent Rhizome users.
Modifying the Model Template
If the model store specified in the config file doesn't exist, Rhizome will create a new model by copying the contents of the STORAGE_TEMPLATE. You can modify the contents of this template by using the special functions that can be used in the config file: __addItem__, __addRxML__, and __addTriples__ -- see the config file documentation for more information. For example, this snippet from the site-config.py config file first adds a new page named "FOAFPaper" and then replaces the statements grouped together under the name "@sitevars" :
__addItem__('FOAFPaper', loc='path:.rzvs/FOAFPaper.zml', format='zml',
            title="Rhizome Position Paper", accessTokens=['base:save-only-token'],
            disposition='entry', doctype='document')

__addRxML__(replace = '@sitevars', contents = '''
 base:site-template:
  wiki:header-image: `Rx4RDFlogo.gif
  wiki:header-text: `
  wiki:footer-text: `© 2005 Liminal Systems All Rights Reserved
  wiki:uses-theme: base:default-theme
  wiki:uses-skin: base:light-blue.css
''')
Validation and update triggers
When customizing Rhizome's behavior or adding new resource types, it may be useful to modify two scripts: update-triggers, an RxUpdate script, and validate-schema, a Schematron stylesheet. Rhizome invokes these scripts (in this order) each time a transaction is about to be committed. Both pages expose two XPath variables, $_added and $_removed, which contain the Predicate elements that reference all the RDF statements that are to be added or removed when the transaction is committed. Update-triggers invokes rules that can do such tasks as removing resources that are no longer in use, while validate-schema is used to validate that the consistency of the store will be maintained, and contains rules for testing referential integrity and uniqueness guarantees.
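The order of operations at commit time can be sketched in plain Python. This is a conceptual model only: the real scripts are RxUpdate and Schematron, statements here are simple (subject, predicate, object) tuples, and the "wiki:name" uniqueness rule is an invented example of the kind of check validate-schema performs.

```python
def commit(store, added, removed, triggers=(), validators=()):
    """Apply triggers, then validate, then return the new set of statements."""
    added, removed = set(added), set(removed)
    for trigger in triggers:               # update-triggers runs first
        extra_added, extra_removed = trigger(store, added, removed)
        added |= extra_added
        removed |= extra_removed
    pending = (set(store) - removed) | added
    errors = [msg for check in validators for msg in check(pending)]
    if errors:                             # validation failure aborts the commit
        raise ValueError("; ".join(errors))
    return pending


def unique_name(statements):
    # Hypothetical uniqueness rule: no two resources may share a name.
    owners = {}
    for subject, predicate, obj in statements:
        if predicate == "wiki:name":
            owners.setdefault(obj, set()).add(subject)
    return ["name %r is used by %d resources" % (name, len(subjects))
            for name, subjects in owners.items() if len(subjects) > 1]


store = {("base:a", "wiki:name", "Home")}
new_store = commit(store, {("base:b", "wiki:name", "About")}, set(),
                   validators=[unique_name])
print(len(new_store))
```

Running validation after the triggers means the checks see any statements the triggers themselves added or removed, which is one reason the invocation order matters.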
Slightly Involved Customization Example
In this example we'll add a template for displaying a print-friendly version of a page. (Note: in going through the trouble of writing and testing this example, I've decided to leave it in the core product -- see print-template.xsl.)
- Add a new disposition
  - On the administration page click on the new disposition link
  - Edit the new disposition template: for consistency, we'll choose the resource URI of the new disposition to be "wiki:item-disposition-print" and label it 'Printable'
- Add a new template
  - Click new to create a new page (let's call it 'print-template'), set the Item Format to "RxSLT" and the Item Type to "Template". (We don't want to set the format to "XSLT" because we don't want to transform the contents that we are applying the template to, as they can be anything, not even XML; instead we just want to include the content, which is exposed as an XSLT param named $_contents, in the template's output.)
  - We'll base the print template stylesheet on "site-template.xsl" by removing the structural tables from the template and just displaying the content:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:a="http://rx4rdf.sf.net/ns/archive#"
    xmlns:wiki="http://rx4rdf.sf.net/ns/wiki#"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
    xmlns:wf='http://rx4rdf.sf.net/ns/raccoon/xpath-ext#'
    xmlns:f = 'http://xmlns.4suite.org/ext'
    xmlns:response-header='http://rx4rdf.sf.net/ns/raccoon/http-response-header#'
    exclude-result-prefixes = "f wf a wiki rdf response-header" >
  <xsl:param name="_contents" />
  <xsl:param name="_previousContext" />
  <xsl:param name="response-header:content-type"/>
  <xsl:output method='html' indent='no' />
  <xsl:template match="/">
    <!-- this page is always html, not the content's mimetype -->
    <xsl:variable name='prev-content-type' select="$response-header:content-type" />
    <html>
      <head>
        <!-- we could have a different stylesheet for printing -->
        <link href="site:///basestyles.css" rel="stylesheet" type="text/css" />
      </head>
      <body>
        <xsl:choose>
          <xsl:when test="contains($prev-content-type,'xml')
                          or starts-with($prev-content-type,'text/html')">
            <xsl:value-of disable-output-escaping='yes' select="$_contents" />
          </xsl:when>
          <xsl:otherwise>
            <pre>
              <xsl:value-of disable-output-escaping='no' select="$_contents" />
            </pre>
          </xsl:otherwise>
        </xsl:choose>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
- Set the template to handle the new print disposition
  - After saving the item, click on the "Metadata" link to get to the View Metadata page and then the "Edit Metadata" link to edit the metadata.
  - Add a wiki:handles-disposition property to the main (the NamedContent) resource:
base:print-template:
  wiki:handles-disposition: wiki:item-disposition-print
- Finally, let's add a "Print" link to the site template
  - Edit "site-template"
  - Find the action links in the footer of the template and add our own by copying the "view" link and adding a "_disposition" parameter to it:
...
 <a href="site:///{$path}?_disposition=http%3A//rx4rdf.sf.net/ns/wiki%23item-disposition-print{$aboutparam}">Print</a>
  - The _disposition parameter overrides the resource's default disposition that was set when choosing the "Item Type" on the edit page.
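The value of the _disposition parameter in the Print link is just the full URI of wiki:item-disposition-print, percent-encoded so it can travel in a query string. The sketch below shows how such a link could be assembled with Python's standard library; the print_link helper is hypothetical and is not part of Rhizome.

```python
from urllib.parse import quote

WIKI_NS = "http://rx4rdf.sf.net/ns/wiki#"
PRINT_DISPOSITION = WIKI_NS + "item-disposition-print"


def print_link(path, about_param=""):
    """Build a site: URL that requests the print disposition for `path`."""
    # quote() leaves "/" alone but escapes ":" and "#", which would
    # otherwise be misread as part of the URL's own structure.
    return "site:///%s?_disposition=%s%s" % (
        path, quote(PRINT_DISPOSITION, safe="/"), about_param)


print(print_link("FOAFPaper"))
```

Escaping the "#" matters most: left as-is, everything after it would be treated as a fragment identifier and never reach the server.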
Importing content and schema migration
The import command adds content to the site by adding all the files that match the given path (with wildcards).
The command has these options:
--import path [--recurse] [--filenameonly] [--dest path]
              [--xupdate url] [--folder path] [--noindex]
              [--noshred] [--format format] [--doctype doctype]
              [--disposition disposition] [--token accesstoken]
              [--label label] [--keyword keyword]
where
path                          Location of files to import (* and ? wildcards ok)
--recurse                     If present, recursively import subdirectories
--filenameonly                (with recurse) Don't include the relative path in the imported item name
--dest dir                    If dest is present files will be copied to this directory, otherwise the site will directly reference the imported files
--xupdate URL                 URL to an RxUpdate file which is applied to each metarx file if present
--noindex                     Don't add imported content to the full-text index
--noshred                     Don't shred content on import
The following are applied to a file if no metarx file is present:
--folder path                 Prepends path to name and creates folder resource if necessary
--keepext                     Don't drop the file extension when naming the page
--format URI or QName         Default wiki:item-format value
--disposition URI or QName    Default wiki:item-disposition value
--doctype URI or QName        Default wiki:doctype value
--label URI or QName          Default wiki:has-label value
--token URI or QName          Access Token to guard (auth:guarded-by) the content
--keyword URI or QName        Default wiki:about value
If, for a given file, there exists a matching file with ".metarx" appended, then import will attempt to add the metadata in the metarx file. First it loads the metadata file and then updates it using the RxUpdate file specified by the --xupdate option. If that option isn't present, it will run the default, "path:import.xml". This RxUpdate script discards previous revisions and points the content to the new import location. You'll find a couple of other import scripts in the rhizome directory: one changes the base URI for the RDF resources, and another adds authorization tokens to the items. By writing your own import scripts and exporting and reimporting your site, you can migrate your site to your latest schema.
If no .metarx file exists, import will use the defaults specified by the --format, --disposition, --doctype, etc. options, if present. Their value can be either a URI or a QName. If a default is not specified for required metadata, import will attempt to guess the metadata based on the contents and file extension of the imported file.
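The per-file decision between a .metarx file and the command-line defaults can be sketched as below. This only models the pairing rule described above; plan_import is a hypothetical helper, and skipping the .metarx files themselves as import candidates is an assumption of the sketch.

```python
import glob
import os
import tempfile


def plan_import(pattern, defaults):
    """For each matching file, decide where its metadata will come from."""
    plan = []
    for path in sorted(glob.glob(pattern)):
        if path.endswith(".metarx") or not os.path.isfile(path):
            continue                       # assumed: metadata files are not content
        metarx = path + ".metarx"
        if os.path.isfile(metarx):
            plan.append((path, "metarx", metarx))
        else:
            plan.append((path, "defaults", dict(defaults)))
    return plan


def demo():
    with tempfile.TemporaryDirectory() as root:
        for name in ("a.zml", "a.zml.metarx", "b.txt"):
            with open(os.path.join(root, name), "w") as f:
                f.write("placeholder")
        plan = plan_import(os.path.join(root, "*"), {"format": "zml"})
        return [(os.path.basename(path), source) for path, source, _ in plan]


print(demo())
```

Here a.zml picks up its metadata from a.zml.metarx, while b.txt falls back to the supplied defaults.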
Exporting and static websites
You can use Rhizome's export command to export the content of each item in the site as a separate file. You can use it for two different types of tasks: to export raw content and metadata (for data exchange or schema migration, etc.), or to generate static versions of the website.
The command has these options:
--export dir [--static [--noalias] | --astriples filepath]
             [--xpath exp | --name name] [--label label]
where
dir                     The directory to export to
--xpath exp             RxPath expression that evaluates to a nodeset of items to export
--name name             The name of the item to export (for exporting one item) (no effect if --xpath is specified)
--label label           Choose the last revision that matches this label
--static                Export as static HTML
--noalias               Don't create static copies of page aliases (use with --static only)
--base                  Base URL for links (use with --static only)
--astriples filepath    Export the selected resources as an NTriples file. External content is inserted into the file as string literals.
If the --static option is present, export will try to render each item as HTML. Dynamic pages (e.g. those that require query parameters) are skipped (you may see exceptions being reported to the console). A $_static variable is introduced so stylesheets can render appropriately (for example, see site-template.xsl). Limitations:
- External content (images, etc.) referenced by links are not copied
- It would be nice if pages without extensions were renamed with an extension. This is not yet supported.
If neither the --static nor the --astriples option is present, each item will be exported to two files: one containing the raw content of the item and the other an RxML document containing the metadata associated with the item. The first file will match the item's name with a file type extension added, if necessary, and the second will be the same but with '.metarx' appended.
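The naming of the two exported files can be illustrated with a small helper. It is a sketch only: export_paths is hypothetical, and how Rhizome chooses the extension to add is not stated here, so the extension is simply passed in by the caller.

```python
import os


def export_paths(export_dir, item_name, default_ext):
    """Return (content file, metadata file) for an exported item."""
    _, ext = os.path.splitext(item_name)
    # Add a file type extension only if the name doesn't already have one.
    content_name = item_name if ext else item_name + default_ext
    content_path = os.path.join(export_dir, content_name)
    return content_path, content_path + ".metarx"


print(export_paths("out", "FOAFPaper", ".zml"))
print(export_paths("out", "basestyles.css", ".txt"))
```

The resulting pairs follow the same "file plus file.metarx" convention that the import command looks for, which is what makes an export and reimport round trip possible.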

