- Using Rhizome
- Access Control
- Rhizome Customization
- Importing content and schema migration
- Exporting and static websites
Rhizome is designed to enable non-technical users to create these representations in an easy, ad-hoc manner. To this end, it includes a text formatting language which is similar to a Wiki's but lets you author arbitrary XML content and RDF metadata. For developers, this allows both content and structure to be easily repurposed and complex web applications to be rapidly developed.
The long-term vision is that each Rhizome site will intertwine together, forming an emergent fuzzy taxonomy over a peer-to-peer network.
The nearer-term goals of Rhizome are:
- To allow (relatively) non-technical folk to create "Semantic web"-enabled web sites
- To provide a platform for the rapid development of web applications
- To provide a test-bed for experimenting with new forms of collaborative knowledge production and communication
- A showcase and test-bed for its underlying technologies: Rx4RDF, ZML, and Raccoon
- All the functionality of a Wiki: public and automatic creation of pages and links using simple text formatting rules
- But the same formatting rules can be used to author arbitrary HTML, XML, and RDF
- And you can create and edit not just content but also the metadata, site structure and appearance, even application behavior
- Including dynamic pages: supports XSLT, RxSLT, Python, RxUpdate
- Thus enabling dynamic, rule, and context based presentation of content
Advanced Content Management functionality
- pages can consist of any content: xml, html, binary, etc.
- content, metadata, and site structure are stored as RDF, which makes them repurposable, human editable, application agnostic, etc.
- flexible authorization and security model
- staging/release workflow
- native versioning of content and metadata, conflict detection
- supports local file system or browser-based development
- can generate static websites
- import/export of content and metadata
- flexible backend: supports multiple RDF engines (Redland, 4Suite) with multiple datastores: file based, SQL databases, embeddable databases (Sleepycat, Metakit)
- Somewhat informally, a resource that is content, specifically that of type NamedContent.
- The collection of resources and their properties that make up an instance of Rhizome. Generally refers to its logical representation, while store refers to its physical location.
- In RDF, the value of a property
- In RDF, the name of a property
- A property consists of a name (a URI) and a value. Resources have properties.
- A resource is an abstract name for something (anything). Rhizome treats everything as a resource with a set of properties and every resource is named with a URI.
- The application server that Rhizome runs on.
- Resource Description Framework -- the "native format" of Rhizome
- An alternative syntax for RDF which is designed to be usable even by those ignorant of RDF.
- The physical location of the model. By default this will be a text file, but you can configure Rhizome to use a variety of databases.
- In RDF, the resource that a property applies to.
- Uniform Resource Identifier, aka URL (except it doesn't have to work in your browser).
- A name of a resource that is unique to the site and so can be used to link to that resource.
- An alternative syntax for XML that is designed to be relatively painless to author; similar to Wiki text formatting rules.
Rhizome should work on any platform that supports Python and 4Suite. See Download for details on supported platforms.
If Lupy is installed (http://www.divmod.org/Home/Projects/Lupy), Rhizome will perform full-text indexing of content (requires Python 2.3).
- a package named "rx" in the Python site-packages directory
- shell scripts or .bat files for running Raccoon and ZML in the Python scripts directory
- a directory named "rx4rdf" containing documentation, Rhizome pages, and other ancillary files in the Python share directory.
- cd <your application's home dir>
- <python script dir>/run-raccoon -a <python share dir>/rx4rdf/rhizome/rhizome-config.py
This will launch Raccoon's built-in http server, which runs on port 8000 by default. You can change this by editing server.cfg (see the Raccoon Manual for more information).
That's it! You now have a new instance of Rhizome running. By default, the site will be stored in the current directory in a file called "wikistore.nt" and content will be stored in a subdirectory called "content". Most of Rhizome's behavior is defined in its pages, so you can start customizing your Rhizome instance by editing its pages and metadata from within Rhizome.
The first line sets the base URI that will be used for creating RDF resources. Note that this does not have to correspond to any resolvable URL. The second includes the Rhizome config file.
Next, you probably want to set the administrator password and the secure hash seed. Rhizome creates a default administrator super-user that has full access to the system, and its login name and password both default to 'admin'. To set your own password, set ADMIN_PASSWORD in your config file before including the Rhizome config. If you don't want to store the password in cleartext in your config file, you can set ADMIN_PASSWORD_HASH instead, which requires a SHA1 digest of the concatenation of the password and the secure hash seed. In Python, you can calculate it like this: "import sha; sha.sha( 'yourpassword' + 'your secure hash seed').hexdigest()"
As illustrated above, the secure hash seed is used to generate a digest that can be publicly displayed. To set your own private seed, set the SECURE_HASH_SEED setting in your config file before including the Rhizome config. If you don't set your own private seed, or if it is compromised, it will be much easier to mount a dictionary attack on the password hashes. Note that if you change its value, all previously generated password hashes stored in your model will no longer work.
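The inline snippet above relies on the `sha` module, which is only available in Python 2. As a minimal sketch, assuming you use a Python 3 interpreter just to compute the value, the same kind of digest can be produced with `hashlib`. The helper name `admin_password_hash` and the UTF-8 encoding of the inputs are assumptions made for illustration; they are not part of Rhizome.

```python
import hashlib


def admin_password_hash(password, seed):
    """Return the SHA1 hex digest of password + seed (hypothetical helper)."""
    # Assumption: the concatenated string is encoded as UTF-8 before hashing.
    return hashlib.sha1((password + seed).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Paste the printed value into ADMIN_PASSWORD_HASH in your config file.
    print(admin_password_hash("yourpassword", "your secure hash seed"))
```

For plain ASCII passwords and seeds this should match the output of the Python 2 one-liner, since both hash the same bytes.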
Editing Content Items
Below is a guide to the form controls found on the edit and new pages:
- When creating content for the first time the Name edit box will be shown. Enter the name of the content here -- it should be a name unique to the site. If the name contains slashes (e.g. 'foo/bar') the content will be placed in a folder based on the name, creating the folder structure if necessary (e.g. the content will be put in the folder named 'foo' and given the name 'bar'). If the "Anonymous" option is selected you do not need to give the content a name (internally it will be given a name based on the current time and date). If "Anonymous" is selected and there is a name entered in the "Name" edit box, then the name will be treated as the folder path in which to place the anonymous content (creating the folder structure if necessary).
- Enter the title for the content (optional). Unlike the name it can be changed anytime and contain any type of characters.
- You create the item's content either by entering it in the editbox or by uploading a file.
- Source Format
- This dropdown specifies the format of the content you are entering and drives how Rhizome will process the content. The following list describes how Rhizome will handle the different choices:
- ZML (the default choice) is a text format similar to the ones found in Wikis except that you can use it to author arbitrary HTML or XML.
- Binary content is content such as an image or multimedia file. It is never transformed or manipulated by Rhizome. Generally you'll want to set the Item Type (see below) to "Page".
- HTML and XML text will have any link that starts with 'site:' replaced with a live URL depending on the request context. Using 'site:' links allows links on a page to work even if the page moves to another directory or is transformed in other contexts (such as when statically exported). The 'site:' URL scheme is similar to the 'file' scheme: URLs that start with 'site:///' will be relative to the root of where the Rhizome site is running from, while URLs that start only with 'site:' will be relative to the current location of the page.
- The content is treated as Python code that is executed when the page is requested. Anything the code writes to stdout is captured and sent as the response. By default, the administrator must explicitly authorize each page of Python code or an Unauthorized error will occur when attempting to view the page.
- RxSLT is an XSLT stylesheet that can create HTML or XML output using the site's model as the stylesheet's source.
- RxUpdate is an (enhanced) XUpdate document that is used to update the site's model. RxUpdate pages are typically used as form handlers (for example, 'save' page) and generally you'll want to set its Item Type to 'Handler'.
- Plain text. When included in HTML (for example, by the "Entry" template), it will be escaped.
- An XSLT stylesheet. It uses the current content of the request as the document source, so you'll need to set this page up as a template by adding a "wiki:handles-doctype" property. You'll also probably want to set the Item Type to "Template".
- NTriples is a plain text (as opposed to XML) based format for RDF.
- Turtle (Terse RDF Triple Language) (only if the Redland package is installed) is another plain text format for RDF.
- Minor Edit
- When you check this option you mark this revision of the content as being a minor edit. If you also edited the last revision of this item, then the current revision will replace the previous one. (This prevents cluttering the site up with revisions made just to fix typos, etc.)
- A comma or space-separated list of keywords. Keywords provide a simple way to associate arbitrary metadata with the content item. Each keyword will be associated with the content item via the wiki:about property. You can browse pages in the system grouped by keywords by choosing Browse in the page footer.
- Item Type
- The Item Type (internally called the Item Disposition) specifies how this content item should be used. It is used to select a template stylesheet that will be applied to the content. The following are included by default:
- This means that content is a stand-alone page and no template should be applied to it. For example, you probably want to set this for images.
- (the default). This frames the content in the default site-template.
- The content is used to update the site or handle a POST from a form page. Typically used for RxUpdate or Python pages that might not create any response output and for pages that handle write actions for other types of resources. The default handler template just prints the message "Completed <action name> of <resource name>!"
- RxML Template
- Indicates the content is text containing RxML that when invoked will bring up an edit metadata page containing the RxML. When saved, Rhizome will treat all the properties as new and replace any anonymous (aka blank) resources (those whose name starts with 'bnode:') with new anonymous resources. Using RxML templates provides a quick and dirty way of adding new types of resources to the system without having to build a form UI for creating and editing them.
- The content will be used as a template (typically for XSLT content). Currently, this just has the effect of stopping any more templates from being applied to the content.
Advanced Options. Pressing the "More" button will reveal the following options:
- This dropdown lets you choose how the content can be accessed. The choices depend on the rights the user has. The Guest user (the default if you are not logged in) only has one choice: "Public", which means anyone can view, edit, or delete the content. If you sign in as a user, you'll see choices that allow you to make content only readable and writable by you (and administrators), or publicly readable but only privately modifiable. You can also set the content to be readable and/or writable only by members of the user groups you belong to.
- This dropdown shows all the labels in the system that you can choose to apply to this revision of the content. The default configuration of Rhizome has two labels, "Draft" and "Released". To have no label associated, choose the blank option. When viewing content, the last revision with the "Released" label will be shown. If there is no revision labeled "Released", the last revision not labeled "Draft" is chosen. This behavior can be overridden by adding a label or revision parameter to the URL. By default, only administrators can set the "Released" label on a revision. Setting this enables a review process since all future revisions will need to be marked "Released" by an administrator before they can be viewed. Non-administrators can prevent content they are working on from being displayed by marking it "Draft".
- This checkbox (default: on) indicates whether or not Rhizome should try to extract metadata from the content when saving it.
- Change Comment
- An optional comment to be associated with your revision.
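The 'site:' link rule described under Source Format can be sketched as follows. This is only an illustration of the two cases named there ('site:///' relative to the site root, plain 'site:' relative to the current page), not Rhizome's own implementation. The function name `resolve_site_link` and its parameters are hypothetical, and the sketch assumes `site_root` ends with a slash.

```python
from urllib.parse import urljoin


def resolve_site_link(link, site_root, page_url):
    """Turn a 'site:' link into a live URL (illustrative sketch only)."""
    if not link.startswith("site:"):
        return link  # ordinary links are left untouched
    rest = link[len("site:"):]
    if rest.startswith("///"):
        # 'site:///path' is relative to the root of the running site.
        return urljoin(site_root, rest[3:])
    # 'site:path' is relative to the current location of the page.
    return urljoin(page_url, rest)


if __name__ == "__main__":
    root = "http://localhost:8000/"
    page = "http://localhost:8000/docs/intro"
    print(resolve_site_link("site:///images/logo.gif", root, page))
    print(resolve_site_link("site:other-page", root, page))
```

Because the live URL is computed from the request context, the same stored link can resolve differently when the page is viewed in place, moved, or exported statically.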
Assigning metadata and properties to the content
After you save your changes, you can add or modify arbitrary properties and metadata for the content item and its latest revision. To do so, choose the metadata link after saving, which will display the metadata view of the content. Then choose the "Edit Metadata" link at the top of the metadata view page.
The edit metadata page presents the content's resources as editable RxML. You will also see links that let you edit the metadata in two other RDF formats, RDF/XML and Ntriples. For an example, click here to edit the metadata of this page.
In addition to adding arbitrary metadata there are a few properties that affect Rhizome's behavior that are not managed by the edit page. Some of these include:
- This property can be assigned to any resource that uses the wiki:name property and lets you give a resource an alternative name. Its value is identical to the value of a wiki:name: it may be a hierarchical path that corresponds to its folder structure and the name should be unique across the site.
- When creating a template, use this property to indicate that content items that have the specified item disposition should have this template applied to them. Its value is an Item Disposition resource.
- When creating a template, use this property to indicate that content items that have the specified document type should have this template applied to it. Its value is a DocType resource.
- This property, along with wiki:action-for-type, indicates that this resource should be used to handle the specified action.
- Used alongside wiki:handles-action to specify the types of resources this action handler should apply to. Its value should be a resource type or rdfs:Resource (action handlers with the latter have lower priority than ones that specify a specific type).
Creating and Editing Other Resources
You can browse and edit all the types of resources in the system from the administration page (click on the "Admin" link). Unless a resource type has an edit handler associated with it the default edit handler will be used, which lets you directly edit the resource as RxML (and other RDF formats). You can designate a resource as an edit handler for a given type of resource by assigning it the wiki:handles-action and wiki:action-for-type properties, as described above.
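As a rough sketch of how the two properties work together, the lookup below prefers a handler registered for one of the resource's own types and falls back to one registered for rdfs:Resource. The list-of-dicts representation and the function name `find_action_handler` are assumptions made for illustration; Rhizome itself performs this lookup as a query against the RDF model.

```python
def find_action_handler(handlers, action, resource_types):
    """Pick a handler for `action` (handlers is a hypothetical list of dicts)."""
    generic = None
    for handler in handlers:
        if handler["handles-action"] != action:
            continue
        if handler["action-for-type"] in resource_types:
            return handler  # a type-specific handler wins immediately
        if handler["action-for-type"] == "rdfs:Resource" and generic is None:
            generic = handler  # remember the first generic fallback
    return generic


if __name__ == "__main__":
    handlers = [
        {"name": "generic-edit", "handles-action": "edit",
         "action-for-type": "rdfs:Resource"},
        {"name": "user-edit", "handles-action": "edit",
         "action-for-type": "foaf:Person"},
    ]
    print(find_action_handler(handlers, "edit", ["foaf:Person"])["name"])
```

If no handler matches at all, the sketch returns None, which corresponds to the case where the default edit handler would be used.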
There are a few useful properties that can be assigned to any resource:
- Any resource that has this property can be addressed with a URL, just like the item content name described above. Names should be unique across the site, with the exception that you can also have a content item with the same name (which will be invoked instead).
- See above.
- Many of Rhizome's pages check for this property when trying to display a human-readable name for a resource.
When you request a URL from Rhizome, it translates the URL to an internal resource, applies an action to the resource, and returns the result of that action to your browser. The default action is 'view' but you can specify other actions by adding an 'action' parameter to a URL (or, when creating an HTML form, by adding an 'action' form variable) -- for example, look at the links such as 'Edit' or 'Revisions' at the bottom of the default template.
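To make the 'action' parameter concrete, here is a minimal sketch that reads the action from a request URL and falls back to 'view' when none is given. The function name `requested_action` and the example page name are hypothetical; this only illustrates the convention described above.

```python
from urllib.parse import parse_qs, urlsplit


def requested_action(url):
    """Return the value of the 'action' query parameter, defaulting to 'view'."""
    query = parse_qs(urlsplit(url).query)
    return query.get("action", ["view"])[0]


if __name__ == "__main__":
    print(requested_action("http://localhost:8000/index?action=edit"))
    print(requested_action("http://localhost:8000/index"))
```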
URLs are mapped to resources through a series of steps:
- Find the source resource. Rhizome applies the following queries, stopping at the first match:
- If the URL has an 'about' parameter, find the resource whose URI matches the parameter's value.
- Look for a resource that has a wiki:name property that matches the URL's path.
- Look for a resource that has a wiki:alias property that matches the URL's path.
- Look in the directories in Rhizome's PATH setting for a file that matches the URL's path. If one is found, none of the following steps apply and the contents of the file are returned immediately.
- Look for a resource that has a wiki:name property named '_not_found'. By default, this will be a page that prompts you to create a new page using the missing name.
- See if the user is authorized to perform the action associated with this URL on the selected resource. If not, set the resource to auth:Unauthorized.
- Look for a resource that can handle the action associated with this URL when applied to the selected resource.
- If the action is 'view' and resource is of type 'NamedContent', choose the same resource.
- Search for a resource with a wiki:handles-action property equal to the action and a wiki:action-for-type property equal to the type of the current resource or 'rdfs:Resource'.
- Look for a resource that has a wiki:name property named 'default-resource-viewer'. The default view displays the resource as RDF (in RxML or other RDF formats).
- At this point the current resource should always be of type NamedContent. Now find a revision associated with that resource (through the wiki:revisions property) that contains the resource's content:
- If the URL had a revision parameter, choose the revision that matches that number
- If the URL had a label parameter, choose the last revision that has a matching label
- Choose the last revision that has a label with the wiki:is-released property
- Choose the last revision
- Now we retrieve the content associated with the revision (which can be either stored in external files or within the model, depending on how you configured Rhizome) and process it based on the content transform metadata associated with it (this is set by the 'Source Format' dropdown on the edit page).
- At this point we may be ready to return the contents to the browser or we might want to apply a template to the results. To find out we look for a template resource that first matches the following queries:
- Search for a resource with a wiki:handles-doctype property equal to the URL's '_doctype' parameter, if present, or the revision's wiki:doctype property. This allows for stylesheets that can convert content from one doctype to another.
- Search for a resource with a wiki:handles-disposition property equal to the URL's '_disposition' parameter, if present, or the revision's wiki:item-disposition property. Item dispositions correspond to the "Item Type" drop-down on the Edit page and allow template stylesheets to control how content is rendered.
- If we do find a template resource, we stuff the content of the current request into an XPath variable called '$_contents' and then, starting with the fourth step, invoke the above steps again using the template resource. Note that this means that another template can be invoked on the results of this template resource, and so on.
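The revision-selection step in the list above can be sketched as a simple ordered fallback. This is a simplified illustration, not Rhizome's code: the revision dictionaries, the `released_labels` set (standing in for labels that carry the wiki:is-released property), and the choice to return None when an explicitly requested revision or label is not found are all assumptions.

```python
def choose_revision(revisions, released_labels, revision=None, label=None):
    """Pick a revision from a list ordered oldest to newest (hypothetical shape)."""
    if revision is not None:
        # An explicit revision parameter selects that revision number.
        for rev in revisions:
            if rev["number"] == revision:
                return rev
        return None  # assumption: no fallback for an unknown revision
    if label is not None:
        # An explicit label parameter selects the last revision with that label.
        for rev in reversed(revisions):
            if label in rev["labels"]:
                return rev
        return None  # assumption: no fallback for an unmatched label
    # Otherwise prefer the last revision carrying a released label.
    for rev in reversed(revisions):
        if rev["labels"] & released_labels:
            return rev
    # Finally fall back to the last revision of all.
    return revisions[-1] if revisions else None


if __name__ == "__main__":
    revs = [
        {"number": 1, "labels": {"Released"}},
        {"number": 2, "labels": set()},
        {"number": 3, "labels": {"Draft"}},
    ]
    print(choose_revision(revs, {"Released"})["number"])
```

The sketch follows the four steps as listed and does not model any additional handling of the "Draft" label.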
Access control (or authorization) provides control over which users can access or modify particular resources and functionality. This is particularly important with Rhizome because most of its structure and behavior is editable by users; indeed, the user can write pages in Python and execute arbitrary code. However, in many applications you don't want to have to think about access control, so Rhizome provides an access control model that protects the core structure by default but doesn't require extra work if you're not interested in extra control.
The basic notion is an Access Token that can be attached to (guard) any resource. Access tokens have one or more permissions associated with them, each of which grant the right to perform a specific action. A user that has rights to a particular Access Token can perform the granted actions on all the resources that the Access Token guards. In addition, Access Tokens have a priority property, enabling more important Access Tokens to override less important ones.
One benefit of this approach is how well it scales with the complexity of your authorization scheme. If you don't care about authorization at all you can just ignore it -- it only has an effect when you attach access tokens to guard resources. But you can also create complicated authorization schemes by specifying your own authorization RxPath expressions through the authorization expression variables found in the config file. These let you create complex path traversals (e.g. inheritance, hierarchical groups, etc.) by redefining how a resource finds the Access Token that guards it or which Access Tokens a user has rights to.
With Rhizome's default authorization expression, users have rights to a token either directly via the auth:has-rights-to property or as a member of a Role (via auth:has-role) which in turn has rights to tokens via auth:has-rights-to. Similarly, resources are guarded by access tokens either directly via the auth:guarded-by property or indirectly by the access tokens that guard their class resources.
The basic subject (or principal) of access control in Rhizome is represented by the foaf:OnlineAccount class; accounts can be members of a Role, which gives the account rights to all the AccessTokens that the role has rights to. Accounts are almost always associated with a user (via the foaf:holdsAccount property), and users are represented by the foaf:Person class. However, Rhizome has two built-in accounts that are not associated with any particular user. The 'admin' account is assigned the 'super-user' role, which is a special role that has full access to the system regardless of access tokens. The 'guest' account is a member of the 'guest' role; anyone not logged in is treated as a 'guest'. By default, the 'guest' account can only perform actions that don't require any access token.
The Create/Edit User page allows you to choose which roles the user belongs to, but the UI for this is only displayed when the current user (the one viewing the page) has the right to assign roles (this is determined by the auth:can-assign-role property), and only those roles that user can assign will be displayed.
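The two lookups described in this paragraph can be sketched with plain dictionaries. This is a minimal illustration under stated assumptions: the dictionary shapes, the function names, and the example token and role names are hypothetical, and token priority is deliberately left out. In Rhizome these lookups are RxPath expressions evaluated against the model.

```python
def tokens_for_account(account, role_tokens):
    """Tokens an account holds directly or through its roles."""
    tokens = set(account.get("has-rights-to", []))
    for role in account.get("has-role", []):
        # Each role contributes the tokens it has rights to.
        tokens.update(role_tokens.get(role, []))
    return tokens


def tokens_guarding(resource, class_guards):
    """Tokens guarding a resource directly or through its classes."""
    tokens = set(resource.get("guarded-by", []))
    for cls in resource.get("types", []):
        # Tokens guarding a class also guard its instances.
        tokens.update(class_guards.get(cls, []))
    return tokens


if __name__ == "__main__":
    account = {"has-rights-to": ["private-token"], "has-role": ["editors"]}
    print(sorted(tokens_for_account(account, {"editors": ["edit-token"]})))
```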
When a new user is created, several new access tokens are created for the user, allowing the user to set different levels of privacy for the content she edits (to learn more, take a look at signup-handler.xml). When you create or edit a page you'll notice a dropdown box labeled "Sharing". This drop-down contains all the access tokens the user has rights to (if you're not logged in, you'll just see the default "Public Read/Write" option because the guest user doesn't have rights to any access tokens). When you save the page it will now require your chosen access token.
Notice that two of the "Sharing" options are labeled "Group" and "Group Write/Public Read" -- choosing one of these access tokens limits access to only those users who share membership in any of the User Groups you are a member of. A user group is a subclass of role which has the user's group access tokens added to the role when the user joins. This way any other user that shares membership with that user's user groups can access resources that are guarded by one of that user's group access tokens.
In addition, Rhizome provides a simple access control scheme for modifying request metadata via the assign-metadata and remove-metadata XPath extension functions: any variable whose name starts with two leading underscores ("__") is considered read-only and cannot be assigned or removed. Similarly, external requests (e.g. form variables) cannot contain variables that start with two underscores (this is to prevent external variables from overriding the protected variables).
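The double-underscore rule can be sketched as follows. The Python functions here are hypothetical stand-ins for the behavior described (they are not the assign-metadata XPath extension function itself), and dropping protected names from external input, rather than rejecting the request, is an assumption of this sketch.

```python
def is_protected(name):
    """Names with two leading underscores are treated as read-only."""
    return name.startswith("__")


def strip_protected(form_vars):
    """Remove protected names from externally supplied variables."""
    return {k: v for k, v in form_vars.items() if not is_protected(k)}


def assign_metadata(metadata, name, value):
    """Assign a request variable unless its name is protected."""
    if is_protected(name):
        raise PermissionError("cannot assign read-only variable: " + name)
    metadata[name] = value


if __name__ == "__main__":
    print(strip_protected({"title": "Home", "__account": "admin"}))
```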
create-nospam-token is granted to the default user role, so only the Guest user will be subjected to these spam checks. For more stringent spam control, remove this token from the default user role.
Rhizome also provides the following security features that may require configuration. For more information on the settings mentioned below, see the config file documentation.
- Python pages will only be executed if the SHA1 digest of the page is listed in the authorizationDigests config variable.
- By default, URL resolution is limited to only access local file system paths specified by the PATH config variable. These limitations can be relaxed in various ways through the following config variables: SECURE_FILE_ACCESS, DEFAULT_URI_SCHEMES, uriResolveWhitelist, uriResolveBlacklist.
- The authorization and validation of XPath extension functions calls can be controlled by the authorizedExtFunctions config variable.
- Rhizome utilizes other security features provided by Raccoon which can be customized by overriding various configuration settings (for example, the authorization of Content Processors and request metadata). See rhizome-config.py and the Raccoon manual for more information.
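The digest check for Python pages mentioned in the list above can be sketched like this. It only illustrates the idea of authorizing code by its SHA1 digest. The function names, the use of a hex digest over UTF-8 encoded source, and the plain set standing in for the authorizationDigests config variable are all assumptions; check the config file documentation for the actual format Rhizome expects.

```python
import hashlib


def page_digest(source):
    """SHA1 hex digest of a page's source text (encoding is an assumption)."""
    return hashlib.sha1(source.encode("utf-8")).hexdigest()


def may_execute(source, authorized_digests):
    """Allow execution only when the page's digest has been authorized."""
    return page_digest(source) in authorized_digests


if __name__ == "__main__":
    page = "print('hello')\n"
    authorized = {page_digest(page)}  # stand-in for authorizationDigests
    print(may_execute(page, authorized))
    print(may_execute(page + "import os\n", authorized))
```

Because the digest covers the whole page, any edit to an authorized page changes its digest and the page must be authorized again.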
There are three ways you could modify these files:
- Just edit them in Rhizome (e.g. click on the links above)
- Place a modified copy of the file in a directory that appears in the Raccoon PATH before the 'rhizome' directory. By default, Rhizome places a directory named 'content' (relative to the current working directory) before the rhizome directory. (See RaccoonConfig for more on the PATH setting).
- Modify the files directly using any tool.
You can create a new theme by creating a new wiki:SiteTheme resource. This resource must refer to the theme's xsl and css pages (see the create SiteTheme template). You can set which theme Rhizome uses by setting the wiki:uses-theme property associated with the site-template. If a theme refers to external files, it may be convenient to place those files in a separate theme directory and place that directory on the Raccoon PATH. If you haven't explicitly set the PATH, you can use the THEME_DIR config setting to include it in the PATH.
Rhizome ships with a few themes; see the rhizome/themes directory for the ones currently included.
    __addItem__('FOAFPaper', loc='path:.rzvs/FOAFPaper.zml', format='zml',
        title="Rhizome Position Paper", accessTokens=['base:save-only-token'],
        disposition='entry', doctype='document')

    __addRxML__(replace = '@sitevars', contents = '''
     base:site-template:
      wiki:header-image: `Rx4RDFlogo.gif
      wiki:header-text: `
      wiki:footer-text: `© 2005 Liminal Systems All Rights Reserved
      wiki:uses-theme: base:default-theme
      wiki:uses-skin: base:light-blue.css
    ''')
Importing content and schema migration
The import command adds content to the site by adding all the files that match the given path (with wildcards).
The command has these options:
--import path [--recurse] [--filenameonly] [--dest path] [--xupdate url] [--folder path] [--noindex] [--noshred] [--format format] [--doctype doctype] [--disposition disposition] [--token accesstoken] [--label label] [--keyword keyword]
where
path Location of files to import (* and ? wildcards ok)
--recurse if present, recursively import subdirectories
--filenameonly (with recurse) don't include the relative path in the imported item name
--dest dir If dest is present files will be copied to this directory, otherwise the site will directly reference the imported files.
--xupdate URL URL to an RxUpdate file which is applied to each metarx file if present.
--noindex Don't add imported content to the full-text index
--noshred Don't shred content on import.
The following are applied to a file if no metarx file is present:
--folder path Prepends path to name and creates folder resource if necessary
--keepext Don't drop the file extension when naming the page.
--format URI or QName default wiki:item-format value
--disposition URI or QName default wiki:item-disposition value
--doctype URI or QName default wiki:doctype value
--label URI or QName default wiki:has-label value
--token URI or QName Access Token to guard (auth:guarded-by) the content
--keyword URI or QName default wiki:about value
If, for a given file, there exists a matching file with ".metarx" appended, then import will attempt to add the metadata in the metarx file. First it loads the metadata file and then updates it using the RxUpdate file specified by the --xupdate option. If the option isn't present it will run the default, "path:import.xml". This RxUpdate script discards previous revisions and points the content to the new import location. You'll find a couple of other import scripts in the rhizome directory: one changes the base URI for the RDF resources, and another adds authorization tokens to the items. By writing your own import scripts and exporting and reimporting your site, you can migrate your site to your latest schema.
If no .metarx file exists, import will use the defaults specified by the --format, --disposition, --doctype, etc. options, if present. Their values can be either a URI or a QName. If a default is not specified for required metadata, import will attempt to guess the metadata based on the contents and file extension of the imported file.
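The pairing of content files with their ".metarx" companions can be sketched as below. This is an illustration of the rule just described, not the import command's actual code; the function name `find_import_files` is hypothetical, and skipping the ".metarx" files themselves as content is an assumption of the sketch.

```python
import glob
import os


def find_import_files(pattern):
    """Yield (content_path, metarx_path_or_None) for files matching pattern."""
    for path in sorted(glob.glob(pattern)):
        # Assumption: metadata files are companions, not content to import.
        if path.endswith(".metarx") or not os.path.isfile(path):
            continue
        metarx = path + ".metarx"
        # When no companion exists, defaults or guessing would apply instead.
        yield path, (metarx if os.path.isfile(metarx) else None)


if __name__ == "__main__":
    for content, metadata in find_import_files("*"):
        print(content, "->", metadata or "(no metarx file, use defaults)")
```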
Exporting and static websites
You can use Rhizome's export command to export the content of each item in the site as a separate file. You can use it for two different types of tasks: to export raw content and metadata (for data exchange or schema migration, etc.), or to generate static versions of the website.
The command has these options:
--export dir [--static [--noalias] | --astriples filepath] [--xpath exp | --name name] [--label label]
where
dir is the directory to export to
-xpath RxPath expression that evaluates to a nodeset of items to export
-name The name of item to export (for exporting one item) (no effect if -xpath is specified)
-label Choose the last revision that matches this label
-static Export as static HTML
-noalias Don't create static copies of page aliases (use with -static only)
-base Base URL for links (use with -static only)
-astriples filepath Export the selected resources as an NTriples file. External content is inserted into the file as string literals.
If the -static option is present, export will try to render each item as HTML. Dynamic pages (e.g. those that require query parameters) are skipped (you may see exceptions being reported to the console). A $_static variable is introduced so stylesheets can render appropriately (for example, see site-template.xsl). Limitations:
- External content (images, etc.) referenced by links are not copied
- It would be nice if pages without extensions were renamed with an extension. This is not yet supported.
If neither the -static nor the -astriples option is present, each item will be exported to two files: one containing the raw content of the item and the other an RxML document containing the metadata associated with the item. The first file will match the item's name with a file type extension added, if necessary, and the second will be the same but with '.metarx' appended.