Customizing collections

How can I customize the appearance of my Greenstone collection?

Since version 2.63, Greenstone has moved from a table-based layout to using CSS (Cascading Style Sheets). The default stylesheet is specified in the style.css file in the "images" directory of your installation, and any change you make there will affect all collections, as well as the home page.

If you want to specify a different style file for a collection, you need to override the "cssfilelink" macro from the style.dm macro file for your collection. There are two ways to do this:

1. Modify (or create) the collection-specific macro file called "extra.dm" in the collection's "macros" directory. (You may need to create this directory.) In this file, put the following lines:

 package Style
 _cssfilelink_ {/url/to/style/file.css}

2. Since Greenstone 2.63, you can set macros on a per-collection basis in the collect.cfg file, with a line such as:

 collectionmacro Style:cssfilelink "/url/to/style/file.css"

For the URL, you can use Greenstone's "_httpimg_" macro to refer to the installation's image directory, or "_httpcollection_" to refer to the collection directory. This means that you can create a directory called "styles" in your collection's directory, put a file in there called "mystyle.css", and set the macro to point to "_httpcollection_/styles/mystyle.css".

If you override the Style:cssfilelink macro, Greenstone will still set the background image for several of the HTML page's elements. To completely override all the default style-related settings, you need to specify your own HTML for the "Style:cssheader" macro. For example, in the collect.cfg file:

 collectionmacro Style:cssheader '
  <link rel="stylesheet" href="_httpcollection_/styles/mystyle.css"
    type="text/css" title="My Style" charset="UTF-8" media="screen">
  <link rel="alternate stylesheet" href="_httpimg_/style.css" type="text/css" 
    title="Default Greenstone Style" charset="UTF-8">
 '

An example collection demonstrating how the appearance of a collection can be modified just by changing the CSS is online at http://puka.cs.waikato.ac.nz/cgi-bin/library?a=p&c=style-e&p=about ; you can copy the stylesheets used in that collection - blue and red.

What are the formatting options available for my collection?

Section 2.3 of the Greenstone Developer's Guide discusses how to format the output of your collection. However, the list of options given there is incomplete. The full list of formatting options is shown here; for more information about how to use these options, the Developer's Guide is still the place to go.

How can I hyperlink individual metadata elements?

[contributed by Axel Schild] When a metadata element has only one value, it is easy to make a hyperlink out of the value. In the format statement, you just put an <a> tag around the metadata item, for example:

<a href="url to link to">[dc.Subject]</a>

When the metadata item has multiple values, and you want to link each one separately, it is a bit more difficult. The following is Axel's solution to his particular problem: display all the Creator elements, each one hyperlinked to a search of that Creator in the Creators index.

Use the format string below in the collect.cfg file (in this case, as part of the "format DocumentText" statement):

{If}{[dc.Creator],
<tr>
<td align=right valign=top><b>Authors:</b></td>
<td align=left valign=bottom><label name=AuthorField id=AuthorField>
_httpquery_;[cgisafe:sibling(All:\\' ; \\'):dc.Creator];[sibling(All:\\'_\\'):dc.Creator]
</label></td>
</tr>}

This statement includes a label definition with the name "AuthorField". "_httpquery_" is a macro which resolves to the HTTP address of the query page of the collection. "[cgisafe:sibling(All:\\' ; \\'):dc.Creator]" displays all Creators, separated by " ; " and with any special characters escaped for use within a web address. "[sibling(All:\\'_\\'):dc.Creator]" produces a similar string without escaping the special characters. Notice the different separator symbols; these are needed later on.

A few more changes are needed to make this work. First, change the _header_ or _textheader_ macro in the package of the page on which the format string is displayed (in this case, the document package): _htmlhead_ must be given a parameter, as in _htmlhead_(onload="ExtractAuthors();"), where ExtractAuthors() is a JavaScript function that is called when the corresponding page (the document display page) loads. Rather than editing the standard macro files, create an extra.dm file (in gsdl/collect/<collname>/macros) and override the chosen macro with a collection-specific version. In this example, this is done by the following code (which also calls ExtractSubjects(), a second function that is not shown here):

package document

###document display

###HTML-Page Header
_textheader_ [c=exacol] {_cgihead_
_htmlhead_(onload="ExtractSubjects();ExtractAuthors();")
<center>
<table width=537><tr><td align=right>
_icontab__javalinks_</td></tr></table>
</center>
}

Now all that is missing is the JavaScript function, which has to be included in the _pagescriptextra_ macro of the same package. Copy this macro from the corresponding standard macro file and paste it into your extra.dm file, then make the necessary modification, which in this case is:

### Self-made Javascript functions
_pagescriptextra_{
function ExtractAuthors() \\{
 var res;
 a = AuthorField.outerText.split(";");
 resolver = a[0]+"&q=";
 b = a[1].split("+%3b+");
 c = a[2].split("_");
 res = "";
 for (i = 0; i < b.length ;i++)
 \\{
 res = res + "<a href=" + resolver + b[i]+ "&h=dd0&t=0>" + c[i] + "</a><br/>";
 \\}
 AuthorField.outerHTML = res;
 \\}
}

This JavaScript function reads the string in the defined label, splits it into several strings, and composes a new string from those values, which is then assigned to the "outerHTML" property of the label. "&h=dd0" indicates which index to search in; dd0 should be replaced with the name of the appropriate index. The file gsdl/collect/<collname>/index/build.cfg gives the names of the various indexes.

[update from Katherine Don] I have just tried this with version 2.72, and I needed to make a few changes. Here are my versions of the format statement and extra.dm file (this worked in Mozilla):

Format statement:

{If}{[dc.Creator],
<tr>
<td align=right valign=top><b>Authors:</b></td>
<td align=left valign=bottom><label name=AuthorField id=AuthorField>
_httpquery_:[cgisafe:sibling(All:\\':\\'):dc.Creator]:[sibling(All:\\'_\\'):dc.Creator]
</label></td>
</tr>}

Extra.dm:

package document

# header overridden for text pages
_textheader_ {_cgihead_
_htmlhead_(onload="ExtractAuthors();")
_startspacer_

<!-- document:textheader -->
<div id="banner">
<div class="pageinfo"><p class="bannerlinks">_globallinks_</p></div>
<div class="collectimage">_imagecollection_</div>
</div>
<div class="bannerextra">_pagebannerextra_</div>
}

### Self-made Javascript functions
_pagescriptextra_{
function ExtractAuthors() \\{
 var res;
 var author = document.getElementById("AuthorField");
 a = author.innerHTML.split(":");
 resolver = a[0]+"&q=";
 b = a[1].split("%3a%5c");
 c = a[2].split("_");
 res = "";
 for (i = 0; i < b.length ;i++)
 \\{
 res = res + "<a href=" + resolver + b[i]+ "&h=dd0&t=0>" + c[i] + "</a> ";
 \\}
 author.innerHTML = res;
\\}
}
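The string handling in ExtractAuthors can also be tried outside the browser. The following is only a sketch: buildAuthorLinks, its option names, and the sample label text are made up for illustration and are not part of Greenstone. It assumes the label holds three fields (query address, escaped names, display names), as produced by the format statements above.

```javascript
// Hypothetical helper, not part of Greenstone: turns the three-part label
// text written by the format statement into one hyperlink per author.
function buildAuthorLinks(labelText, options) {
  const { fieldSep, escapedSep, displaySep, index } = options;

  // Field 0: query address, field 1: escaped names, field 2: display names.
  const fields = labelText.split(fieldSep);
  const resolver = fields[0].trim() + "&q=";
  const escapedNames = fields[1].trim().split(escapedSep);
  const displayNames = fields[2].trim().split(displaySep);

  // Pair each escaped name (used in the URL) with its readable form.
  return escapedNames
    .map((name, i) =>
      '<a href="' + resolver + name + "&h=" + index + '&t=0">' +
      displayNames[i] + "</a>")
    .join(" ");
}

// Sample input is invented; the separators follow the 2.72 version above.
const sampleLabel =
  "/gsdl/cgi-bin/library?a=q:Smith%2c+J%3a%5cJones%2c+A:Smith, J_Jones, A";
console.log(buildAuthorLinks(sampleLabel, {
  fieldSep: ":",
  escapedSep: "%3a%5c",
  displaySep: "_",
  index: "dd0",
}));
```

Unlike the functions above, this sketch quotes the href value and trims surrounding whitespace, which is a design choice rather than something the original requires. Passing ";", "+%3b+" and "_" as separators would correspond to the first version of the format statement.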

How can I hide the dummy text "This document has no text"?

Instead of [Text] in the DocumentText format statement, use

{If}{[Text] ne 'This document has no text. ',[Text]}

If you have installed Greenstone in a different language, then you need to put the correct language string into the If statement. (Since version 2.62.)

How do I suppress the link to the Greenstone text version of a document?

Files such as Word and PDF get converted to HTML during processing, and the original file is stored as an associated file. The default display for search results and browsing lists is two icons: a "text" icon, linking to the Greenstone version of the document, and a PDF or Word icon, linking to the original file.

In format statements,

[link][icon][/link] links to the Greenstone version
[srclink][srcicon][/srclink] links to the original file

The default VList format statement starts off like

<td valign=top>[link][icon][/link]</td>
<td valign=top>[srclink][srcicon][/srclink]</td>

To suppress either of the icons, you can delete the relevant line from the format statement.

If you are suppressing the link to the original, then remove

<td valign=top>[srclink][srcicon][/srclink]</td>

from the format, and you are done.

If you want to suppress the link to the Greenstone version, there are a few complications. Firstly, bookshelf nodes in classifiers must have [link][icon][/link] to display the bookshelf. Secondly, you may want to display different icons for different document types.

Here are a few scenarios, and the appropriate format to replace the two lines specified above. (Wherever PDF is mentioned, you can substitute any document type that gets converted.)

  • Collection with only PDF documents, and no bookshelves in classifiers:

<td valign=top>[srclink][srcicon][/srclink]</td>

  • Collection with only PDF documents, and bookshelves in classifiers:

<td valign=top>{If}{[numleafdocs],[link][icon][/link],[srclink][srcicon][/srclink]}</td>

  • Collection with PDFs for which you want to suppress the text link, other document types for which you want to show the text link, and classifiers with bookshelves. Two alternatives are given; the first one will only work if the other document types don't have srcicon metadata.

<td valign=top>{If}{[srcicon],[srclink][srcicon][/srclink], [link][icon][/link]}</td>

<td valign=top>{If}{[FileFormat] eq "PDF",[srclink][srcicon][/srclink],[link][icon][/link]}</td>

How do I add cover images for my documents?

To add a cover image for a document, you need to create the image and add it to the collection in the same folder as the document. It must be a JPEG file (with file extension .jpg) and have the same name as the document. For example, the cover image for farming.doc must be named farming.jpg. Greenstone automatically looks for jpg files of the same name to assign as cover images. (To disable this, you need to add the -no_cover_image option to the plugins.)
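When preparing many cover images, it can help to compute the file name Greenstone will look for. The following is a minimal sketch of the naming rule described above; coverImageName is a made-up helper, not part of Greenstone, and it assumes that only the last extension of the document's file name is replaced.

```javascript
// Hypothetical helper, not part of Greenstone: returns the cover image file
// name expected for a document, following the rule "same name, .jpg extension".
function coverImageName(documentFileName) {
  const dot = documentFileName.lastIndexOf(".");
  // No dot, or only a leading dot, means there is no extension to replace.
  const stem = dot > 0 ? documentFileName.slice(0, dot) : documentFileName;
  return stem + ".jpg";
}

console.log(coverImageName("farming.doc")); // farming.jpg
```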

Once you have the cover images, setting the format statement DocumentImages to true (or enabled in GLI) will result in them being displayed on the document page.

You can also use them as thumbnails in search results and browsing lists, by replacing [icon] with something like the following in the appropriate format statement:

<img src='[DocImage]' height='40'/>

How do I link to other sections of my document?

The _httpdocument_ macro is used to provide a link to a document. For example, if xxx is a document identifier, then the following would link to it:

<a href="_httpdocument_&d=xxx">...</a>

[DocOID] gives the OID of the current section, while [DocTopOID] gives the top level OID of the document. For documents without internal sections, these two format elements return the same identifier. (Note: [DocTopOID] is available from version 2.72.)

So, _httpdocument_&d=[DocOID] links to the current section of the document, and _httpdocument_&d=[DocTopOID] links to the top section of the document.

Document identifiers can be used in conjunction with some modifiers to get different parts of the document. Modifiers include:

  • .pr parent
  • .rt root (top level) (from 2.72)
  • .ns next sibling
  • .ps previous sibling
  • .fc first child
  • .pc previous child

Here are some examples:

Identifier | Description
HASH0167e3981ac4d7f8353b186b or HASH0167e3981ac4d7f8353b186b.rt | The top level section of the document
HASH0167e3981ac4d7f8353b186b.ns | The next section of the document (used for next arrow)
HASH0167e3981ac4d7f8353b186b.ps | The previous section of the document (used for prev arrow)
HASH0167e3981ac4d7f8353b186b.fc or HASH0167e3981ac4d7f8353b186b.1 | The first subsection of the document
HASH0167e3981ac4d7f8353b186b.1.2 | The second subsection of the first section of the document
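If such links are built in a page script rather than typed by hand, the rule is simply "document address, then &d=, then the identifier with an optional modifier". The following is a rough sketch under that assumption; documentLink and the sample address are invented for illustration, and the modifier list is copied from the list above.

```javascript
// Modifiers as listed in the text above.
const MODIFIERS = ["pr", "rt", "ns", "ps", "fc", "pc"];

// Hypothetical helper, not part of Greenstone. httpDocument stands for the
// expanded value of the _httpdocument_ macro.
function documentLink(httpDocument, oid, modifier) {
  if (modifier !== undefined && !MODIFIERS.includes(modifier)) {
    throw new Error("Unknown modifier: " + modifier);
  }
  const suffix = modifier === undefined ? "" : "." + modifier;
  return httpDocument + "&d=" + oid + suffix;
}

// The address below is a made-up example value.
console.log(
  documentLink("/gsdl/cgi-bin/library?a=d", "HASH0167e3981ac4d7f8353b186b", "ns")
);
```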

How do I link to the next or previous search result in a document page?

Since Greenstone 2.72

Important Note: This option should not be used due to a bug in this release. It will be fixed in the next release.

Enable the links

  • In GLI -> Format Panel -> Format Features -> Choose Feature
    • select DocumentSearchResultLinks
    • click the Add Format button
    • tick the Enabled checkbox

Or

  • In your collection's collect.cfg (in GSDLHOME/collect/your_collection/etc)
    • add
format DocumentSearchResultLinks true

Format the links

  • In document.dm (in GSDLHOME/macros)
    • the next and previous search result links are defined in _nextsearchresult_ and _prevsearchresult_ respectively; you can put them wherever you want in a document page and customize their appearance in style.css (in GSDLHOME/images)
    • the text for those links is defined in _textnextsearchresult_ and _textprevsearchresult_ in english.dm. Add your corresponding language entries in your_language.dm for languages other than English

How do I search multiple collections at the same time?

You can do this using Greenstone's cross-collection search facility. In the GLI, go to Cross-collection search on the format panel, and select the collections you want to search together. These collections should have the same index specifications for this to work properly. On the preferences page for each collection, the user will be given the option to select which collections should be searched. Each search result is formatted according to the collection it comes from. If you want all the search results to use the format statement of the collection the user is based in, then add the following line to the collect.cfg file:

supercollectionoptions uniform_search_results_formatting

How to handle diacritics

MGPP comes with an "accent folding" option. This is implemented similarly to case folding. With case folding, A and a match. With accent folding, à and a match. If an accent folding index is built, then the user will be able to turn it on in preferences, just like case folding and stemming.
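To illustrate what folding means for matching, here is a small sketch. It is not MGPP's implementation; it uses Unicode decomposition to strip combining accent marks, and the function names (foldAccents, termsMatch) are made up for this example.

```javascript
// Remove accents by decomposing characters (NFD) and dropping the
// combining diacritical marks in the range U+0300 to U+036F.
function foldAccents(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Compare two terms with optional case folding and accent folding,
// mirroring the two independent preferences described above.
function termsMatch(a, b, { caseFold = false, accentFold = false } = {}) {
  let x = a;
  let y = b;
  if (accentFold) {
    x = foldAccents(x);
    y = foldAccents(y);
  }
  if (caseFold) {
    x = x.toLowerCase();
    y = y.toLowerCase();
  }
  return x === y;
}

console.log(termsMatch("à", "a", { accentFold: true })); // true
console.log(termsMatch("à", "a"));                       // false
```

The two options are kept independent here, so "À" and "a" match only when both are switched on.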