Zebra - User's Guide and Reference

Features

This is an overview of some of Zebra's most important features:

Very large databases: files for indexes, etc. can be automatically partitioned over multiple disks.
Arbitrarily complex records. The internal data format is a structured format conceptually similar to XML or GRS-1, which allows lists, nested structured data elements and variant forms of data.
Robust updating - records can be added and deleted "on the fly" without rebuilding the index from scratch. Records can be safely updated even while users are accessing the server. The update procedure is tolerant of crashes and hard interrupts during database updating - data can be reconstructed following a crash.
Configurable to understand many input formats. A system of input filters driven by regular expressions allows most ASCII-based data formats to be easily processed. SGML, XML, ISO2709 (MARC), and raw text are also supported.
Searching combines boolean queries with relevance-ranked (free-text) queries. Truncation, masking, full regular expression matching and "approximate matching" (e.g., tolerating spelling mistakes) are all handled.
Index-only databases: data can be, and usually is, imported into Zebra's own storage, but Zebra can also refer to external files, building and maintaining indexes of "live" collections.
Zebra is written in portable C, so it runs on most Unix-like systems as well as Windows NT. A binary distribution for Windows NT is available at http://ftp.indexdata.dk/pub/zebra/win32/, and pre-built packages are available for some Linux distributions: Red Hat 7.x RPMs at http://ftp.indexdata.dk/pub/zebra/RedHat7.X/ and Debian packages at http://ftp.indexdata.dk/pub/zebra/debian/
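
As an illustration of the boolean and truncation features listed above, the sketch below builds query strings in Prefix Query Format (PQF), the notation used by the YAZ toolkit. It is a minimal sketch under assumptions: the list above does not name PQF or the Bib-1 attribute set, and the helper names term and boolean are hypothetical, not part of Zebra. The attribute numbers used (use attribute 1=4 for title, 1=1016 for "any", truncation attribute 5=1 for right truncation) come from Bib-1.

```python
def term(text, use=None, truncation=None):
    """Render one PQF operand (hypothetical helper, not a Zebra API).

    use        -> Bib-1 attribute type 1 (e.g. 4 = title, 1016 = any)
    truncation -> Bib-1 attribute type 5 (e.g. 1 = right truncation)
    """
    parts = []
    if use is not None:
        parts.append(f"@attr 1={use}")
    if truncation is not None:
        parts.append(f"@attr 5={truncation}")
    # Quote terms containing a space so they remain a single operand.
    parts.append(f'"{text}"' if " " in text else text)
    return " ".join(parts)


def boolean(op, left, right):
    """Combine two PQF operands with a prefix boolean operator."""
    if op not in ("and", "or", "not"):
        raise ValueError(f"unsupported operator: {op}")
    return f"@{op} {left} {right}"


# Title contains "zebra" AND any field starts with "index" (right truncation).
query = boolean("and", term("zebra", use=4), term("index", use=1016, truncation=1))
print(query)
```

A string built this way could be passed to any client that accepts PQF; whether a given attribute combination is honored depends on how the target database is configured.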

Z39.50 protocol support:

Protocol facilities: Init, Search, Present (retrieval), Segmentation (support for very large records), Delete, Scan (index browsing), Sort, Close and support for the "update" Extended Service to add a new XML record or replace an existing one.
Piggy-backed presents are honored in the search request - that is, a subset of the found records can be returned directly with a search response, enabling search and retrieval to happen in a single round-trip.
Named result sets are supported.
Easily configured to support different application profiles, with tables for attribute sets, tag sets, and abstract syntaxes. Additional tables control facilities such as element mappings to different schemas (e.g., GILS-to-USMARC).
Complex composition specifications using Espec-1 (partial support). Element sets are defined using the Espec-1 capability, and are specified in configuration files as simple element requests (and, optionally, variant requests).
Multiple record syntaxes for data retrieval: GRS-1, SUTRS, XML, ISO2709 (MARC), etc. Records can be mapped between record syntaxes and schemas on the fly.
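
The piggy-backed present mentioned above can be sketched as a small decision rule. In Z39.50, a Search request carries three parameters (smallSetUpperBound, largeSetLowerBound and mediumSetPresentNumber) that tell the server how many records to return along with the search response. The function below is a toy model of that rule as the standard describes it, not Zebra's actual implementation; the function name piggyback_count is hypothetical.

```python
def piggyback_count(hits, small_set_upper_bound, large_set_lower_bound,
                    medium_set_present_number):
    """Return how many records would accompany a search response.

    Toy model of the Z39.50 small/medium/large set rule; assumes
    large_set_lower_bound > small_set_upper_bound, as the protocol requires.
    """
    if hits <= small_set_upper_bound:
        # Small set: every hit is returned with the search response.
        return hits
    if hits >= large_set_lower_bound:
        # Large set: no records are piggy-backed; the client must Present.
        return 0
    # Medium set: return at most medium_set_present_number records.
    return min(hits, medium_set_present_number)


# A client asking for up to 10 records when the result is "medium" sized.
for hits in (3, 40, 5000):
    print(hits, piggyback_count(hits, 5, 1000, 10))
```

With a small result set, search and retrieval complete in one round-trip; for larger sets the client follows up with explicit Present requests against the result set.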