
 Non-SQL Databases for Linux
 by Christopher Browne, in Category Reviews - Sat, Oct 6th 2001 00:00 UTC

The best-known databases these days are based on SQL, but they are often overkill for what you need to do. This review discusses lighter-weight alternatives, including xBase, DBM, and ISAM systems.


Copyright notice: All reader-contributed material on freshmeat.net is the property and responsibility of its author; for reprint rights, please contact the author directly.

xBase Descendants

Back in the 1970s, a database system called Vulcan was released for the CP/M operating system. Ashton-Tate bought the rights to it and renamed it dBase. It went through a number of iterations and version numbers, and a number of companies produced "clones".

The architecture was interesting; it combines these features:

  • Each table is represented as a file, as is each index.
  • There is a "table browser" that allows viewing and editing a table.
  • There is a "form builder" that allows you to fill in fields based on application control that might involve multiple tables.
  • There is a somewhat BASIC-like language for constructing reports based on querying tables.

Using all of this, you can build quite powerful interactive applications to work with a set of database tables.

In the early days, the tools for building "forms" were somewhat primitive (they got more sophisticated over time). In those days, personal computers were not connected to networks, so programs were inherently single-user-oriented. When PCs started getting networked together, locking schemes were introduced to allow use by multiple concurrent users.

Much later, an ANSI committee, X3J19, created a common "dialect" called xBase, trying to unify the functionality of notable commercial implementations such as dBase, Clipper, and FoxBase. As a result, it is typical to call these sorts of database systems "xBase systems".

Early implementations functioned as file-based systems; that is, programs accessed data through the OS filesystem, in which each table and index is represented as a file. This approach fares poorly on both reliability (because any program on the system that accesses a table file has the ability to "muss it up") and scalability (because every program that accesses the database has to manage locks itself).
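To make the "every program manages locks itself" point concrete, here is a minimal sketch in Python of two cooperating routines sharing a table file through advisory locks. The file name, the "|"-separated record layout, and the function names are made up for illustration; real xBase tools use their own file formats and locking conventions.

```python
import fcntl
import os
import tempfile


def append_record(path, record):
    """Append one line to a shared table file while holding an exclusive lock."""
    with open(path, "a", encoding="utf-8") as table:
        fcntl.flock(table, fcntl.LOCK_EX)      # wait until no one else holds a lock
        try:
            table.write(record + "\n")
            table.flush()
            os.fsync(table.fileno())           # push the data to disk before unlocking
        finally:
            fcntl.flock(table, fcntl.LOCK_UN)


def read_records(path):
    """Read every line of the table file while holding a shared lock."""
    with open(path, "r", encoding="utf-8") as table:
        fcntl.flock(table, fcntl.LOCK_SH)      # many readers may hold this at once
        try:
            return [line.rstrip("\n") for line in table]
        finally:
            fcntl.flock(table, fcntl.LOCK_UN)


if __name__ == "__main__":
    # Hypothetical table file in a scratch directory.
    path = os.path.join(tempfile.mkdtemp(), "customers.tbl")
    append_record(path, "1001|Alice")
    append_record(path, "1002|Bob")
    print(read_records(path))
```

The locks are advisory: a program that opens the file without calling flock() is not stopped from writing to it, which is exactly the reliability weakness described above.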

Some implementations have become available that have a central database manager process, as is common with SQL databases. Furthermore, SQL interpreters have sometimes been added to the set of tools, so "xBase systems" are sometimes really SQL systems.

Various xBase implementations are available for Linux:

  • FlagShip -- essentially a Clipper "clone"; it compiles dBase III+ (and higher) code, and reads and writes related file formats. Free Personal FlagShip is an unlimited 2-user version similar to the commercial Personal license, but intended strictly for personal use or for development of database applications distributed for free (whether Public Domain or Open Source).
  • The Harbour Project -- building a Clipper "clone" for DOS and Unix that is freely redistributable.
  • PlugSys International
  • Recital
  • CodeBase
  • DBF to other formats conversion software
  • X2c -- a portable xBase compiler.
  • Xbase -- a collection of specifications, programs, utilities and a C++ class library for manipulating xBase type datafiles and indices.
  • XBSQL -- a wrapper library providing an SQL-like interface to the Xbase DBMS.

General xBase Documentation

Keyed Table Systems like DBM

At the "lowest level", there are quite a number of data storage systems that don't try to be terribly abstract, or to provide a complete "application environment". The characteristic example is the Unix DBM scheme, which provides a set of C function calls that allow you to store values -- associated with keys -- into a data file.

If your data storage needs are simple, it may not make sense to pull in the full sophistication of an SQL system. Furthermore, the "serious database" systems typically require some administration effort, often including setting up server processes, user authentication, and the like. If you have an application that merely needs to "store some data", a DBM-like system may be all you need.

Some SQL database systems are (or have been) based on these sorts of libraries. For instance, Informix implemented a C-based ISAM library that was often embedded in applications; the Informix SE SQL database system was implemented on top of that, as tables were represented as ISAM tables.

A couple of SQL databases have been built atop DBM. One of the most interesting examples is MySQL, which gained "transactions" when it was attached to Berkeley DB, a modern version of DBM that supports transactions and storage of multiple "tables" within a single data file.

The major families of these databases include:

  • DBM-like databases, which allow storing "associative arrays" on disk. These are usually thought of as involving hash tables, but sometimes use B-Trees.
  • ISAM databases. ISAM stands for "Indexed Sequential Access Method", an indexing system that allows rapidly seeking to appropriate locations in a data file. Since data is stored in sequential order, disk space is generally used quite efficiently.

These systems tend to be highly API-oriented; while an SQL database often provides a lot of generic tools for building queries, and you tend to describe your query, these sorts of databases almost always require writing programs to "walk" through the data.
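The ISAM idea and the "walk through the data" style can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not any vendor's format: the record layout (a 4-byte integer key plus a 20-byte name), the block size, and the function names are all invented here.

```python
import bisect
import os
import struct
import tempfile

RECORD = struct.Struct("<I20s")   # made-up layout: 4-byte key + 20-byte padded name
BLOCK = 4                         # the sparse index remembers every 4th record


def build(path, rows):
    """Write rows in key order; return a sparse index as (keys, offsets)."""
    keys, offsets = [], []
    with open(path, "wb") as f:
        for i, (key, name) in enumerate(sorted(rows)):
            if i % BLOCK == 0:
                keys.append(key)
                offsets.append(f.tell())
            f.write(RECORD.pack(key, name.encode("utf-8")))
    return keys, offsets


def walk(path, index, start_key):
    """Seek close to start_key via the index, then read records sequentially."""
    keys, offsets = index
    if not keys:
        return
    slot = bisect.bisect_right(keys, start_key) - 1
    with open(path, "rb") as f:
        f.seek(offsets[max(slot, 0)])
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                return
            key, raw = RECORD.unpack(chunk)
            if key >= start_key:
                yield key, raw.rstrip(b"\0").decode("utf-8")


def lookup(path, index, key):
    """Return the name stored under key, or None if the key is absent."""
    for found, name in walk(path, index, key):
        return name if found == key else None
    return None


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "people.dat")
    index = build(path, [(30, "carol"), (10, "alice"), (20, "bob"),
                         (50, "eve"), (40, "dave")])
    print(lookup(path, index, 20))
    print(list(walk(path, index, 35)))
```

Where an SQL user would write something like "WHERE key >= 35" and let the database plan the access, here the program itself seeks through the index and steps through the records.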

DBM-Like Databases

Linux systems almost always include some subset of NDBM (the "New" DBM implementation), SDBM, ODBM, and GDBM (the "GNU" implementation).

The typical API looks like:

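DBM *dbm_open(char *, int, int);          /* open a database: file name, open flags, file mode */
void dbm_close(DBM *);                    /* close the database */
datum dbm_fetch(DBM *, datum);            /* look up the value stored under a key */
datum dbm_firstkey(DBM *);                /* start iterating over all keys */
datum dbm_nextkey(DBM *);                 /* continue the iteration */
int dbm_delete(DBM *, datum);             /* remove a key and its value */
int dbm_store(DBM *, datum, datum, int);  /* store key, value; last argument is the store mode */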

The Perl language popularized the idea of tying DBM tables to Perl associative arrays, with the result that in Perl, once you tie a name to a DBM file, you can transparently use ordinary assignments like $A{"this"} = "that", rather than something like dbm_store(A, "this", "that", 4).
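Python exposes the same idea through its standard dbm package, where an open database behaves much like a dictionary. The sketch below uses dbm.dumb, the pure-Python fallback that ships with the standard library, rather than any of the C libraries named above; the file name "prefs" and the keys are made up, and the comments mapping operations to C calls are only a rough analogy.

```python
import dbm.dumb
import os
import tempfile

# Hypothetical database file in a scratch directory.
path = os.path.join(tempfile.mkdtemp(), "prefs")

with dbm.dumb.open(path, "c") as db:    # "c": create the database if it is missing
    db["this"] = "that"                 # roughly dbm_store
    db["editor"] = "vi"
    del db["editor"]                    # roughly dbm_delete

with dbm.dumb.open(path, "r") as db:    # reopen read-only
    value = db["this"]                  # roughly dbm_fetch; values come back as bytes
    keys = sorted(db.keys())            # roughly dbm_firstkey/dbm_nextkey

print(value, keys)
```

As with a tied Perl hash, the program never mentions datum structures or store flags; the mapping interface hides them.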

The Perl AnyDBM_File documentation page describes some of the similarities and differences between different DBM implementations.

  • The Berkeley DB Package, probably the most sophisticated such system, offers the ability for multiple hosts to access a database, multiple storage schemes (e.g., hash tables, B-Trees), distributed locking, and other pretty neat stuff.
  • cdb, the "constant" database, is quite interesting; it does not cope well with updates, but provides extremely fast access to static data. The canonical use of it is as a way of storing mail routing information for the qmail mail server.
  • rdbm (a reliable database) layers a DBM-like interface on top of cdb.
  • bun (bundle many files together), based on the cdb format, provides something like tar, implemented atop a (tiny) database system.
  • Dx

ISAM Databases

These systems are often embedded inside applications.

Notice that many of the vendors of ISAM-like systems also sell SQL databases; once you've got the low level library to store and retrieve data, it is pretty natural to build further layers of abstraction on top of that, such as SQL interpreters. That brings us full circle back to SQL, where we started.


Author's bio:

During his University years, Christopher Browne was employed by three public accounting firms as a student in accounts, preparing many sets of tax returns and financial statements. As a result, he decided he definitely didn't want to be an accountant or an auditor, but he's used his knowledge to pursue a career in programming financial systems. He has done more writing in recent years, publishing several articles and co-authoring the book Professional Linux Programming.



 Referenced categories

Topic :: Database
Topic :: Database :: Database Engines/Servers
Topic :: Database :: Front-Ends

 Referenced projects

Berkeley DB - Provides embedded database support for traditional and client/server application
c-tree Plus - A high-performance data management system.
cdb - A package for creating and reading constant databases.
CodeBase Database Programming Tools for Developers - A high speed xBASE-compatible database engine.
FlagShip - A database system for moving xBase-based languages to Unix.
Recital Terminal Developer - An RDBMS with complete development tools and 4GL.
The Harbour Project - An open source, cross platform xbase compiler
X2c - A tool to compile XBase programs into compiled C programs.
Xbase - An xBase-compatible C++ class library.
XBSQL - A simple SQL wrapper for Xbase.

 Comments

[»] hamsterdb
by cruppstahl - Jul 18th 2007 05:31:49

hamsterdb's first submission was in September 2006. It's a DBM-like library written in ANSI C, concentrating on high performance. It can run as an in-memory database, use memory-mapped I/O, supports database cursors and variable-length keys and records, and can handle multiple databases per file.

http://freshmeat.net/projects/hamsterdb/



[»] Heard about tdbengine yet?
by Thomas F - Mar 1st 2004 13:12:58

Another possible non-SQL rdbms for Linux (and Windows) can be found here:

http://www.tdbengine.org
It's a freeware/open source rdbms which is very compact, has a big feature list, and of course is not queried using SQL. It has its own script language.
Perhaps you will like it.



[»] MVDB - Multi-Value Systems
by Michael T. Babcock - Nov 23rd 2001 11:50:07

You might also want to check out how multi-value systems work. See http://www.rainingdata.com/ for the Pick database system or http://www.jbase.com/ for another implementation that runs well under Linux.

See also http://hometown.aol.com/mbtpublish/index.html for terms definitions w.r.t. MVDBs and http://linas.org/linux/db.html for a better list of database systems for Linux.

These all include a Basic-like language that allows you to write quick and/or complex programs that directly access the data structures in the database system. Database files are all files within a somewhat hierarchical structure.



    [»] Re: MVDB - Multi-Value Systems
    by Pete Jewell - Mar 8th 2002 18:13:55

    There is also MaVerick, an Open Source implementation of a multivalue database system that is under development.



[»] Object Databases
by quiper - Nov 22nd 2001 05:09:45

Ever heard of object databases?

These are able to model the world much better than
"classical" databases. We are now trying some
experiments on top of Quiper DB (Quiper DB is a
relational/hierarchical data store; see quiper
project in FM or quiper.zapwerk.com) for binding
it to Java and storing Java objects directly in the DB
(including references between objects). A side
effect is that no structure info is necessary
and one can later change Java classes to contain
more fields without ever thinking about
restructuring the database.

However, this poses some additional problems, like
garbage collection of unused objects and the like.



[»] Big category of non-SQL databases missing
by White Owl - Nov 16th 2001 13:20:44

I mean MUMPS language/DBMS. For example: http://sanchez-gtm.sourceforge.net.



[»] CDS/ISIS database
by Horacio Degiorgi - Oct 19th 2001 03:46:39

one more,
Isis is a simple, yet powerful database system with a large installed base since the 80s. Since it's well suited for bibliographic data, it's commonly used in libraries, and since it's very low cost, especially in those running on a low budget.
Introduction to the isis db: An isis DB is a list of rows of unspecified structure, each identified by a unique number, the rowid (a.k.a. mfn).
Each row is a list of fields, and each field has a number (tag) and a string value. Within a row there may be zero, one or more fields with a given tag. While the field's value usually is a textual representation of data in one or the other character encoding (commonly one of the IBM/DOS code pages), it may actually contain arbitrary bytes. (Extract from http://www.openisis.org )



[»] Clip
by Uri Hnykin - Oct 7th 2001 08:03:20

There is one more Clipper-compatible compiler:
http://www.english.itk.ru/clip/index.html

--
clipper compatible compiler



[»] What about MetaKit?
by Robert Minichino - Oct 6th 2001 17:19:42

MetaKit is an embeddable high-power database
engine, with lots of language bindings. You can see
it at http://www.equi4.com/metakit/



[»] Another missing category
by Hubert Tonneau - Oct 6th 2001 12:17:26

Pliant is implementing a completely different database engine.
The general idea is to store the database as a tree, which is much more flexible than the 'tables' model and maps straightforwardly onto the URL notion. With the Pliant database engine, a database is stored in a single file even if it's a complex database.
The second idea is that the database is handled in main memory, and the disk file format is HTML-like ASCII, with each modification appended immediately at the end of the file so that killing the database server process won't hurt.

See Pliant project at http://pliant.cx/ for extra details.



    [»] Re: Another missing category
    by Patrice Ossona de Mendez - Jan 22nd 2002 10:36:16

    Notice also that Pliant project may be found while browsing the category Database :: Database Engines/Servers.

    --
    Pom



[»] Another one
by j - Oct 6th 2001 07:40:33

Another one for indexing constant data, similar to CDB, is PureDB.
It's used to index virtual users in Pure-FTPd, but it's also distributed as an independent set of libraries.
The API is simple and the reading library is extremely small.

--
Theoretically it is possible that all quantums in my body decide to tunnel through space-time at exactly the same time and with exactly the same direction and speed vector for exactly the same duration of time and that I am suddenly effectively teleported to the surface of the moon.



    [»] DynDB
    by j - Oct 6th 2001 07:47:23

    I'm awfully sorry for the double post ("reload" did the trick).
    An interesting alternative to ndbm/gdbm/sdbm is DynDB.
    DynDB works very well for non-constant data, because it supports concurrent writes (something that *dbm libraries don't support).
    It merges the efficiency of cdb with the write abilities of *dbm.
    Have a look, it's really a promising project.






© Copyright 2008 SourceForge, Inc., All Rights Reserved.