
Mini SQL
2.0 General Info
During mSQL's life, people have used it for applications far beyond the
scope of the original design. These high-end applications, containing
up to 1 million rows of data, showed a need for better handling of
complex queries and large data sets if the package was to be used in
this way. The second generation of the mSQL server has been designed to
suit these high-end applications while maintaining the original design
goals of mSQL 1. It has been designed to meet three main criteria
mSQL 2.0 provides much more sophisticated indexing support. Each table
can have multiple indicies defined for it's data, with each index
consisting of one to ten fields. Any index can be designated as a
unique or non-unique index. The index information is stored in a series
of B-Tree structures that are mapped into the virtual memory address
space of the mSQL server process. The use of B-Trees in this way
ensures that access to key data is extremely fast.
To aid in the use of the indicies during query execution, a layer of
abstraction know as the "candidate rows" system has been introduced
into the server. The concept of the candidate rows abstraction is that during
query processing, the module performing the query requests the next
row from the data table that is a candidate for the selection criteria
specified in the query. The requesting module is not aware of the
mechanisms used to determine how that row was chosen or accessed. The
"candidate row" routines are responsible for determining the best access
method (based on the conditions specified in the where clause) and for
retrieving the data rows as they are requested. This ensures that the
optimum access method is used whenever a row of data is accessed without
replicating the access logic in each module and without any "special
case" algorithms. The candidate row abstraction also provides the initial
framework for query optimisation and more sophisticated query execution
planning.
To overcome this problem, mSQL 2.0 includes support for a variable
length char type (varchar). The varchar type allows an unrestricted
amount of data to be inserted into a field by using an overflow buffer
scheme to
hold data beyond the specified size of the field. This provides the
best of both worlds in that the database designer can specify the
average size of a char field ensuring that in most cases, the data will
be held in the data table. If a value is inserted that is longer than
average, it will be split between that data table and the overflow buffers.
This eliminates the need to specify overly large fields (e.g. 255
character) for storage of URLs and filenames.
To provide a more complete SQL environment future snapshot releases of
mSQL will include more of the "standard" data types defined by the SQL
standard. These will include date/time, currency, and various other
types that are provided by the larger database systems.
Support for multiple back-end processes implies that locking will be
added to mSQL 2.0. Initially, the granularity of the locks will be at
the table level with support for shared read locks and exclusive write
locks. Support for locking is the first step towards the support of
transactions. Development of transaction management is planned for mSQL
2.0 and will appear in either a future snapshot release of the initial
full release of the package.
To fill another problem associated with delivering "real" applications
over the web, W3-mSQL provides an enhanced and flexible authentication
system. Any page that is accessed via W3-mSQL is subjected to the new
W3-auth access scrutiny. Access can be restricted via a combination of
username/passwd and requesting host. Configuration of the security
system, including management of user groups, definition of secure areas,
and creation of authorised users, is via a graphical interface accessed
via a web page.
Access to mSQL from scripting languages has become popular and virtually
all major scripting languages provide an interface to the original mSQL
server. Support for script based access to mSQL becomes standard in
mSQL 2.0 with the inclusion of it's own scripting language. The
language, called Lite, is a stand-alone implementation of the language
provided by W3-mSQL and includes access to the mSQL API and the other
function mentioned above. Lite, as it's name implies, is a light weight
language yet provides a powerful and flexible programming environment.
The syntax of the language will be very familiar to C programmers (and
ESL programmers) and provides shell like ease of use. A future release
of Lite will include support for ASCII forms to provide a rapid
development environment for non graphical mSQL based applications.
Introduction
Mini SQL 2.0 is the second generation of the mSQL database system. The
first generation product, mSQL 1.0, was designed to provide high speed
access to small data sets. The
original goal was to perform 100 basic operations per second on an
average UNIX workstation with small data sets using very few system
resources (i.e. memory and CPU cycles). The original design goal was met
and the software has proven to be popular because of this functionality.
Enhanced Indexing
One of the major flaws of mSQL 1.0 when used for larger applications was
the simplistic indexing support. The original server supported only a
single primary key per table and the key could consist of only one field.
The internal use of the key was restricted to queries using a simple
equality condition. As such, the vast majority of queries were processed
without the aid of the key.
Data Types
Another of the limiting factors of the performance of mSQL 1.0 was the
size to which tables grew. Given an increasing number of rows, the
amount of data that needed to be manipulated in memory increased
proportionally. Unfortunately, the fixed length field structure of mSQL
1.0 usually forced a lot of white space and field padding to be included
in the data.
Server Architecture
Athough not available in the snapshot 1 release of 2.0, the server has
been redesigned to execute multiple queries at the same time. This is
achieved by starting multiple back-end query processors communicating
with the client applications via a single, shared, master process. The
master process is responsible for accepting client connections,
allocating work to the various back-end processes, and performing admin
tasks. This not only allows multiple queries to be performed in
parallel, it also increases the number of client connections that can be
handled. It is envisaged that this structure will be able to support
over 100 simultaneous client connections.
Tools
mSQL 2.0 is bundled with a couple of new tools to aid in the development
of applications. W3-mSQL 2.0, the second generation WWW interface
package is included as a standard tool. The new W3-mSQL code provides a
complete scripting language, with full access to the mSQL API, within an
HTML tag. This tool can be used to develop sophisticated GUI based
applications that are platform independant and available as shared
resources on a network. Along with the mSQL API, a library of nearly 60 other
functions, including file I/O, strings handling and date/time manipulated
are available to the scripts within a W3-mSQL enhanced web page.