www.fgks.org   »   [go: up one dir, main page]

 TOC 
HTSQL ManualC. Evans
  Simonov
 Prometheus Research, LLC.
 August 2006


HyperText Structured Query Language

Abstract

HyperText Structured Query Language (HTSQL) is an extension to the HTTP/1.1 protocol that allows clients to remotely access a standard SQL database. This extension provides a mapping of HTTP requests into SQL statements, producing a response that corresponds to the result generated by the database. This document defines a URI scheme used for this mapping, together with a coherent set of HTTP methods, headers, and entity body formats.



Table of Contents

1.  Introduction
    1.1.  Audience
    1.2.  Objective
2.  Preview
    2.1.  Predicate Expressions
    2.2.  Selectors and Functions
    2.3.  Specifiers
    2.4.  Locators and locations
    2.5.  Aggregates
    2.6.  Commands
    2.7.  Path Contexts
    2.8.  Transactions and Locking
    2.9.  Resources and Formats
    2.10.  HTML FORM Compatibility
3.  References
    3.1.  Normative References
    3.2.  Informative References
Appendix A.  Collected ABNF for URI
Appendix B.  Sample Database Schema
§  Index
§  Authors' Addresses




 TOC 

1.  Introduction

HyperText Structured Query Language (HTSQL) is a mechanism for accessing industry standard SQL [ISO9075‑1992] (International Standards Organization, “Database Language SQL, 1992,” July 1992.) data sources over HTTP [RFC2616] (Felding, R., Gettys, J., Mongul, J., Frystyk, H., Mastiner, L., Leach, P., and T. Berners-Lee, “Hypertext Transfer Protocol -- HTTP/1.1,” June 1999.). This document specifies a URI scheme [RFC3986] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifiers (URI): Generic Syntax,” January 2005.), HTTP methods and extensions which enable database access from standard web browsers. The principal advantage of HTSQL is the expression of queries in a concise web-friendly syntax: for for common database tasks, path-based URIs are both simple and intuitive. A secondary advantage of HTSQL is the integrated use of the HTTP protocol to provide authentication, data caching, encryption, content negotiation, and numerous other network operations. This approach to SQL-over-HTTP puts forth a reusable, application independent, and testable middleware layer which translates HTTP requests into SQL statements, returning the execution results to the user in a format their user-agent can handle.

  /----------------\   HTTP Request     /-------------------------\
  | Web Browser    |  ----------------> --.                       |
  |  * Direct URLs |    URI, headers,   |  \      .--> Generated  |
  |  * HTML / XSLT |    post/put body   |   .    /     SQL Query  |
  |  * Javascript  |                    |   HTSQL         |       |
  | GUI Front End  |   HTTP Response    |   /  ^          v       |
  |  * Java / C#   |  <---------------  <--.    \       DATABASE  |
  |  * Python, etc |    status code,    |    Query        |       |
  | Excel /w Macro |    headers, and    |    Results <..../       |
  | Command Line   |    formatted       |                         |
  \----------------/    results         \-------------------------/

An HTSQL processor accepts an HTTP Request, converts this request into an SQL query, executes this query against a relational database, and returns the result in a file format requested by the user agent.



 TOC 

1.1.  Audience

The target audience for HyperText Structured Query Language is not career programmers, nor is it casual users. HTSQL is designed for technical users including screen designers, database administrators, statisticians, medical researchers, and other "accidental programmers". HTSQL advances a human-friendly URI-based query syntax over traditional SQL queries, and HTTP over a more typical database access protocol. HTSQL puts a relational database "on the web" where it can be directly accessed by these expert users without wading through an application's chrome. HTSQL offers the following:

URIs are instantly familiar to users who have been using the web for many years; human-readable URIs provide direct control over database information which is often lacking in traditional systems.

A database accessable via a web browser with persistent URIs allows query results to be bookmarked and emailed to collaborators; such an interface also enables easy navigation using web browser controls.

HTSQL offers greater flexibility than a purely graphical user interface, which necessarly limits the kinds of retrievals that can be specified. With even moderate exposure, an advanced user can learn to modify a URI to achieve results beyond what a graphical interface may provide.

HTSQL builds upon existing standards. By using standard HTTP, the database access protocol need not be burdened with authentication, encryption, signatures, cache control, content types, detailed audit-trails, or other network-related issues.

HTSQL provides an additional layer for security policy enforcement. Since most database interactions can be expressed with an HTSQL request, direct access to the database is not required. HTSQL as middleware can be used to restrict access and monitor activity. HTSQL also provides a row-level permission mechanism.

Implicit in this approach is a compromise. While the majority of common database tasks can be expressed in human-readable URI format, a few will necessarly be elusive. In those uncommon cases, client-side processing or a server-side view/procedure is an alternative. While HTSQL is targeted for occassional programmers, it is also allows career programmers to quickly develop and deploy loosely coupled applications.



 TOC 

1.2.  Objective

While there are many approches for database access over the web, few of them directly support relational databases. Of those, most implementations are application specific, fail to achieve even basic coverage of SQL constructs, or do not take URIs and HTTP protocols into consideration. HTSQL aims to be generic, documented, and supportable. By providing a hybrid that is both application independent and human readable, HTSQL realizes the potential synergy of HTTP and SQL, putting the database itself "on the web".

HTSQL should support most SQL-92 operations, although full coverage of SQL functionality is explicitly not a goal. HTSQL will provide an extension mechanism for functions and commands as well as working well with server-side triggers, stored procedures and views.

HTSQL must support modern relational databases such as PostgreSQL, MySQL, SQLite, and even commercial offerings such as Oracle, Microsoft SQL Server, and IBM's DB2. It is not a goal to provide support for every feature of those databases: back-end portability across databases is desired.

For all but the most complicated database interactions, the corresponding URIs in HTSQL must be easy to read and understand. Since HTSQL is meant to be used by casual programmers, error messages must be informative and layered so that casual users are not frightened and expert users are given the details they need.

HTSQL must use HTTP/1.1 features for well-known operations such as authentication, caching, range requests and content negotiation; HTSQL should extend or augment existing HTTP/1.1 methods only when necessary.

HTSQL must allow for fine-grained access permissions as allowed by SQL-92, mapping application users onto specific database accounts and using database roles. HTSQL must permit further restrictions beyond those permitted by SQL and should work well with server-side trigger based checks.

HTSQL requests must be easily expressable from a web browser via an XSLT stylesheet or a Javascript expression. Further, client access libraries should be made available to make use of HTSQL easy within common programming languages such as C#, Python, Ruby, Perl and Java.

HTSQL must support the standard [HTML] (, “,” 2004.) FORM element for common database operations using HTTP/1.0, requiring full HTTP/1.1 only when necessary.

HTSQL should support a wide variety of query result formats, including JavaScript Object Notation [JSON] (, “,” 2004.) and the eXtensible Markup Language [XML] (, “,” 2004.) for standard Javascript and DOM enabled web browsers, as well as Comma Separated Variable [CSV] (, “,” 2004.) for spreadsheets and data analysis tools.

HTSQL should minimize configuration using catalog detail available in the database's information schema; basic functionality should not require supplementary information. Ideally, HTSQL should work out-of-the-box with existing database system deployments.

HTSQL should minimize server-side state; following as much as possible the principles of representational state transfer, [REST] (Fielding, R., “Architectural Styles and the Design of Network-based Software Architectures.,” 2000.). To allow usage of HTSQL from a browser's location bar, database updates may be submitted with the GET method; however, this relaxation can be disabled.

The HTSQL specification should permit usage of SQL-99 [ISO9075‑1999] (International Standards Organization, “Database Language SQL, 1999,” September 1999.) and SQL-2003 [ISO9075‑2003] (International Standards Organization, “Database Language SQL, 2003,” August 2003.) constructs when possible, but should not require these features for operation. HTSQL must support [UNICODE] (, “,” 2004.) and use [ISO8601] (, “,” 2004.) style dates.



 TOC 

2.  Preview

As a prelude to the formal specification, we provide a taste of HTSQL by presenting a limited set of URIs, associating each URI with an equivalent SQL expression. We will use, in sequence, three schemata that build upon each other to form a comprehensive task management system. The first example schema, op, deals with organizations, projects, and people.

  +-------------------+              +-------------------+
  | OP.PROJECT        |              | OP.ORGANIZATION   |
  +-------------------+              +-------------------+
  | prj_id         PK |--\       /---| org_id         PK |---\ ---\
  | name        NN,UK |  |       |   | name           NN |   |    |
  | status      NN,CK |  |       |   | is_active         |   |    |
  | client         FK |>---------/   | division_of    FK |>--/    |
  | start_date        |  |    .      +-------------------+  .     |
  | description       |  |     .                           .      |
  +-------------------+  |   project is        an organization    |
                        .|   related to        may be a division  |
    a project has zero . |   at most one       of a larger        |
    or more people who   |   organization      organization       |
    participate in it    |                                        |
                         |     +-------------------+              |
  +-------------------+  |     | OP.PERSON         |              |
  | OP.PARTICIPATION  |  |     +-------------------+              |
  +-------------------+  |     | org_id     FK,PK1 |>-------------/
  | project    FK,PK1 |>-/     | nickname   NN,PK2 |    .
  | _ppl_seq   FK,PK2 |>-------| _ppl_seq    NN,UK |   .
  | billing_rate      |   .    | full_name      NN |  each person
  +-------------------+  .     | email          UK |  is part of
                        .      +-------------------+  exactly one
    a person participates in                          organization
    zero or more projects

  PK - Primary Key      FK - Foreign Key          NN - Not NULL
  UK - Unique Key       CK - Check Constraint

This op schema supports several topological structures. We have trunk tables, organization and project. The organization table demonstrates a self-reference recursion and acts acts as a category to the project table. The person table is a branch of organization, often representing a "part-of" relationship. The participation table is a cross-product of project by person.

In this diagram, each table's primary key is marked with PK and any unique key is marked with UK. These candidate keys are a combination of columns which uniquely identify each row in the given table. Relationships between tables, drawn with a line, represent foreign key constraints. A foreign key defines a correspondence between rows in the source table (marked with FK) and rows in the referenced table. Mandatory columns are indicated NN; primary key columns are mandatory as well.



 TOC 

2.1.  Predicate Expressions

An HTSQL request URI typically starts with a single table (optionally prefixed by a schema). Following a table expression is an optional filter expression, denoted with a question mark, to limit the rows returned.


/organization

This request selects all rows from the organization table. By default, rows are ordered by the primary key, in this case, org_id. If the table name is not unique within the database catalog, the schema specifier can be included; since the organization table is within the op schema, this request could also be written as /op:organization.

      SELECT * FROM op.organization
      ORDER BY org_id;

      op:organization
      ---------------------------------------------------------------
      org_id      | name                    | is_active | division_of
      ------------+-------------------------+-----------+------------
      acorn       | Acorn Architecture      | true      |
      lake-apts   | Lake Shore Apartments   | true      | lakeside
      lake-carmen | Lake Carmen Towers      | true      | lakeside
      lakeside    | Lake Side Partners, LLC |           |
      meyers      | Meyers Construction     | true      |
      meyers_elec | Meyers Electric         | false     | meyers
      smith       | Rudgen, Taupe, & Smith  | false     |

To provide a more traditional expression grammar, HTSQL's URI scheme diverges significantly from the common quoted-printable a=foo&b=bar URI scheme. Although a conjunction of key/value pairs is adequate for many uses, it is too limited for general query representation. Hence, HTSQL uses quoted literals instead of relying upon position and does not assume a top-level conjunction.


/person?nickname='jack'

This request returns rows from person with a nickname of 'jack'. Following SQL conventions, string literals such as 'meyers' are single-quoted to distinguish them from column references, such as nickname. Like SQL, if a single-quote occurs in a value, it is escaped by doubling up, e.g. 'O''Mally'.

      SELECT * FROM op.person
      WHERE nickname = 'jack'
      ORDER BY org_id,  nickname

      op:person
      ------------------------------------------------------------
      org_id | nickname | full_name      | email
      -------+----------+----------------+------------------------
      meyers | jack     | Jack C. Meyers | jack.meyers@example.com
      smith  | jack     | Jack Taupe     | jack.taupe@example.com

Within a single-quoted literal value, non-printable characters may be represented using [RFC3986] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifiers (URI): Generic Syntax,” January 2005.) precent-encoding. As such, the percent sign (%) must be written as %25. To be resilient against arbitrary percent-encoding done by a user-agent, even if encoded, a single-quote occurring in a literal value must still be escaped by doubling -- simply percent encoding is not sufficient. Hence, 'We''re already running at 110%25 capitan!'


/project?status!='abandoned'&name~'roof'

In HTSQL, the conjunction (boolean AND) is indicated with the ampersand (&). This particular request returns returns rows from project where the status does not equal 'abandoned' and where the project's name matches a case-insensitive [POSIX_1003.2] (, “,” 2004.) extended regular expression 'roof'. Case sensitive regular expressions are indicated by doubling-up the tilde (~~).

      SELECT * FROM op.project
      WHERE status != 'abandoned'
        AND LOWER(name) LIKE '%roof%'
      ORDER BY prj_id;

      project
      -------------------------------------------------------+ ...
      prj_id | name                            | status      |
      -------+---------------------------------+-------------+ ...
      la-334 | Siding / roof at 334 Ocean Blvd | completed   |
      lt-802 | Toaster Re-Do and Roof Leak     | in-progress |

Since comparison using SQL92's LIKE or SQL2003's SIMILAR-TO is easily represented as a regular expression, HTSQL makes no provision to support these operators. For databases that do not natively support regular expressions, common cases can be translated into a corresponding SQL construct as shown in the example above.



/project?status='planned'|status='in-progress'

Alternation (boolean OR) is indicated with the vertical bar (|). When typing this URL into a standard browser, the vertical bar may be converted into its percent-encoded equivalent, (in this case %7C). Although this may hinder readability, it does not impact the interpretation of the URL. More generally, any character in HTSQL may be percent encoded without changing its meaning.

      SELECT * FROM op.project
       WHERE status = 'planned'
          OR status = 'in-progress'
      ORDER BY proj_id

      project
      --------------------------------------------------+ ...
      prj_id   | status      | client      | start_date |
      ---------+-------------+-------------+------------+ ...
      lt-711   | planned     | lake-carmen |            |
      lt-802   | in-progress | lake-carmen | 2006-11-23 |
      overhead | in-progress |             | 2003-06-12 |
      ...

The any() macro provides a syntax shorthand for applying a comparison operator over a list of values. The example above could be equivalently written: /project?status=any('planned','in-progress'),



/organization?is_active&division_of!=='meyers'

This particular request returns organization records that are active and are not a division of 'meyers'. In HTSQL, boolean valued expressions don't need a corresponding comparison operator, hence is_active can be used directly. Note this result set includes rows where division_of is null.

      SELECT * FROM op.organization
      WHERE is_active
        AND division_of IS DISTINCT FROM 'meyers'
      ORDER BY org_id;

      organization
      ------------------------------------------------------------
      org_id      | name                | is_active | division_of
      ------------+---------------------+-----------+-------------
      acorn       | Acorn Architecture  | true      |
      lake-carmen | Lake Carmen Towers  | true      | lakeside
      meyers      | Meyers Construction | true      |
      shoe        | Rwyler's Shoes      | true      |

Observe that the distinct (!==) and not-distinct (==) operator differ from equality in that they treat NULL as a comparable value. If the example above would have used not-equal, division_of!='meyers', then only one row, with lakeside would have been returned.



/project?client->!start_date

HTSQL uses implicit conversion when a non-boolean expression is used in a predicate without a comparison operator. This particular example returns project rows where the existence of a client implies that the project has not yet started. This operator makes it easier to express consequential logical operations in a clear manner.

      SELECT * FROM op.project
       WHERE NOT (client IS NOT NULL AND client != '')
          OR (NOT (start_date IS NOT NULL))
      ORDER BY prj_id

      project
      -------------------------------------------------------------
      prj_id   | client      | start_date | name
      ---------+-------------+------------+------------------------
      lt-711   | lake-carmen |            | Updating Fire Escape
      overhead |             | 2003-06-12 | General Management Work

Expressions which evaluate to the empty string, to a zero value, or to NULL are considered false. In the example above the text-valued column client is equivalent to (!isnull(client)&client!=''). This particular example also has the implies operator, where a->b is syntax sugar for !a|b.



/project?(start_date<'2004-04-01'|start_date>'2006-09-01')&client

Parentheses can be used to group boolean expressions. This request returns projects that have not only been assigned to a client, but also either happen before '2004-04-01' or after '2006-09-01'

      SELECT * FROM op.project
       WHERE (start_date < '2004-04-01'
           OR start_date > '2006-09-01')
         AND (client IS NOT NULL AND client != '')
      ORDER BY prj_id

      project
      -----------------------------------------------
      prj_id | start_date | client      | status
      -------+------------+-------------+------------
      la-102 | 2004-03-27 | lake-apts   | completed
      lt-802 | 2006-11-23 | lake-carmen | in-progress

As in SQL, conjunction has higher precedence than the alternation. The parentheses above are necessary; if omitted, it would return all projects before 2004-03-27 including ones not assigned to a client.

With moderate exposure to these sorts of URIs and the corresponding results, the occasional programmer should be able to combine operators in a generative manner without assistance. Filter expressions can be combined with boolean operators and grouped with parentheses to generate arbitrarily complex predicates.



 TOC 

2.2.  Selectors and Functions

A sequence of expressions enclosed in curly brackets, called a selector, represents a set of correlated columns from related tables. When a selector immediately follows a table reference, it specifies which columns should be returned. Each expression a selector may be followed by a plus or a minus sign to indicate an ascending or descending sort order.


/project{status+,start_date-,name}

The selector above names three columns, ordered ascending by status and then descending by start_date. A third column, name, is returned, but is not used for sorting. HTSQL makes no provision to order results by columns that are not returned, or to list columns in an order that differs by their appearance in the sort criteria.

      SELECT status, start_date, name
      FROM op.project
      ORDER BY status ASC, start_date DESC, prj_id ASC

      project
      -----------------------------------------------------------
      status      | start_date | name
      ------------+------------+---------------------------------
      abandoned   | 2006-08-03 | Smith Associate Window and Roof
      completed   | 2005-08-09 | Smith Entry and Waiting Room
      completed   | 2005-02-03 | Smith Balcony Expansion
      ...


/{today(),null(),true(),false(),pi()}

In HTSQL, functions, such as today() as well as many constants, such as null(), are represented in standard functional notation. In the request above, a table identifier is absent, so a single row is returned with the expression requested.

      SELECT CURRENT_DATE, NULL, TRUE, FALSE, PI()

      ------------------------------------------------------
      today()    | null() | true() | false() | pi()
      ------------+--------+--------+---------+-------------
      2008-05-22 |        | true   | false   | 3.14159265359


/project{prj_id,string(start_date)[-5:].replace('-','/')}

HTSQL has full complement of string, numeric, and date functions which are translated into equivalent SQL. Most are direct translations, however, some expressions such as the slice operator (inspired from Python), have a more succinct syntax than the corresponding SQL.

      SELECT prj_id,
        REPLACE(
          SUBSTRING(CAST(start_date AS TEXT)
            FROM (((- 5) + (
               CASE WHEN (- 5) < 0
               THEN CHARACTER_LENGTH(CAST(start_date AS TEXT))
               ELSE 0 END))+1)
        ), '-', '/')
      FROM op.project
      ORDER BY prj_id

      project                                                |
      -------------------------------------------------------+-
      prj_id   | string(start_date)[(-5):].replace('-', '/') |
      ---------+---------------------------------------------+-
      la-102   | 03/27                                       |
      la-334   | 04/20                                       |
      ...


/project{prj_id,recode(status,'completed','done')}

HTSQL as full support for SQL's CASE statement depending upon the syntax variant. The if() function corresponds to the general form, while switch() is translated into the initial expression variant. To reduce expression duplication error in request construction, HTSQL introduces an additional recode function that returns the initial expression in the case where it is not matched.

      SELECT prj_id,
             (CASE status
              WHEN 'completed' THEN 'done'
              ELSE status END)
      FROM op.project
      ORDER BY prj_id

      project                                          |
      -------------------------------------------------+-
      prj_id     | recode(status, 'completed', 'done') |
      -----------+-------------------------------------+-
      Bowl-Shoes | planned                             |
      la-102     | done                                |


/project{prj_id,start_date+time('03:30')+timedelta(365*4,1,5)}

Standard date/time arithmetic is supported, with direct translations into the SQL equivalent. A numeric form of for timedelta is provided for convenience for use with contexts requiring arithmetic operations.

      SELECT prj_id, ((start_date + TIME '03:30:00')+
            CAST(( CAST(365 * 4 AS TEXT) || 'D '
                || LPAD(CAST(1 AS TEXT), 2, '0') || ':'
                || LPAD(CAST(5 AS TEXT), 2, '0') || ':'
                || '00'
            ) AS INTERVAL))
      FROM op.project
      ORDER BY prj_id

      project
      ---------------------------------------------------------------
      prj_id  | ((start_date+time('03:30'))+timedelta((365*4), 1, 5))
      --------+------------------------------------------------------
      la-102  | 2008-03-26 04:35:00
      la-334  | 2008-04-19 04:35:00
      ...


/participation{floor(billing_rate div 2),(billing_rate mod 3)}

For division and modulo a syntax exception is needed. The usual forward-slash (/) cannot be used since it designates path segments, while the percent sign (%) is inconvenient due to it's use as an escape character. To work around this syntactical inconvenience, HTSQL follows XSLT's usage of using the div and mod keywords respectively.

      SELECT FLOOR(billing_rate / 3),
             billing_rate % 3
        FROM op.participation
      ORDER BY ...

      participation                                      |
      ---------------------------------------------------+-
      floor((billing_rate div 3)) | (billing_rate mod 3) |
      ----------------------------+----------------------+-
      18                          | 1.00                 |
      23                          | 0.00                 |
      ...

The equivalent SQL query shown above is not completely accurate; in particular, additional default outer joins are not shown and the default order by clause requires additional explanation.



 TOC 

2.3.  Specifiers

HTSQL provides a mechanism for referencing columns not only in the current table, as seen in prior examples, but also columns from related tables. In SQL, relationships between tables are declared with a foreign key constraint; a specifier associates rows from related tables by joining on these constraints. Specifiers are written as sequence of column and/or table names separated by periods, each period representing a join. Specifiers with two or more periods reflect a transitive join, forming a path from one table to another through intermediates.


/project?client.is_active

This request returns project rows that are assigned to an active client. Because there is a foreign key reference from client in the project table to org_id of the organization table, the HTSQL processor automatically constructs the appropriate join logic.

      SELECT p.* FROM op.project AS p
        LEFT OUTER JOIN op.organization AS o
          ON (p.client = o.org_id)
       WHERE o.is_active IS TRUE
      ORDER BY prj_id

      project
      ------------------------------------------------- ...
      prj_id | status      | client      | start_date |
      -------+-------------+-------------+------------+ ...
      lt-711 | planned     | lake-carmen |            |
      lt-802 | in-progress | lake-carmen | 2006-11-23 |

In HTSQL, these joins are singular, meaning that for every row of the driving table project there is at most one row from the linked table, organization. With the default HTSQL meta-data configurator, when only one foreign key reference exists to a table, the name of the referenced table (organization) can be used as well as the referencing column (client), e.g., /op:project?organization.is_active.



/participation?person.organization.is_active

This request returns participation records for people who are in active organizations. This is accomplished by a transitive application of two joins, one from participation to person, and then onto organization. The resulting link chain is also singular since both sub-links are singular.

      SELECT x.* FROM op.participation AS x
        LEFT OUTER JOIN op.person AS p
          ON (x._ppl_seq = p._ppl_seq)
        LEFT OUTER JOIN op.organization AS o
          ON (p.org_id = o.org_id)
       WHERE o.is_active IS TRUE
      ORDER BY t.project, o.org_id, p.nickname

      participation
      -------------------------------------
      project  | person      | billing_rate
      ---------+-------------+-------------
      la-102   | meyers.hill | 55.00
      la-102   | meyers.jack | 69.00
      ...

The equivalent SQL query and result set shown above are not completely accurate; in particular, the person column and the order by clause require additional explanation covered in the next section.



/person?organization.division_of.org_id='lakeside'

Of course, a transitive join to the same table is possible. This request returns people who are in an organization that is a division of 'lakeside'.

      SELECT p.* FROM op.person AS p
        LEFT OUTER JOIN op.organization AS o1
          ON (p.org_id = o1.org_id)
        LEFT OUTER JOIN op.organization AS o2
          ON (o1.org_id = o2.org_id)
       WHERE o2.is_active IS TRUE
       ORDER BY p.org_id, p.nickname

      person
      ---------------------------------------------+-
      org_id    | nickname | full_name     | email |
      ----------+----------+---------------+-------+-
      lake-apts | tom      | Tommy O'Mally |       |
      ...

A future expansion for HTSQL may provide a syntax and corresponding translation for recursive queries, as implemented by SQL's WITH clause. Suggestions are welcome since this is a particularly useful, but very complicated construct to specify.



/project{*,client.*}

In a manner like SQL, all columns can be requested in a selector using the asterix (*). In this case, the columns returned are prefixed using the specifier of the join construct.

      SELECT p.*,
             o.org_id AS "organization.empl_code",
             e.name AS "organization.name",
             e.is_active AS "organization.is_active",
             e.division_of AS "organization.division_of"
        FROM op.project AS p
             LEFT OUTER JOIN op.organization AS o
               ON (o.org_id = p.client)
      ORDER BY p.prj_id

      project
      ------------------------ . --------------------- ...
      prj_id   | status      | . |organization.org_id  ...
      ---------+-------------+ . +-------------------- ...
      la-102   | completed   | . |lake-apts            ...
      la-334   | completed   | . |lake-apts            ...
      lt-711   | planned     | . |lake-carmen          ...
      ...


/project{prj_id,status,client{name,is_active}}

In this example, both the name and is_active columns from the organization table are requested. Using a nested selector in this manner avoids duplicating table references. The selector client{name,is_active} is short-hand for client.name,client.is_active.

      SELECT p.prj_id, p.status,
             o.name AS "client.name",
             o.is_active AS "client.is_active"
        FROM op.project AS p
             LEFT OUTER JOIN op.organization AS o
               ON (p.client = o.org_id)
      ORDER BY p.prj_id

      project
      ---------------------------------------------------------------
      prj_id   | status      | client.name        | client.is_active
      ---------+-------------+--------------------+------------------
      lt-711   | planned     | Lake Carmen Towers | true
      lt-802   | in-progress | Lake Carmen Towers | true
      ...


/organization?project.status='completed'

This request returns organizations that are associated with at least one completed project. Since there may be more than one project for a given organization, HTSQL interprets the this request as an implied existence test.

      SELECT o.* FROM op.organization AS o
       WHERE EXISTS
         (SELECT * FROM op.project AS p
           WHERE p.client = o.org_id
             AND p.status = 'completed')
      ORDER BY e.org_id

      organization
      ------------------------------------------------------------
      org_id    | name                   | is_active | division_of
      ----------+------------------------+-----------+------------
      lake-apts | Lake Shore Apartments  | false     | lakeside
      smith     | Rudgen, Taupe, & Smith | false     |
      ...

Unlike the previous examples, the cardinality of this join is plural: there may be more than one project associated with given organization. When used in this manner, such a specifier is called plural and it checks for existence of at least one matching row.

HTSQL's specifier mechanism enables intuitive construction of complicated join criteria. Not only are singular (one-to-one or many-to-one) joins allowed, but plural (one-to-many) joins are also supported.



 TOC 

2.4.  Locators and locations

HTSQL provides explicit support for selecting particular rows of a given table using primary key columns. When using this syntax, each value associated with a primary key column is called a label, and a dotted sequence of labels is called an location. Labels are compared via usual equality as described above. HTSQL uses square brackets to enclose a comma-separated list of locations that locate rows within the database. A sequence of locations enclosed in square brackets is called a locator.


/organization[meyers]

This request returns the meyers organization. This locator syntax form provides a short-hand for filtering by primary key and is indentical to the more formal /op:organization?org_id='meyers'.

      SELECT * FROM op.organization
      WHERE org_id = 'meyers'
      ORDER BY org_id

      proj_id  | name               | description
      ---------+--------------------+-----------------------------
      MEYERS   | Meyers' Residence  | insulation and winterizing


/person[meyers.jim]

The full-stop (.) is used used to separate labels in cases where the primary key includes more than one column. This example is identical to /op:person?organization.org_id='meyers' &nickname='jim'

      SELECT p.* FROM op.person AS p
       WHERE p.org_id = 'meyers'
         AND p.nickname='jim'
       ORDER BY p.org_id, p.nickname

      person
      -------------------------------------------------------
      org_id | nickname | full_name  | email
      -------+----------+------------+-----------------------
      meyers | jim      | Jim Meyers | jim.meyers@example.com



/person[meyers.jim,meyers.hill]

A comma (.) may be used to separate additional locators The example above is identical to, /op:person?organization.org_id='meyers'& (nickname='jim'|nickname='hill') and could also be expressed a bit shorter as /person[meyers.(jim,hill)]

      SELECT p.* FROM op.person AS p
       WHERE p.org_id = 'meyers'
         AND p.nickname IN ('jim','hill')
      ORDER BY p.org_id, p.nickname

      person
      --------------------------------------------------------------+-
      org_id | nickname | full_name        | email                  |
      -------+----------+------------------+------------------------+-
      meyers | hill     | Mark Thomas Hill | mark.hill@example.com  |
      meyers | jim      | Jim Meyers       | jim.meyers@example.com |


/person[meyers.*]

Components in an identifier may be unknown, replaced instead with the wildcard (*). Any component in a location may be set to a wildcard.

      SELECT p.* FROM op.person AS p
       WHERE p.org_id = 'meyers'
      ORDER BY p.org_id, p.nickname

      person                                                         |
      ---------------------------------------------------------------+-
      org_id | nickname | full_name        | email                   |
      -------+----------+------------------+-------------------------+-
      meyers | hill     | Mark Thomas Hill | mark.hill@example.com   |
      meyers | jack     | Jack C. Meyers   | jack.meyers@example.com |
      meyers | jim      | Jim Meyers       | jim.meyers@example.com  |
      ...


/person{id(),*}

This row locator can returned using a built-in function, id() which does the appropriate construction.

      SELECT (p.org_id || '.' || p.nickname) AS "id()", p.*
        FROM op.person AS p
      ORDER BY p.org_id, p.nickname

      person
      --------------------------------------------------------- ...
      id()          | org_id    | nickname | full_name        |
      --------------+-----------+----------+------------------+ ...
      acorn.hideo   | acorn     | hideo    | WATANABE Hideo   |
      lake-apts.tom | lake-apts | tom      | Tommy O'Mally    |
      ...

The id() function is a bit more complicated than this since the locator production is limited to word characters and the dash (-) and underscore (_). Labels which do not match this production must be single-quoted.


Identifiers provide a handy notation for resource location: they are concise and unique. Identifiers can be explicitly requested in the selector using id(); furthermore, they can be used within a locator to return exactly one row.



 TOC 

2.5.  Aggregates

In HTSQL, a command denoted by parentheses may be used in the last (right most) path-segment of the URI to invoke a specific database operation or extension function. In the previous examples, the command select() was implicit.


/op:project{proj_id,task.task_no}

If a plural specifier (which causes a one-to-many join) is used within a selector, an ARRAY is returned. This example returns a row for each project, and for each row, an array of associated task numbers.

      SELECT p.proj_id,
             ARRAY(SELECT t.task_no
                     FROM tm.task t
                    WHERE t.proj_id = p.proj_id
                    ORDER BY t.proj_id, t.task_no)
                      AS "task{task_no}"
        FROM op.project AS p
      ORDER BY p.proj_id;

      proj_id  | task{task_no}
      ---------+-----------------------------
      MEYERS   | {1,2,3}
      SSMall   | {1}
      THOM-LLP | {}
      ...


/tm:employee{full_name,isnull(email),count(task)}

Functions may be used in the context of a selector. In this example, isnull() is a scalar function while count() is an aggregate function. Aggregate functions may be applied to plural specifiers, such as task.

      SELECT e.full_name,
             (e.email is null) AS "isnull(email)",
             count(t.*) AS "count(task)"
        FROM tm.employee AS e
             LEFT OUTER JOIN tm.task AS t
             ON (t.assigned_to = e.empl_code)
      GROUP BY e.empl_code, e.full_name, (e.email is null)
      ORDER BY e.empl_code;

      full_name       | isnull(email) | count(task)
      ----------------+---------------+-------------
      Adam O'Brien    | FALSE         | 23
      ...
      Alfred Smith    | TRUE          | 0
      ...

In general, most SQL operations such as IS NULL are available in HTSQL. However, we use a simplified function notation that is more familiar to occasional programmers; our syntax includes sequential argument passing an optional keyword parameters.



/tm:task{assigned_to,@status|count(*)}

Data pivoting can be requested using the asterix (@) immediately preceding a column, such as status. In this case, distinct values in that column become the headers in the result set, and remaining columns become the values within respective buckets.

:
      SELECT t.assigned_to,
         SUM(CASE WHEN t.status = 'done' THEN 1
                  ELSE 0 END) AS "done",
                  ...
         SUM(CASE WHEN t.status = 'review' THEN 1
                  ELSE 0 END) AS "review",
        FROM tm.task AS t
      GROUP BY t.assigned_to
      ORDER BY t.assigned_to;

      assigned_to | done | . | review
      ------------+------+ . +--------
      ADAM        |    0 | . |  1
      ARONSON     |    4 | . |  2
      ...
      (NULL)      |    0 | . |  0

This translation need not be done in SQL as shown above, since a pivot operation is strictly a visualization mechanism. The equivalent SQL above reflects an idiom for this sort of item. If more than one summary calculation or pivot is requested, there may be 2 or more header rows.



/tm:employee{empl_code,task}

If a plural specifier naming a table is mentioned in a selector, then an ARRAY of associated identifiers is generated. In this example, a row for each employee is returned, and for each row, an array of associated task identifiers is listed.

      SELECT e.empl_code,
             ARRAY(SELECT (t.proj_id || '.' || t.task_no)
                     FROM tm.task t
                    WHERE t.assigned_to = e.empl_code
                   ORDER BY t.proj_id, t.task_no) AS "task"
        FROM tm.employee AS e
      ORDER BY e.empl_code

      empl_code | task
      ----------+-----------------------------
      ADAM      | {ssmall.1, ... }
      ARONSON   | {meyers.1, meyers.5, ... }
      ...
      SMITH-A   | {}
      ...

Identifiers provide a handy notation for resource location: they are concise and unique. Identifiers can be explicitly requested in the selector using id(); furthermore, they can be used within a locator to return exactly one row.



 TOC 

2.6.  Commands

In HTSQL, a command denoted by parentheses may be used in the last (right most) path-segment of the URI to invoke a specific database operation or extension function. In the previous examples, the command select() was implicit.


/tm:employee/select(offset=10,limit=2)

The explicit select() command has two optional keyword/value arguments which can be used to return a sliding window over a result set. For this particular example, the result set starts at the 11th row and returns 2 rows.

      SELECT * FROM tm.employee
      ORDER BY empl_code
      OFFSET 10 LIMIT 2               /* PostgreSQL Syntax */

      empl_code | full_name     | is_contractor | email
      ----------+---------------+---------------+-------------
      HUCK      | Ed Huckington | TRUE          | huck@example.com
      JACK      | Jack Winters  | FALSE         | jack@example.com

Unfortunately, ISO SQL does not have a provision for offset/limit as implemented by PostgreSQL and other databases. However, the comments section in the SQL2003 hints that this feature is forthcoming. Since this feature is extremely useful and has implementations in just about every database, it is included in the HTSQL specification.



/op:project/insert()?proj_id:='ALBE'&name:='Alberca'

Other commands for insert, update, and delete follow a similar syntax, using colon-equal (:=) to indicate an assignment. The result from an insert statement is a "201 Created" with a content body containing the URI(s) of the objects inserted.

      INSERT INTO op.project (proj_id, name)
        VALUES ('ALBE','Alberca');

      201 Created
      /op:project[albe@1]

For RESTful behavior, the POST method with query arguments in a standard entity body format will also work; the GET method is permitted so that standard features can be used directly in a web browser's location bar.



/op:project/update()?proj_id='ALBE'&description:='Leaky Pool'

This request updates the description of the 'ALBE' project. Note that only columns using the assignment operator (:=) are changed; the remaining column references are used to limit the rows affected.

      UPDATE op.project
         SET description = 'Leaky Pool'
       WHERE htsql_normalize(proj_id) = 'albe'

      201 Created
      /op:project[albe@2]

Like the insert() operation, update() returns a "201 Created" when successful, listing the URIs of the affected resources. Even though the affected rows are actually modified, they constitute a new resource: implementations could permit access to previous versions of the modified row.



/op:project/delete(expect=3)?description~'pool'

If a data modification request would change more than one row (or less than one row), an expect keyword argument is needed. In this example, the request expects exactly 3 rows to be deleted.

      DELETE FROM op.project
       WHERE LOWER(description) LIKE '%pool%';

      204 No Content

Deleting a row does not return content. If any more or fewer rows would be affected, a "417 Expectation Failed" is returned and the data modification request is aborted.



/op:project[able]/merge()?name:='SouthWest%20Alberca'

The merge operation provides a succinct syntax for adding or updating a resource based on its identifier. If the row already exists, it is updated, otherwise it is created. The result of this command is a "201 Created" when the corresponding insert or update succeeds.

      MERGE INTO op.project
      USING op.project ON htsql_normalize(proj_id) = 'albe'
      WHEN MATCHED THEN UPDATE
        SET name = 'SouthWest Alberca'
      WHEN NOT MATCHED THEN INSERT
        (proj_id, name) VALUES
        ('albe', 'SouthWest Alberca');

      201 Created
      /op:project[albe@3]

The MERGE syntax above comes from SQL2003. However, the concept is so useful in a web setting that the corresponding SQL1992 transaction can be simulated if this feature is not natively supported. While HTSQL does not require spaces to be encoded as %20, many user-agents do.



/tm:task{task_no,status}/parse()?employee='adam'

The parse() command returns a parse tree representing the request. It is useful for clients which provide a graphical query builder. This command is also helpful when debugging because it shows how the HTSQL server is interpreting a given URI.


      <parse xmlns="http://htsql.org/2006/">
        <context schema="tm" table="task">
          <selector>
            <specifier column="task_no" />
            <specifier column="status" />
          </selector>
        </context>
        <operation name="parse" />
        <query>
          <comparison type="equality">
            <specifier table="employee">
            <literal value="adam" />
          </comparison>
        </query>
      </parse>

HTSQL provides the standard SQL commands, insert(), update() and delete(). The merge() function is particularly useful in a web environment where the status of an object is being "reset" regardless of its previous state.



 TOC 

2.7.  Path Contexts

In HTSQL, more than one table can be directly mentioned in a request, provided that the tables are related (perhaps transitively) by a foreign-key relationship. Each table, together with associated selectors and parameters is called a context.


/tm:employee/task

This request returns tasks, grouping by assigned employee. This particular result is possible since there is an assigned_to foreign key from task to employee. Employees which do not have corresponding tasks, or tasks which are not assigned to an employee are not returned.

      SELECT e.empl_code     AS "employee{empl_code}",
             e.full_name     AS "employee{full_name}",
             e.is_contractor AS "employee{is_contractor}",
             e.email         AS "employee{email}",
             t.proj_id       AS "task{proj_id}",
             t.task_no       AS "task{task_no}",
             t.assigned_to   AS "task{assigned_to}",
             t.status        AS "task{status}",
             t.name          AS "task{name}"
        FROM tm.task AS t
             JOIN tm.employee AS e
               ON (t.assigned_to = e.empl_code)
      ORDER BY e.empl_code, t.proj_id, t.task_no;


      employee{empl_code} | . | task{proj_id} | task{task_no} | .
      --------------------+ . +---------------+---------------+
      ADAM                | . | MEYERS        |       7       | .
      ADAM                | . | SSMall        |       2       | .
      ARONSON             | . | MEYERS        |       1       | .
      ...


/tm:employee[aronson]/task

The example above returns tasks that have been assigned to the employee aronson; duplicating information about this employee for each row. While this query is similar to /tm:task?assigned_to='aronson', it asserts that exactly one employee with identifier aronson is matched.

      SELECT <all columns from employee and task>
        FROM tm.task AS t
             JOIN tm.employee AS e
               ON (t.assigned_to = e.empl_code)
       WHERE htsql_normalize(e.empl_code) = 'aronson'
      ORDER BY e.empl_code, t.proj_id, t.task_no

      employee{empl_code} | . | task{proj_id} | task{task_no} | .
      --------------------+ . +---------------+---------------+
      ARONSON             | . | MEYERS        |       1       | .
      ...


/tm:employee?restricted_info.billing_rate>20

This request returns employees who have a billing rate of more than 20. The automatic join here is possible since the restricted_info table has a foreign key which refers to the employee table.

      SELECT e.*
        FROM tm.employee AS e
             JOIN tm.restricted_info AS r
             ON (r.empl_code = e.empl_code)
       WHERE r.billing_rate > 20
      ORDER BY e.empl_code

      empl_code |  full_name   | is_contractor |       email
      ----------+--------------+---------------+-------------------
      ARONSON   | Mary Aronson | FALSE         | mary2@example.com
      SMITH     | Ron Smith    | TRUE          | john@example.com
      ...

Tables, such as restricted_info, that have an optional one-to-one relationship with a primary table, like employee, are called facets. Facets allow the handling of sparse data sets and cases where particular sets of information must have additional security constraints.



/op:project[meyers]/task[1]

Identifiers may be shortened when the context can be used to fully-qualify them. In this example, the identifier for the requested task is meyers.1.

      SELECT <all columns from project and task>
        FROM op.project AS p
             JOIN tm.task AS t
               ON (t.proj_id = p.proj_id)
       WHERE htsql_normalize(p.proj_id) = 'meyers'
         AND htsql_normalize(t.task_no) = '1'
      ORDER BY p.proj_id, t.proj_id, t.task_no

      project{proj_id} | . | task{proj_id} | task{task_no} | .
      -----------------+ . +---------------+---------------+ .
      MEYERS           | . | MEYERS        |       1       | .


/op:project[meyers]/employee[aronson]/task/insert()?
task_no=4&name='Clean Up'

When more than one table is provided for an insert() statement, un-ambiguous links to the rows identified in the context are assumed. In this case, the new task will be part of the meyers project and will be assigned to aronson.

      INSERT INTO tm.task (proj_id, assigned_to, task_no, name)
      VALUES ((SELECT proj_id FROM op.project
                WHERE htsql_normalize(proj_id) = 'meyers'),
              (SELECT empl_code FROM tm.employee
                WHERE htsql_normalize(empl_code) = 'aronson'),
                '4', 'Clean Up')

In HTSQL, whitespace between tokens (but not within single or double quotes) is not significant. To enhance readability, we broke the request above onto two lines.



/tm:employee{full_name}/task{project.name,task_no}?status='done'

This request returns tasks that are 'done' together with detail regarding the assigned employee's full name, the project's name, and the task number. That tasks that are not assigned are not returned with this request.

      SELECT e.full_name AS "employee{full_name}",
             p.name      AS "task{project.name}",
             t.task_no   AS "task{task_no}"
        FROM tm.task AS t
             JOIN tm.employee AS e
               ON (t.assigned_to = e.empl_code)
             JOIN op.project AS p
               ON (p.proj_id = t.proj_id)
       WHERE htsql_normalize(t.status) = 'done'
      ORDER BY e.empl_code, t.proj_id, t.task_no

      employee{full_name} | task{project.name}  | task{task_no}
      --------------------+---------------------+--------------
      Mary Aronson        | Meyers' Residence   |       1
      ...


/tm:employee//task

To suppress the automatic joins, a double-slash (//) may be used; the result is a cross product. In this example, the usual join using the foreign key assigned_to is explicitly suppressed to return all permutations of employee and task.

      SELECT <all columns from employee and task>
        FROM tm.task AS t,
             tm.employee AS e
      ORDER BY e.empl_code, t.proj_id, t.task_no;

      employee{empl_code} | . | task{proj_id} | task{task_no} | .
      --------------------+ . +---------------+---------------+
      ADAM                | . | MEYERS        |       1       | .
      ADAM                | . | MEYERS        |       2       | .
      ..
      ARONSON             | . | MEYERS        |       1       | .
      ...


/tm:$a:=project//$b=:project?a.proj_id[0]==b.proj_id[0]

When using more than one copy of a table, a table alias created by $var:= is required. Once aliased, subsequent usage of the table is then referenced with var. This contrived example returns pairs of projects which have the same first letter in their project identifier.

      SELECT a.proj_id     AS "a{proj_id}",
             a.name        AS "a{name}",
             a.description AS "a{description}",
             b.proj_id     AS "a{proj_id}",
             b.name        AS "a{name}",
             b.description AS "a{description}"
        FROM op.project AS a,
             op.project AS b
        WHERE SUBSTRING(a.proj_id FROM 1 FOR 1) =
              SUBSTRING(b.proj_id FROM 1 FOR 1)
       ORDER BY a.proj_id, b.proj_id;

      a{proj_id} | . | b{proj_id} | .
      -----------+ . +------------+ .
      MEYERS     | . | MEYERS     | .
      MEYERS     | . | MIVDA      | .
      MIVDA      | . | MEYERS     | .
      ...


/tm:(+)employee/task

In HTSQL, an outer join is indicated with a plus ((+)) surrounded by parenthesis preceding the table name. The above request, for example, returns tasks even if they have not been assigned an employee. In these cases, the employee columns are NULL.

      SELECT <all columns from employee and task>
        FROM tm.task AS t
             LEFT OUTER JOIN tm.employee AS e
               ON (t.assigned_to = e.empl_code)
      ORDER BY e.empl_code, t.proj_id, t.task_no;


      employee{empl_code} | . | task{proj_id} | task{task_no} | .
      --------------------+ . +---------------+---------------+
      ADAM                | . | SSMall        |       2       | .
      ARONSON             | . | MEYERS        |       1       | .
      ...
      (NULL)              | . | MEYERS        |       3       | .
      ...


/tm:employee{full_name};is_contractor/(+)task{project.name,task_no}

This request returns each contractor in the employee table and their associated tasks, if any. Filters on a particular context are indicated with a semi-colon (;), the context's parameters. In this case, the is_contractor filter applies to the employee table.

      SELECT e.full_name AS "employee{full_name}",
             p.name      AS "task{project.name}",
             t.task_no   AS "task{task_no}"
        FROM tm.employee e
             LEFT OUTER JOIN tm.task AS t
               ON (t.assigned_to = e.empl_code)
             LEFT OUTER JOIN op.project AS p
               ON (p.proj_id = t.proj_id)
       WHERE e.is_contractor IS TRUE
      ORDER BY e.empl_code, t.proj_id, t.task_no

      employee{full_name} | task{project.name} | task{task_no}
      --------------------+--------------------+---------------
      Ron Smith           | Meyer's Residence  |             2
      Alfred Smith        | (NULL)             |  (NULL)
      ...

This set of examples illustrates the large difference in readability between HTSQL URIs and the corresponding SQL. While SQL may be more expressive, for common needs, HTSQL is more succinct and understandable. The "path" based metaphor of URIs, together with the parameter syntax using the semi-colon, allows relatively common joins to be easily specified.



 TOC 

2.8.  Transactions and Locking

HTSQL supports database transactions and record locking. Besides explicit row locking, bulk updates and optimistic locking are possible. Optimistic locking is accomplished by returning a row version; and then updating a row only when this version matches. Pessimistic locking requires either a session mechanism as provided by the application or its transaction is limited to the scope of the current HTTP connection (using "Connection: Keep-Alive" for HTTP/1.0 clients).


/tm:task{id(),tag(),idtag(),name}

The tag() function returns, for each row, a string value that can be used to test when an update has occurred. In a subsequent request, if the value of tag() changes, then the row has been updated. The return value of tag() is usually a row version number or a timestamp that is changed on each update. The idtag() function returns a row's identifier and tag as a single value, separated with an at-sign (@).

      SELECT (t.proj_id || '.' || t.task_no) AS "id()",
             <tag> AS "tag()",
             (t.proj_id || '.' || t.task_no ||
              '@' || <tag>) AS "idtag()",
             t.name AS "name"
        FROM tm.task AS t
      ORDER BY t.proj_id, t.task_no;;

      id()     | tag() | idtag()    | name
      ---------+-------+------------+-------------------
      MEYERS.1 | 3     | MEYERS.1@3 | Purchase Materials
      MEYERS.2 | 7     | MEYERS.2@7 | Strip Wall Paint
      ...

Since the notion of a tag is not specified by SQL92, its implementation may be dependent upon the specific database, schema, and configuration used. The only constraint placed upon tag() is that all future versions of updated rows must return a different value. If the HTSQL processor is unable to fulfill these requirements, it must respond with "501 Not Implemented".



/tm:task[meyers.2@7]

This request attempts to return version 7 of the task identified by meyers.2. As above, if the task does not exist, then a "404 Not Found" is returned. If the tag() of the requested row has since changed, a "301 Moved Permanently" is issued, giving the location of the current version.

      301 Moved Permanently
      /tm:task[meyers.2@8]


/op:project[meyers@2]/update()?name:='Changed%20Name'

If an update or delete operation is applied to an object that has the incorrect version, then the update request would fail with a "409 Conflict". This message will include a reference to the most recent version, as well as a body with the change history (if available) since the version requested.

      409 Conflict
      /op:project[albe@3]


/interaction()

The interaction() operation takes a POST body of type "text/htsql", each line being an HTSQL request. The requests are executed together as part of a single transaction; if any request fails with a 4xx or 5xx result code, then the entire transaction is rolled back. Following is an example request body which inserts a project and three tasks as a single transaction.

      /op:project[waterfall]/insert()?name:='Waterfall Example'
      /op:project[waterfall]/task[1]/insert()?name:='Feasibility'
      /tm:task[waterfall.2]/insert()?name:='Analysis Work'
      /tm:task[$(project).3]/insert()?name:='Design Phase'

Within text/htsql document, variables like $(table) resolves to the identifier for the most recent insert or merge statement of the given table. Hence, $(project) in this example is waterfall.



/op:project[waterfall]/task/insert()

Bulk inserts can also be performed using a CSV file in the POST body. The first line of the file gives the columns, and remaining lines correspond to each row being insert. In this case, the project, waterfall is obtained from the parent context and doesn't need to be repeated in the CSV file.

      task_no,status,name
      4,planned,Implementation
      5,planned,Testing
      6,planned,Deployment

A similar file can be provided for update() or merge(), however, these actions require at least one id() or idtag() column to uniquely locate the row being affected.



/begin()

This starts a transaction. An active transaction can be finished with either commit() or rollback(). The transaction is automatically abandoned if the current HTTP connection or the active session ends.

      BEGIN TRANSACTION

      204 No Content


/op:project/select(lock='update nowait')

This select parameter places a row lock on the returned rows. If a lock cannot be obtained, it immediately returns a "408 Request Timeout" error. This command must be preceded with a begin().

      SELECT *
      FROM op.project
      FOR UPDATE NO WAIT
      ORDER BY proj_id

      408 Request Timeout

Locking for update is assumed when either lock="wait" or lock="nowait". An explicit lock is dropped after an explicit commit() or rollback().


HTSQL provides for standard optimistic locking, bulk transaction operations, as well as standard transactions with explicit locking.



 TOC 

2.9.  Resources and Formats

Normally, when a web browser makes a request, it provides a list of result formats it will accept. HTSQL provides mechanisms to override this content negotiation and explicitly request a particular result format. When a table (with optional locator and selector) is followed by an extension, the output format is determined via typical association to well-known MIME types. If the extension is not known, it is assumed to be text/ext where ext is the extension provided in the URI.

Further, a path segment starting with the resource indicator (~) means that the request is not to be processed by HTSQL, but instead should be delegated to an underlying application.


/op:project[meyers]{id(),name}.csv

The result format can be specified as an argument to select or other operations. In this case, the standard comma-separated variable format is chosen. The output format can also be provided as an argument to the select(format='text/csv') command.

      id(),name
      MEYERS,Meyers' Residence

In the CSV format, results exactly match the SQL query (and are appropriately de-normalized). The double-quote is a delimiter used if a column value contains a comma; two adjacent double-quotes are used to escape a double-quote occurring in the query results.



/op:project{name}/employee{}/task{task_no,status}.xml

In the eXtensible Markup Language ("XML") format, result elements are named after the table they represent. For each path context, a hierarchical relation is established, duplicating intermediate nodes (such as "aronson" above) as required. If a given table or column is not a valid element or attribute name, there are work-arounds using the htsql namespace.


      <?xml version="1.0" encoding="utf-8"?>
      <htsql:result xmlns:htsql="http://htsql.org/2006/"
                    htsql:schema="tm">
        <project htsql:id="meyers@3"
                 name="Meyer's Residence">
          <employee htsql:id="aronson@11">
            <task htsql:id="meyers.1@1"
                  task_no="      01" status="done" />
            <task htsql:id="meyers.23@1"
                  task_no="      23" status="review" />
            <!-- ... -->
          </employee>
          <employee htsql:id="adam@5">
            <task htsql:id="meyers.11@4"
                  task_no="      11" status="done" />
            <!-- ... -->
          </employee>
          <!-- ... -->
        </project>
        <project htsql:id="ssmall@9"
                 name="South Square Mall">
          <employee htsql:id="aronson@11">
            <task htsql:id="ssmall.13@1"
                  task_no="      01" status="started" />
            <!-- ... -->
          </employee>
          <!-- ... -->
        </project>
        <!-- ... -->
      </htsql:result>



/op:project{name}/employee{}/task{task_no,status}.yaml

The YAML format for HTSQL consists of two sections: a context and an assembly. For each table mentioned in the request, the context lists each row returned for that table. The assembly then represents the drill-down relationships between these rows. For large results, the content returned by YAML is much smaller than the equivalent XML, since each row occurs only once in the context rather than being duplicated for each occurrence in the hierarchical assembly.


      %YAML 1.1
      ---
      schema: tm
      context:
        project:
        - !project &1
          =: meyers@3
          name: Meyer's Residence
        - !project &2
          =: ssmall@9
          name: South Square Mall
        #...
        employee:
        - !employee &3
          =: aronson@1
        - !employee &4
          =: adam@5
        #...
        task:
        - !task &5
          =: meyers.1@1
          task_no: 01
          status: done
          project: *1
          employee: *3
        - !task &6
          =: meyers.23@1
          task_no: 23
          status: review
          project: *1
          employee: *3
          # ...
        - !task &7
          =: meyers.11@4
          task_no: 23
          status: done
          project: *1
          employee: *4
          # ...
        - !task &8
          =: ssmall.13@1
          task_no: 13
          status: started.
          project: *2
          employee: *3
        #...
      assembly: [*1: [*3: [*5, *6],
                    *4: [*7]],
                 *2: [*3: [*8]]]
      ...



/op:project{name}/task{task_no,status}/~some+resource

Whenever a path segment starts with a tilde ~, it indicates a user resource. Once found, the entire URI is not processed by HTSQL, but is instead passed on to the application. The application can then choose to return a resource specific to the path, or what ever it wishes. Note that relative paths work as expected, even for "static" resources. For example, a relative ./select() reference in the resource above would produce a list of tasks.

      (application defined resource for "some+resource")


/op:project{name}/task{task_no,status}.html~bing

Following a request for XML or HTML output, a style-sheet can be requested as shown above. If the style-sheet lacks an extension (such as "xsl"), then "css" is assumed.


      <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
      <HTML>
        <HEAD>
          <LINK type="text/css" rel="stylesheet"
           href="/op:project{name}/task{task_no,status}/~bing.css">
        </HEAD>
        <BODY>
          <TABLE>
            <COLGROUP ID="project">
              <COL CLASS="project project_name" ID="project:name">
            </COLGROUP>
            <COLGROUP ID="task">
              <COL CLASS="task task_task_no" ID="task:task_no">
              <COL CLASS="task task_status" ID="task:status">
            </COLGROUP>
            <THEAD>
              <TR>
                <TH COLSPAN="1" SCOPE="colgroup"
                    CLASS="project">Project</TH>
                <TH COLSPAN="2" SCOPE="colgroup"
                    CLASS="task">Task</TH>
              </TR>
              <TR>
                <TH>Name</TH>
                <TH>Task No</TH>
                <TH>Status</TH>
              </TR>
            </THEAD>
            <TBODY>
              <TR ID="task:meyers.1">
                <TD ROWSPAN="2" SCOPE="rowgroup" VALIGN="top">
                  Meyers' Residence</TD>
                <TD>      01</TD>
                <TD>done</TD>
              </TR>
              <TR ID="task:meyers.2">
                <TD>      02</TD>
                <TD>done</TD>
              </TR>
              <!-- ... -->
              <TR ID="task:ssmall.1">
                <TD>South Square Mall</TD>
                <TD>      01</TD>
                <TD>review</TD>
              </TR>
              <!-- ... -->
            </TBODY>
          </TABLE>
        </BODY>
      </HTML>


The varied output formats and customizable resources, together with stylesheet linkage, create a flexible mechanism for constructing user interfaces.



 TOC 

2.10.  HTML FORM Compatibility

Standard HTML form encoding presents several challenges to HTSQL. While it is possible to provide almost any name (such as one corresponding to a specifier) for an INPUT control, user-provided values are always percent-encoded instead of being single-quoted as HTSQL requires. Further, HTML only supports a flat expression structure, conflating the ampersand's meaning of conjunction with simply implying that additional form elements are provided. HTSQL defines a literal value syntax and variable substitution to provide direct support for HTML forms without requiring client-side processing.


/op:project?name@~south+square

Any operator may be preceded by the at-sign (@) to indicate that the right-hand side is a percent-encoded value. A literal value provided in this syntax terminates at the next ampersand character or at the end of the request string. This request is equivalent to /op:project?name~'south square'.

      SELECT * FROM op.project
       WHERE LOWER(name) LIKE '%south square%'
      ORDER BY proj_id

      proj_id  | name               | description
      ---------+--------------------+-----------------------------
      SSMall   | South Square Mall  | two new store fronts
      ...

This literal value syntax permits values to be included according to standard web usage without requiring cumbersome single-quotes. In this syntax, the plus sign is used to encode the space (%20) and reserved characters must be percent-encoded.



/tm:task?status@=done&status@=review

When using the literal value syntax, multiple items joined by an ampersand are treated as a simple alternation. This is an ugly exception to the normal grammar, however it is needed to meet the expectations of standard HTML form usage. The request above is equivalent to /tm:task?status='done','review'.

      SELECT * FROM tm.task
       WHERE htsql_normalize(status) in ('done','review')
      ORDER BY proj_id, task_no

      proj_id | task_no | assigned_to | status  | name
      --------+-----------------------+---------+-------------------
      MEYERS  |       1 | ARONSON     | done    | Purchase Materials
      ...

Since the comma is a reserved character, this syntax also allows expressions such as /tm:task?status@=done,review with an identical interpretation.



/tm:task?status='done',other&$other@:=review

In HTSQL, variables are declared with a dollar-sign. This example uses variable substitution to return tasks that are either done or ready for review. The variable other is first referenced as part of the filter on the status column. Following the ampersand, this variable is defined with the value "review", using the assignment operator.

      SELECT * FROM tm.task
       WHERE htsql_normalize(status) in ('done','review')
      ORDER BY proj_id, task_no

      proj_id | task_no | assigned_to | status  | name
      --------+-----------------------+---------+-------------------
      MEYERS  |       1 | ARONSON     | done    | Purchase Materials
      ...

This query could be submitted via a HTML form with a hidden input status having value 'done',other, and another input named $other@: with user-provided value review. After variable resolution, this request is equivalent to /tm:task?status='done','review'.



POST /tm:task?assigned_to=='ARONSON'
with URL-encoded post body status@=review&status@=done

When POST is used with a "multipart/form-data" or "application/x-www-form-urlencoded" MIME types, key/value pairs are integrated as if they used an ampersand.

      SELECT * FROM tm.task
       WHERE htsql_normalize(status) in ('done','review')
         AND assigned_to = 'ARONSON'
      ORDER BY proj_id, task_no

      proj_id | task_no | assigned_to | status  | name
      --------+-----------------------+---------+-------------------
      MEYERS  |       1 | ARONSON     | done    | Purchase Materials
      ...

The request above is equivalent to /tm:task?assigned_to=='ARONSON'&status='done','review'.


/person?organization.is_active&private_info.tax_ident~'3'

This request returns people who are members of an active organization, and who's tax identifier contains '3'. Tables such as private_info are called facets, since the join cardnality from the driving table person to the target is _0/1-to-1_.

      SELECT p.*
        FROM op.person AS p
             LEFT OUTER JOIN op.organizationn AS o
             ON (p.org_id = o.org_id)
             LEFT OUTER JOIN hr.private_info AS f
             ON (p._ppl_seq = f._ppl_id)
       WHERE o.is_active IS TRUE
         AND LOWER(f.tax_ident) LIKE '%3%'
      ORDER BY p.org_id,  p.nickname

      person
      ---------------------------------------------------------------
      org_id | nickname | full_name      | email
      -------+----------+----------------+---------------------------
      acorn  | hideo    | WATANABE Hideo | hideo.watanabe@example.com
      meyers | jim      | Jim Meyers     | jim.meyers@example.com

With the default HTSQL meta-data configurator, when only one foreign key reference exists to a table, the name of the referenced table (organizatione) can be used as well as the referencing column (client), i.e. /op:project?organization.is_active.



/tm:employee?count(task;status='done')>4

This request returns employees who have been assigned more than 4 completed tasks. The parameter filter (;) limits the related tasks, and the count() aggregate function converts this correlated sub-query into a scalar value for comparison.

       SELECT e.empl_code, e.full_name, e.is_contractor,
               e.email
        FROM tm.employee AS e
       WHERE
         (SELECT count(*)
            FROM tm.task AS t
           WHERE t.assigned_to = e.empl_code
             AND htsql_normalize(t.status) = 'done') > 4
      ORDER BY e.empl_code
       SELECT e.empl_code, e.full_name, e.is_contractor,
               e.email
        FROM tm.employee AS e
       WHERE
         (SELECT count(*)
            FROM tm.task AS t
           WHERE t.assigned_to = e.empl_code
             AND htsql_normalize(t.status) = 'done') > 4
      ORDER BY e.empl_code

By merging POST arguments with a URI using literal-value syntax and with clever use of variable substitution, it should be possible to send just about any HTSQL query using standard HTML form submission.



 TOC 

3.  References



 TOC 

3.1. Normative References

[ISO9075-1992] International Standards Organization, “Database Language SQL, 1992,” ISO/EIC 9075:1992, July 1992.
[RFC2234] Crocker, D. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” RFC 2234, November 1997 (TXT).
[RFC2616] Felding, R., Gettys, J., Mongul, J., Frystyk, H., Mastiner, L., Leach, P., and T. Berners-Lee, “Hypertext Transfer Protocol -- HTTP/1.1,” RFC 2616, June 1999 (TXT).
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifiers (URI): Generic Syntax,” RFC 3986, January 2005 (TXT).


 TOC 

3.2. Informative References

[CSV] “,” 2004.
[HTML] “,” 2004.
[HTML4] “,” 2004.
[ISO8601] “,” 2004.
[ISO9075-1999] International Standards Organization, “Database Language SQL, 1999,” ISO/EIC 9075:1999, September 1999.
[ISO9075-2003] International Standards Organization, “Database Language SQL, 2003,” ISO/EIC 9075:2003, August 2003.
[JSON] “,” 2004.
[PGSQL] “,” 2004.
[POSIX_1003.2] “,” 2004.
[REC-XML] “,” 2004.
[REST] Fielding, R., “Architectural Styles and the Design of Network-based Software Architectures.,” 2000.
[RFC1738] “,” 2004.
[RFC1808] “,” 2004.
[RFC2086] “,” 2004.
[RFC2396] “,” 2004.
[RFC2396BIS] “,” 2004.
[RFC2732] “,” 2004.
[SQL98] “,” 2004.
[UNICODE] “,” 2004.
[XML] “,” 2004.
[XSL] “,” 2004.
[YAML] “,” 2004.


 TOC 

Appendix A.  Collected ABNF for URI



 TOC 

Appendix B.  Sample Database Schema

--
-- SAMPLE SCHEMA, SQL-92 CONSTRUCTS ONLY
--
DROP SCHEMA tm CASCADE;
CREATE SCHEMA tm;

-- PostgreSQL, not SQL99, Syntax
CREATE OR REPLACE FUNCTION htsql_normalize(text)
  RETURNS text
  AS 'SELECT COALESCE(NULLIF(
              TRANSLATE(LOWER(
                TRIM(LEADING ''0'' FROM
                  TRIM(BOTH '' '' FROM
                    CAST($1 AS TEXT)
                  ))),
              ''- '',''__''),
           ''''),''0'');
  ' LANGUAGE SQL IMMUTABLE STRICT;

CREATE TABLE tm.project (
    proj_id        VARCHAR(16),
    name           VARCHAR(64) NOT NULL,
    description    VARCHAR(2000),
    CONSTRAINT project_pk
      PRIMARY KEY (proj_id)
);

CREATE TABLE tm.employee (
    empl_code      VARCHAR(16),
    full_name      VARCHAR(64) NOT NULL,
    is_contractor  BOOLEAN,
    email          VARCHAR(64),
    CONSTRAINT employee_pk
      PRIMARY KEY (empl_code)
);

CREATE DOMAIN tm.status AS VARCHAR(8);
ALTER DOMAIN  tm.status
   ADD CONSTRAINT status_check
     CHECK (VALUE IN ('done','review','planned'));

CREATE TABLE tm.task (
    proj_id        VARCHAR(16)
      REFERENCES tm.project(proj_id),
    task_no        INTEGER,
    assigned_to    VARCHAR(16)
      REFERENCES tm.employee(empl_code),
    status         tm.status,
    name          VARCHAR(64) NOT NULL,
    CONSTRAINT task_pk
      PRIMARY KEY (proj_id, task_no)
);

CREATE TABLE tm.restricted_info (
    empl_code      VARCHAR(16)
      REFERENCES tm.employee(empl_code),
    billing_rate   INTEGER NOT NULL,
    tax_id         VARCHAR(16) UNIQUE,
    birth_date     DATE,
    CONSTRAINT restricted_info_pk
      PRIMARY KEY (empl_code)
);

INSERT INTO tm.project VALUES ('MEYERS',
  'Meyer''s Residence', 'insulation and winterizing');
INSERT INTO tm.project VALUES ('SSMall',
  'South Square Mall', 'two new store fronts');
INSERT INTO tm.project VALUES ('THOM-LLP',
  'Tom Thompson, LLP.', 'fix up room for new associate');

INSERT INTO tm.employee VALUES ('ADAM',
  'Adam O''Brian', FALSE, 'adam@example.com');
INSERT INTO tm.employee VALUES ('ARONSON',
  'Mary Aronson', FALSE, 'mary2@example.com');
INSERT INTO tm.employee VALUES ('SMITH',
  'Ron Smith', TRUE, 'john@example.com');
INSERT INTO tm.employee VALUES ('SMITH-A',
  'Alfred Smith', TRUE, NULL);

INSERT INTO tm.task VALUES ('MEYERS',1,
  'ARONSON','done','Purchase Materials');
INSERT INTO tm.task VALUES ('MEYERS',2,
  'SMITH','review','Strip Wall Paint');
INSERT INTO tm.task VALUES ('MEYERS',3,
  NULL,'planned','Remove Refuse');
INSERT INTO tm.task VALUES ('SSMall',1,
  'ADAM','review','Install Slider Door');

INSERT INTO tm.restricted_info VALUES ('ARONSON',
  26, '222-22-1492', '03-01-1961');
INSERT INTO tm.restricted_info VALUES ('SMITH',
  22, '444-44-4444', '08-15-1965');



 TOC 

Index

C 
 command
   begin transaction
   delete
   insert
   interaction
   merge
   parse
   query
   select
   update
 content negotiation
 context 1, 2
   cross products 1, 2
   locators in
   parameters
   relative identifiers
   relative inserts
E 
 error
   301 Moved Permanently
   408 Request Timeout
   409 Conflict
   417 Expectation Failed
   501 Not Implemented
F 
 facet
 filter
   preview 1, 2
 function
   id()
   tag()
I 
 implicit conversion
L 
 locator
 locking
   explicit
   optimistic
M 
 meta data
   candidate key
   foreign key
O 
 operator
   assignment
   conjunction
   distinct-from
   equality
   implies
   negation
   regex
 operator precedence
 output format
   Comma Separated Variable
   eXtensible Markup Language
   HyperText Markup Language
   YAML Ain't Markup Language
P 
 parameter filter
 percent encoding
   preview
R 
 resources
   application defined
   style sheets
S 
 selector
   column pivot
   functions in
   nested
   wild
 specifier
   facet
   plural 1, 2
   singular
   transitive


 TOC 

Authors' Addresses

  Clark C. Evans
  Prometheus Research, LLC.
  315 Whitney Ave.
  New Haven, CT 06511
  US
Phone:  +1 734 418 8644
Email:  info@clarkevans.com
URI:  http://clarkevans.com
  
  Kirill Simonov
  Prometheus Research, LLC.
  18A, Tamanskiy pr.
  Donetsk, CT 83048
  UA
Phone:  +380.622.583571
Email:  xi@resolvent.net
URI:  http://resolvent.net