TOC |
|
HyperText Structured Query Language (HTSQL) is an extension to the HTTP/1.1 protocol that allows clients to remotely access a standard SQL database. This extension provides a mapping of HTTP requests into SQL statements, producing a response that corresponds to the result generated by the database. This document defines a URI scheme used for this mapping, together with a coherent set of HTTP methods, headers, and entity body formats.
1.
Introduction
1.1.
Audience
1.2.
Objective
2.
Preview
2.1.
Predicate Expressions
2.2.
Selectors and Functions
2.3.
Specifiers
2.4.
Locators and locations
2.5.
Aggregates
2.6.
Commands
2.7.
Path Contexts
2.8.
Transactions and Locking
2.9.
Resources and Formats
2.10.
HTML FORM Compatibility
3.
References
3.1.
Normative References
3.2.
Informative References
Appendix A.
Collected ABNF for URI
Appendix B.
Sample Database Schema
§
Index
§
Authors' Addresses
TOC |
HyperText Structured Query Language (HTSQL) is a mechanism for accessing industry standard SQL [ISO9075‑1992] (International Standards Organization, “Database Language SQL, 1992,” July 1992.) data sources over HTTP [RFC2616] (Felding, R., Gettys, J., Mongul, J., Frystyk, H., Mastiner, L., Leach, P., and T. Berners-Lee, “Hypertext Transfer Protocol -- HTTP/1.1,” June 1999.). This document specifies a URI scheme [RFC3986] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifiers (URI): Generic Syntax,” January 2005.), HTTP methods and extensions which enable database access from standard web browsers. The principal advantage of HTSQL is the expression of queries in a concise web-friendly syntax: for for common database tasks, path-based URIs are both simple and intuitive. A secondary advantage of HTSQL is the integrated use of the HTTP protocol to provide authentication, data caching, encryption, content negotiation, and numerous other network operations. This approach to SQL-over-HTTP puts forth a reusable, application independent, and testable middleware layer which translates HTTP requests into SQL statements, returning the execution results to the user in a format their user-agent can handle.
/----------------\ HTTP Request /-------------------------\ | Web Browser | ----------------> --. | | * Direct URLs | URI, headers, | \ .--> Generated | | * HTML / XSLT | post/put body | . / SQL Query | | * Javascript | | HTSQL | | | GUI Front End | HTTP Response | / ^ v | | * Java / C# | <--------------- <--. \ DATABASE | | * Python, etc | status code, | Query | | | Excel /w Macro | headers, and | Results <..../ | | Command Line | formatted | | \----------------/ results \-------------------------/
An HTSQL processor accepts an HTTP Request, converts this request into an SQL query, executes this query against a relational database, and returns the result in a file format requested by the user agent.
TOC |
The target audience for HyperText Structured Query Language is not career programmers, nor is it casual users. HTSQL is designed for technical users including screen designers, database administrators, statisticians, medical researchers, and other "accidental programmers". HTSQL advances a human-friendly URI-based query syntax over traditional SQL queries, and HTTP over a more typical database access protocol. HTSQL puts a relational database "on the web" where it can be directly accessed by these expert users without wading through an application's chrome. HTSQL offers the following:
URIs are instantly familiar to users who have been using the web for many years; human-readable URIs provide direct control over database information which is often lacking in traditional systems.
A database accessable via a web browser with persistent URIs allows query results to be bookmarked and emailed to collaborators; such an interface also enables easy navigation using web browser controls.
HTSQL offers greater flexibility than a purely graphical user interface, which necessarly limits the kinds of retrievals that can be specified. With even moderate exposure, an advanced user can learn to modify a URI to achieve results beyond what a graphical interface may provide.
HTSQL builds upon existing standards. By using standard HTTP, the database access protocol need not be burdened with authentication, encryption, signatures, cache control, content types, detailed audit-trails, or other network-related issues.
HTSQL provides an additional layer for security policy enforcement. Since most database interactions can be expressed with an HTSQL request, direct access to the database is not required. HTSQL as middleware can be used to restrict access and monitor activity. HTSQL also provides a row-level permission mechanism.
Implicit in this approach is a compromise. While the majority of common database tasks can be expressed in human-readable URI format, a few will necessarly be elusive. In those uncommon cases, client-side processing or a server-side view/procedure is an alternative. While HTSQL is targeted for occassional programmers, it is also allows career programmers to quickly develop and deploy loosely coupled applications.
TOC |
While there are many approches for database access over the web, few of them directly support relational databases. Of those, most implementations are application specific, fail to achieve even basic coverage of SQL constructs, or do not take URIs and HTTP protocols into consideration. HTSQL aims to be generic, documented, and supportable. By providing a hybrid that is both application independent and human readable, HTSQL realizes the potential synergy of HTTP and SQL, putting the database itself "on the web".
HTSQL should support most SQL-92 operations, although full coverage of SQL functionality is explicitly not a goal. HTSQL will provide an extension mechanism for functions and commands as well as working well with server-side triggers, stored procedures and views.
HTSQL must support modern relational databases such as PostgreSQL, MySQL, SQLite, and even commercial offerings such as Oracle, Microsoft SQL Server, and IBM's DB2. It is not a goal to provide support for every feature of those databases: back-end portability across databases is desired.
For all but the most complicated database interactions, the corresponding URIs in HTSQL must be easy to read and understand. Since HTSQL is meant to be used by casual programmers, error messages must be informative and layered so that casual users are not frightened and expert users are given the details they need.
HTSQL must use HTTP/1.1 features for well-known operations such as authentication, caching, range requests and content negotiation; HTSQL should extend or augment existing HTTP/1.1 methods only when necessary.
HTSQL must allow for fine-grained access permissions as allowed by SQL-92, mapping application users onto specific database accounts and using database roles. HTSQL must permit further restrictions beyond those permitted by SQL and should work well with server-side trigger based checks.
HTSQL requests must be easily expressable from a web browser via an XSLT stylesheet or a Javascript expression. Further, client access libraries should be made available to make use of HTSQL easy within common programming languages such as C#, Python, Ruby, Perl and Java.
HTSQL must support the standard [HTML] (, “,” 2004.) FORM element for common database operations using HTTP/1.0, requiring full HTTP/1.1 only when necessary.
HTSQL should support a wide variety of query result formats, including JavaScript Object Notation [JSON] (, “,” 2004.) and the eXtensible Markup Language [XML] (, “,” 2004.) for standard Javascript and DOM enabled web browsers, as well as Comma Separated Variable [CSV] (, “,” 2004.) for spreadsheets and data analysis tools.
HTSQL should minimize configuration using catalog detail available in the database's information schema; basic functionality should not require supplementary information. Ideally, HTSQL should work out-of-the-box with existing database system deployments.
HTSQL should minimize server-side state; following as much as possible the principles of representational state transfer, [REST] (Fielding, R., “Architectural Styles and the Design of Network-based Software Architectures.,” 2000.). To allow usage of HTSQL from a browser's location bar, database updates may be submitted with the GET method; however, this relaxation can be disabled.
The HTSQL specification should permit usage of SQL-99 [ISO9075‑1999] (International Standards Organization, “Database Language SQL, 1999,” September 1999.) and SQL-2003 [ISO9075‑2003] (International Standards Organization, “Database Language SQL, 2003,” August 2003.) constructs when possible, but should not require these features for operation. HTSQL must support [UNICODE] (, “,” 2004.) and use [ISO8601] (, “,” 2004.) style dates.
TOC |
As a prelude to the formal specification, we provide a taste of HTSQL by presenting a limited set of URIs, associating each URI with an equivalent SQL expression. We will use, in sequence, three schemata that build upon each other to form a comprehensive task management system. The first example schema, op, deals with organizations, projects, and people.
+-------------------+ +-------------------+ | OP.PROJECT | | OP.ORGANIZATION | +-------------------+ +-------------------+ | prj_id PK |--\ /---| org_id PK |---\ ---\ | name NN,UK | | | | name NN | | | | status NN,CK | | | | is_active | | | | client FK |>---------/ | division_of FK |>--/ | | start_date | | . +-------------------+ . | | description | | . . | +-------------------+ | project is an organization | .| related to may be a division | a project has zero . | at most one of a larger | or more people who | organization organization | participate in it | | | +-------------------+ | +-------------------+ | | OP.PERSON | | | OP.PARTICIPATION | | +-------------------+ | +-------------------+ | | org_id FK,PK1 |>-------------/ | project FK,PK1 |>-/ | nickname NN,PK2 | . | _ppl_seq FK,PK2 |>-------| _ppl_seq NN,UK | . | billing_rate | . | full_name NN | each person +-------------------+ . | email UK | is part of . +-------------------+ exactly one a person participates in organization zero or more projects PK - Primary Key FK - Foreign Key NN - Not NULL UK - Unique Key CK - Check Constraint
This op schema supports several topological structures. We have trunk tables, organization and project. The organization table demonstrates a self-reference recursion and acts acts as a category to the project table. The person table is a branch of organization, often representing a "part-of" relationship. The participation table is a cross-product of project by person.
In this diagram, each table's primary key is marked with PK and any unique key is marked with UK. These candidate keys are a combination of columns which uniquely identify each row in the given table. Relationships between tables, drawn with a line, represent foreign key constraints. A foreign key defines a correspondence between rows in the source table (marked with FK) and rows in the referenced table. Mandatory columns are indicated NN; primary key columns are mandatory as well.
TOC |
An HTSQL request URI typically starts with a single table (optionally prefixed by a schema). Following a table expression is an optional filter expression, denoted with a question mark, to limit the rows returned.
/organizationThis request selects all rows from the organization table. By default, rows are ordered by the primary key, in this case, org_id. If the table name is not unique within the database catalog, the schema specifier can be included; since the organization table is within the op schema, this request could also be written as /op:organization.
SELECT * FROM op.organization ORDER BY org_id; op:organization --------------------------------------------------------------- org_id | name | is_active | division_of ------------+-------------------------+-----------+------------ acorn | Acorn Architecture | true | lake-apts | Lake Shore Apartments | true | lakeside lake-carmen | Lake Carmen Towers | true | lakeside lakeside | Lake Side Partners, LLC | | meyers | Meyers Construction | true | meyers_elec | Meyers Electric | false | meyers smith | Rudgen, Taupe, & Smith | false |
To provide a more traditional expression grammar, HTSQL's URI scheme diverges significantly from the common quoted-printable a=foo&b=bar URI scheme. Although a conjunction of key/value pairs is adequate for many uses, it is too limited for general query representation. Hence, HTSQL uses quoted literals instead of relying upon position and does not assume a top-level conjunction.
This request returns rows from person with a nickname of 'jack'. Following SQL conventions, string literals such as 'meyers' are single-quoted to distinguish them from column references, such as nickname. Like SQL, if a single-quote occurs in a value, it is escaped by doubling up, e.g. 'O''Mally'.
SELECT * FROM op.person WHERE nickname = 'jack' ORDER BY org_id, nickname op:person ------------------------------------------------------------ org_id | nickname | full_name | email -------+----------+----------------+------------------------ meyers | jack | Jack C. Meyers | jack.meyers@example.com smith | jack | Jack Taupe | jack.taupe@example.comWithin a single-quoted literal value, non-printable characters may be represented using [RFC3986] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifiers (URI): Generic Syntax,” January 2005.) precent-encoding. As such, the percent sign (%) must be written as %25. To be resilient against arbitrary percent-encoding done by a user-agent, even if encoded, a single-quote occurring in a literal value must still be escaped by doubling -- simply percent encoding is not sufficient. Hence, 'We''re already running at 110%25 capitan!'
/project?status!='abandoned'&name~'roof'In HTSQL, the conjunction (boolean AND) is indicated with the ampersand (&). This particular request returns returns rows from project where the status does not equal 'abandoned' and where the project's name matches a case-insensitive [POSIX_1003.2] (, “,” 2004.) extended regular expression 'roof'. Case sensitive regular expressions are indicated by doubling-up the tilde (~~).
SELECT * FROM op.project WHERE status != 'abandoned' AND LOWER(name) LIKE '%roof%' ORDER BY prj_id; project -------------------------------------------------------+ ... prj_id | name | status | -------+---------------------------------+-------------+ ... la-334 | Siding / roof at 334 Ocean Blvd | completed | lt-802 | Toaster Re-Do and Roof Leak | in-progress |Since comparison using SQL92's LIKE or SQL2003's SIMILAR-TO is easily represented as a regular expression, HTSQL makes no provision to support these operators. For databases that do not natively support regular expressions, common cases can be translated into a corresponding SQL construct as shown in the example above.
/project?status='planned'|status='in-progress'Alternation (boolean OR) is indicated with the vertical bar (|). When typing this URL into a standard browser, the vertical bar may be converted into its percent-encoded equivalent, (in this case %7C). Although this may hinder readability, it does not impact the interpretation of the URL. More generally, any character in HTSQL may be percent encoded without changing its meaning.
SELECT * FROM op.project WHERE status = 'planned' OR status = 'in-progress' ORDER BY proj_id project --------------------------------------------------+ ... prj_id | status | client | start_date | ---------+-------------+-------------+------------+ ... lt-711 | planned | lake-carmen | | lt-802 | in-progress | lake-carmen | 2006-11-23 | overhead | in-progress | | 2003-06-12 | ...The any() macro provides a syntax shorthand for applying a comparison operator over a list of values. The example above could be equivalently written: /project?status=any('planned','in-progress'),
/organization?is_active&division_of!=='meyers'This particular request returns organization records that are active and are not a division of 'meyers'. In HTSQL, boolean valued expressions don't need a corresponding comparison operator, hence is_active can be used directly. Note this result set includes rows where division_of is null.
SELECT * FROM op.organization WHERE is_active AND division_of IS DISTINCT FROM 'meyers' ORDER BY org_id; organization ------------------------------------------------------------ org_id | name | is_active | division_of ------------+---------------------+-----------+------------- acorn | Acorn Architecture | true | lake-carmen | Lake Carmen Towers | true | lakeside meyers | Meyers Construction | true | shoe | Rwyler's Shoes | true |Observe that the distinct (!==) and not-distinct (==) operator differ from equality in that they treat NULL as a comparable value. If the example above would have used not-equal, division_of!='meyers', then only one row, with lakeside would have been returned.
HTSQL uses implicit conversion when a non-boolean expression is used in a predicate without a comparison operator. This particular example returns project rows where the existence of a client implies that the project has not yet started. This operator makes it easier to express consequential logical operations in a clear manner.
SELECT * FROM op.project WHERE NOT (client IS NOT NULL AND client != '') OR (NOT (start_date IS NOT NULL)) ORDER BY prj_id project ------------------------------------------------------------- prj_id | client | start_date | name ---------+-------------+------------+------------------------ lt-711 | lake-carmen | | Updating Fire Escape overhead | | 2003-06-12 | General Management WorkExpressions which evaluate to the empty string, to a zero value, or to NULL are considered false. In the example above the text-valued column client is equivalent to (!isnull(client)&client!=''). This particular example also has the implies operator, where a->b is syntax sugar for !a|b.
/project?(start_date<'2004-04-01'|start_date>'2006-09-01')&clientParentheses can be used to group boolean expressions. This request returns projects that have not only been assigned to a client, but also either happen before '2004-04-01' or after '2006-09-01'
SELECT * FROM op.project WHERE (start_date < '2004-04-01' OR start_date > '2006-09-01') AND (client IS NOT NULL AND client != '') ORDER BY prj_id project ----------------------------------------------- prj_id | start_date | client | status -------+------------+-------------+------------ la-102 | 2004-03-27 | lake-apts | completed lt-802 | 2006-11-23 | lake-carmen | in-progressAs in SQL, conjunction has higher precedence than the alternation. The parentheses above are necessary; if omitted, it would return all projects before 2004-03-27 including ones not assigned to a client.
With moderate exposure to these sorts of URIs and the corresponding results, the occasional programmer should be able to combine operators in a generative manner without assistance. Filter expressions can be combined with boolean operators and grouped with parentheses to generate arbitrarily complex predicates.
TOC |
A sequence of expressions enclosed in curly brackets, called a selector, represents a set of correlated columns from related tables. When a selector immediately follows a table reference, it specifies which columns should be returned. Each expression a selector may be followed by a plus or a minus sign to indicate an ascending or descending sort order.
/project{status+,start_date-,name}The selector above names three columns, ordered ascending by status and then descending by start_date. A third column, name, is returned, but is not used for sorting. HTSQL makes no provision to order results by columns that are not returned, or to list columns in an order that differs by their appearance in the sort criteria.
SELECT status, start_date, name FROM op.project ORDER BY status ASC, start_date DESC, prj_id ASC project ----------------------------------------------------------- status | start_date | name ------------+------------+--------------------------------- abandoned | 2006-08-03 | Smith Associate Window and Roof completed | 2005-08-09 | Smith Entry and Waiting Room completed | 2005-02-03 | Smith Balcony Expansion ...
/{today(),null(),true(),false(),pi()}In HTSQL, functions, such as today() as well as many constants, such as null(), are represented in standard functional notation. In the request above, a table identifier is absent, so a single row is returned with the expression requested.
SELECT CURRENT_DATE, NULL, TRUE, FALSE, PI() ------------------------------------------------------ today() | null() | true() | false() | pi() ------------+--------+--------+---------+------------- 2008-05-22 | | true | false | 3.14159265359
/project{prj_id,string(start_date)[-5:].replace('-','/')}HTSQL has full complement of string, numeric, and date functions which are translated into equivalent SQL. Most are direct translations, however, some expressions such as the slice operator (inspired from Python), have a more succinct syntax than the corresponding SQL.
SELECT prj_id, REPLACE( SUBSTRING(CAST(start_date AS TEXT) FROM (((- 5) + ( CASE WHEN (- 5) < 0 THEN CHARACTER_LENGTH(CAST(start_date AS TEXT)) ELSE 0 END))+1) ), '-', '/') FROM op.project ORDER BY prj_id project | -------------------------------------------------------+- prj_id | string(start_date)[(-5):].replace('-', '/') | ---------+---------------------------------------------+- la-102 | 03/27 | la-334 | 04/20 | ...
/project{prj_id,recode(status,'completed','done')}HTSQL as full support for SQL's CASE statement depending upon the syntax variant. The if() function corresponds to the general form, while switch() is translated into the initial expression variant. To reduce expression duplication error in request construction, HTSQL introduces an additional recode function that returns the initial expression in the case where it is not matched.
SELECT prj_id, (CASE status WHEN 'completed' THEN 'done' ELSE status END) FROM op.project ORDER BY prj_id project | -------------------------------------------------+- prj_id | recode(status, 'completed', 'done') | -----------+-------------------------------------+- Bowl-Shoes | planned | la-102 | done |
/project{prj_id,start_date+time('03:30')+timedelta(365*4,1,5)}Standard date/time arithmetic is supported, with direct translations into the SQL equivalent. A numeric form of for timedelta is provided for convenience for use with contexts requiring arithmetic operations.
SELECT prj_id, ((start_date + TIME '03:30:00')+ CAST(( CAST(365 * 4 AS TEXT) || 'D ' || LPAD(CAST(1 AS TEXT), 2, '0') || ':' || LPAD(CAST(5 AS TEXT), 2, '0') || ':' || '00' ) AS INTERVAL)) FROM op.project ORDER BY prj_id project --------------------------------------------------------------- prj_id | ((start_date+time('03:30'))+timedelta((365*4), 1, 5)) --------+------------------------------------------------------ la-102 | 2008-03-26 04:35:00 la-334 | 2008-04-19 04:35:00 ...
/participation{floor(billing_rate div 2),(billing_rate mod 3)}For division and modulo a syntax exception is needed. The usual forward-slash (/) cannot be used since it designates path segments, while the percent sign (%) is inconvenient due to it's use as an escape character. To work around this syntactical inconvenience, HTSQL follows XSLT's usage of using the div and mod keywords respectively.
SELECT FLOOR(billing_rate / 3), billing_rate % 3 FROM op.participation ORDER BY ... participation | ---------------------------------------------------+- floor((billing_rate div 3)) | (billing_rate mod 3) | ----------------------------+----------------------+- 18 | 1.00 | 23 | 0.00 | ...The equivalent SQL query shown above is not completely accurate; in particular, additional default outer joins are not shown and the default order by clause requires additional explanation.
TOC |
HTSQL provides a mechanism for referencing columns not only in the current table, as seen in prior examples, but also columns from related tables. In SQL, relationships between tables are declared with a foreign key constraint; a specifier associates rows from related tables by joining on these constraints. Specifiers are written as sequence of column and/or table names separated by periods, each period representing a join. Specifiers with two or more periods reflect a transitive join, forming a path from one table to another through intermediates.
This request returns project rows that are assigned to an active client. Because there is a foreign key reference from client in the project table to org_id of the organization table, the HTSQL processor automatically constructs the appropriate join logic.
SELECT p.* FROM op.project AS p LEFT OUTER JOIN op.organization AS o ON (p.client = o.org_id) WHERE o.is_active IS TRUE ORDER BY prj_id project ------------------------------------------------- ... prj_id | status | client | start_date | -------+-------------+-------------+------------+ ... lt-711 | planned | lake-carmen | | lt-802 | in-progress | lake-carmen | 2006-11-23 |In HTSQL, these joins are singular, meaning that for every row of the driving table project there is at most one row from the linked table, organization. With the default HTSQL meta-data configurator, when only one foreign key reference exists to a table, the name of the referenced table (organization) can be used as well as the referencing column (client), e.g., /op:project?organization.is_active.
/participation?person.organization.is_activeThis request returns participation records for people who are in active organizations. This is accomplished by a transitive application of two joins, one from participation to person, and then onto organization. The resulting link chain is also singular since both sub-links are singular.
SELECT x.* FROM op.participation AS x LEFT OUTER JOIN op.person AS p ON (x._ppl_seq = p._ppl_seq) LEFT OUTER JOIN op.organization AS o ON (p.org_id = o.org_id) WHERE o.is_active IS TRUE ORDER BY t.project, o.org_id, p.nickname participation ------------------------------------- project | person | billing_rate ---------+-------------+------------- la-102 | meyers.hill | 55.00 la-102 | meyers.jack | 69.00 ...The equivalent SQL query and result set shown above are not completely accurate; in particular, the person column and the order by clause require additional explanation covered in the next section.
/person?organization.division_of.org_id='lakeside'Of course, a transitive join to the same table is possible. This request returns people who are in an organization that is a division of 'lakeside'.
SELECT p.* FROM op.person AS p LEFT OUTER JOIN op.organization AS o1 ON (p.org_id = o1.org_id) LEFT OUTER JOIN op.organization AS o2 ON (o1.org_id = o2.org_id) WHERE o2.is_active IS TRUE ORDER BY p.org_id, p.nickname person ---------------------------------------------+- org_id | nickname | full_name | email | ----------+----------+---------------+-------+- lake-apts | tom | Tommy O'Mally | | ...A future expansion for HTSQL may provide a syntax and corresponding translation for recursive queries, as implemented by SQL's WITH clause. Suggestions are welcome since this is a particularly useful, but very complicated construct to specify.
In a manner like SQL, all columns can be requested in a selector using the asterix (*). In this case, the columns returned are prefixed using the specifier of the join construct.
SELECT p.*, o.org_id AS "organization.empl_code", e.name AS "organization.name", e.is_active AS "organization.is_active", e.division_of AS "organization.division_of" FROM op.project AS p LEFT OUTER JOIN op.organization AS o ON (o.org_id = p.client) ORDER BY p.prj_id project ------------------------ . --------------------- ... prj_id | status | . |organization.org_id ... ---------+-------------+ . +-------------------- ... la-102 | completed | . |lake-apts ... la-334 | completed | . |lake-apts ... lt-711 | planned | . |lake-carmen ... ...
/project{prj_id,status,client{name,is_active}}In this example, both the name and is_active columns from the organization table are requested. Using a nested selector in this manner avoids duplicating table references. The selector client{name,is_active} is short-hand for client.name,client.is_active.
SELECT p.prj_id, p.status, o.name AS "client.name", o.is_active AS "client.is_active" FROM op.project AS p LEFT OUTER JOIN op.organization AS o ON (p.client = o.org_id) ORDER BY p.prj_id project --------------------------------------------------------------- prj_id | status | client.name | client.is_active ---------+-------------+--------------------+------------------ lt-711 | planned | Lake Carmen Towers | true lt-802 | in-progress | Lake Carmen Towers | true ...
/organization?project.status='completed'This request returns organizations that are associated with at least one completed project. Since there may be more than one project for a given organization, HTSQL interprets the this request as an implied existence test.
SELECT o.* FROM op.organization AS o WHERE EXISTS (SELECT * FROM op.project AS p WHERE p.client = o.org_id AND p.status = 'completed') ORDER BY e.org_id organization ------------------------------------------------------------ org_id | name | is_active | division_of ----------+------------------------+-----------+------------ lake-apts | Lake Shore Apartments | false | lakeside smith | Rudgen, Taupe, & Smith | false | ...Unlike the previous examples, the cardinality of this join is plural: there may be more than one project associated with given organization. When used in this manner, such a specifier is called plural and it checks for existence of at least one matching row.
HTSQL's specifier mechanism enables intuitive construction of complicated join criteria. Not only are singular (one-to-one or many-to-one) joins allowed, but plural (one-to-many) joins are also supported.
TOC |
HTSQL provides explicit support for selecting particular rows of a given table using primary key columns. When using this syntax, each value associated with a primary key column is called a label, and a dotted sequence of labels is called an location. Labels are compared via usual equality as described above. HTSQL uses square brackets to enclose a comma-separated list of locations that locate rows within the database. A sequence of locations enclosed in square brackets is called a locator.
/organization[meyers]This request returns the meyers organization. This locator syntax form provides a short-hand for filtering by primary key and is indentical to the more formal /op:organization?org_id='meyers'.
SELECT * FROM op.organization WHERE org_id = 'meyers' ORDER BY org_id proj_id | name | description ---------+--------------------+----------------------------- MEYERS | Meyers' Residence | insulation and winterizing
/person[meyers.jim]The full-stop (.) is used used to separate labels in cases where the primary key includes more than one column. This example is identical to /op:person?organization.org_id='meyers' &nickname='jim'
SELECT p.* FROM op.person AS p WHERE p.org_id = 'meyers' AND p.nickname='jim' ORDER BY p.org_id, p.nickname person ------------------------------------------------------- org_id | nickname | full_name | email -------+----------+------------+----------------------- meyers | jim | Jim Meyers | jim.meyers@example.com
/person[meyers.jim,meyers.hill]A comma (.) may be used to separate additional locators The example above is identical to, /op:person?organization.org_id='meyers'& (nickname='jim'|nickname='hill') and could also be expressed a bit shorter as /person[meyers.(jim,hill)]
SELECT p.* FROM op.person AS p WHERE p.org_id = 'meyers' AND p.nickname IN ('jim','hill') ORDER BY p.org_id, p.nickname person --------------------------------------------------------------+- org_id | nickname | full_name | email | -------+----------+------------------+------------------------+- meyers | hill | Mark Thomas Hill | mark.hill@example.com | meyers | jim | Jim Meyers | jim.meyers@example.com |
/person[meyers.*]Components in an identifier may be unknown, replaced instead with the wildcard (*). Any component in a location may be set to a wildcard.
SELECT p.* FROM op.person AS p WHERE p.org_id = 'meyers' ORDER BY p.org_id, p.nickname person | ---------------------------------------------------------------+- org_id | nickname | full_name | email | -------+----------+------------------+-------------------------+- meyers | hill | Mark Thomas Hill | mark.hill@example.com | meyers | jack | Jack C. Meyers | jack.meyers@example.com | meyers | jim | Jim Meyers | jim.meyers@example.com | ...
This row locator can returned using a built-in function, id() which does the appropriate construction.
SELECT (p.org_id || '.' || p.nickname) AS "id()", p.* FROM op.person AS p ORDER BY p.org_id, p.nickname person --------------------------------------------------------- ... id() | org_id | nickname | full_name | --------------+-----------+----------+------------------+ ... acorn.hideo | acorn | hideo | WATANABE Hideo | lake-apts.tom | lake-apts | tom | Tommy O'Mally | ...The id() function is a bit more complicated than this since the locator production is limited to word characters and the dash (-) and underscore (_). Labels which do not match this production must be single-quoted.
Identifiers provide a handy notation for resource location: they are concise and unique. Identifiers can be explicitly requested in the selector using id(); furthermore, they can be used within a locator to return exactly one row.
TOC |
In HTSQL, a command denoted by parentheses may be used in the last (right most) path-segment of the URI to invoke a specific database operation or extension function. In the previous examples, the command select() was implicit.
/op:project{proj_id,task.task_no}If a plural specifier (which causes a one-to-many join) is used within a selector, an ARRAY is returned. This example returns a row for each project, and for each row, an array of associated task numbers.
SELECT p.proj_id, ARRAY(SELECT t.task_no FROM tm.task t WHERE t.proj_id = p.proj_id ORDER BY t.proj_id, t.task_no) AS "task{task_no}" FROM op.project AS p ORDER BY p.proj_id; proj_id | task{task_no} ---------+----------------------------- MEYERS | {1,2,3} SSMall | {1} THOM-LLP | {} ...
/tm:employee{full_name,isnull(email),count(task)}Functions may be used in the context of a selector. In this example, isnull() is a scalar function while count() is an aggregate function. Aggregate functions may be applied to plural specifiers, such as task.
SELECT e.full_name, (e.email is null) AS "isnull(email)", count(t.*) AS "count(task)" FROM tm.employee AS e LEFT OUTER JOIN tm.task AS t ON (t.assigned_to = e.empl_code) GROUP BY e.empl_code, e.full_name, (e.email is null) ORDER BY e.empl_code; full_name | isnull(email) | count(task) ----------------+---------------+------------- Adam O'Brien | FALSE | 23 ... Alfred Smith | TRUE | 0 ...In general, most SQL operations such as IS NULL are available in HTSQL. However, we use a simplified function notation that is more familiar to occasional programmers; our syntax includes sequential argument passing an optional keyword parameters.
/tm:task{assigned_to,@status|count(*)}Data pivoting can be requested using the asterix (@) immediately preceding a column, such as status. In this case, distinct values in that column become the headers in the result set, and remaining columns become the values within respective buckets.
: SELECT t.assigned_to, SUM(CASE WHEN t.status = 'done' THEN 1 ELSE 0 END) AS "done", ... SUM(CASE WHEN t.status = 'review' THEN 1 ELSE 0 END) AS "review", FROM tm.task AS t GROUP BY t.assigned_to ORDER BY t.assigned_to; assigned_to | done | . | review ------------+------+ . +-------- ADAM | 0 | . | 1 ARONSON | 4 | . | 2 ... (NULL) | 0 | . | 0This translation need not be done in SQL as shown above, since a pivot operation is strictly a visualization mechanism. The equivalent SQL above reflects an idiom for this sort of item. If more than one summary calculation or pivot is requested, there may be 2 or more header rows.
If a plural specifier naming a table is mentioned in a selector, then an ARRAY of associated identifiers is generated. In this example, a row for each employee is returned, and for each row, an array of associated task identifiers is listed.
SELECT e.empl_code, ARRAY(SELECT (t.proj_id || '.' || t.task_no) FROM tm.task t WHERE t.assigned_to = e.empl_code ORDER BY t.proj_id, t.task_no) AS "task" FROM tm.employee AS e ORDER BY e.empl_code empl_code | task ----------+----------------------------- ADAM | {ssmall.1, ... } ARONSON | {meyers.1, meyers.5, ... } ... SMITH-A | {} ...
Identifiers provide a handy notation for resource location: they are concise and unique. Identifiers can be explicitly requested in the selector using id(); furthermore, they can be used within a locator to return exactly one row.
TOC |
In HTSQL, a command denoted by parentheses may be used in the last (right most) path-segment of the URI to invoke a specific database operation or extension function. In the previous examples, the command select() was implicit.
/tm:employee/select(offset=10,limit=2)The explicit select() command has two optional keyword/value arguments which can be used to return a sliding window over a result set. For this particular example, the result set starts at the 11th row and returns 2 rows.
SELECT * FROM tm.employee ORDER BY empl_code OFFSET 10 LIMIT 2 /* PostgreSQL Syntax */ empl_code | full_name | is_contractor | email ----------+---------------+---------------+------------- HUCK | Ed Huckington | TRUE | huck@example.com JACK | Jack Winters | FALSE | jack@example.comUnfortunately, ISO SQL does not have a provision for offset/limit as implemented by PostgreSQL and other databases. However, the comments section in the SQL2003 hints that this feature is forthcoming. Since this feature is extremely useful and has implementations in just about every database, it is included in the HTSQL specification.
/op:project/insert()?proj_id:='ALBE'&name:='Alberca'Other commands for insert, update, and delete follow a similar syntax, using colon-equal (:=) to indicate an assignment. The result from an insert statement is a "201 Created" with a content body containing the URI(s) of the objects inserted.
INSERT INTO op.project (proj_id, name) VALUES ('ALBE','Alberca'); 201 Created /op:project[albe@1]For RESTful behavior, the POST method with query arguments in a standard entity body format will also work; the GET method is permitted so that standard features can be used directly in a web browser's location bar.
/op:project/update()?proj_id='ALBE'&description:='Leaky Pool'This request updates the description of the 'ALBE' project. Note that only columns using the assignment operator (:=) are changed; the remaining column references are used to limit the rows affected.
UPDATE op.project SET description = 'Leaky Pool' WHERE htsql_normalize(proj_id) = 'albe' 201 Created /op:project[albe@2]Like the insert() operation, update() returns a "201 Created" when successful, listing the URIs of the affected resources. Even though the affected rows are actually modified, they constitute a new resource: implementations could permit access to previous versions of the modified row.
/op:project/delete(expect=3)?description~'pool'If a data modification request would change more than one row (or less than one row), an expect keyword argument is needed. In this example, the request expects exactly 3 rows to be deleted.
DELETE FROM op.project WHERE LOWER(description) LIKE '%pool%'; 204 No ContentDeleting a row does not return content. If any more or fewer rows would be affected, a "417 Expectation Failed" is returned and the data modification request is aborted.
/op:project[able]/merge()?name:='SouthWest%20Alberca'The merge operation provides a succinct syntax for adding or updating a resource based on its identifier. If the row already exists, it is updated, otherwise it is created. The result of this command is a "201 Created" when the corresponding insert or update succeeds.
MERGE INTO op.project USING op.project ON htsql_normalize(proj_id) = 'albe' WHEN MATCHED THEN UPDATE SET name = 'SouthWest Alberca' WHEN NOT MATCHED THEN INSERT (proj_id, name) VALUES ('albe', 'SouthWest Alberca'); 201 Created /op:project[albe@3]The MERGE syntax above comes from SQL2003. However, the concept is so useful in a web setting that the corresponding SQL1992 transaction can be simulated if this feature is not natively supported. While HTSQL does not require spaces to be encoded as %20, many user-agents do.
/tm:task{task_no,status}/parse()?employee='adam'The parse() command returns a parse tree representing the request. It is useful for clients which provide a graphical query builder. This command is also helpful when debugging because it shows how the HTSQL server is interpreting a given URI.
<parse xmlns="http://htsql.org/2006/"> <context schema="tm" table="task"> <selector> <specifier column="task_no" /> <specifier column="status" /> </selector> </context> <operation name="parse" /> <query> <comparison type="equality"> <specifier table="employee"> <literal value="adam" /> </comparison> </query> </parse>
HTSQL provides the standard SQL commands, insert(), update() and delete(). The merge() function is particularly useful in a web environment where the status of an object is being "reset" regardless of its previous state.
TOC |
In HTSQL, more than one table can be directly mentioned in a request, provided that the tables are related (perhaps transitively) by a foreign-key relationship. Each table, together with associated selectors and parameters is called a context.
This request returns tasks, grouping by assigned employee. This particular result is possible since there is an assigned_to foreign key from task to employee. Employees which do not have corresponding tasks, or tasks which are not assigned to an employee are not returned.
SELECT e.empl_code AS "employee{empl_code}", e.full_name AS "employee{full_name}", e.is_contractor AS "employee{is_contractor}", e.email AS "employee{email}", t.proj_id AS "task{proj_id}", t.task_no AS "task{task_no}", t.assigned_to AS "task{assigned_to}", t.status AS "task{status}", t.name AS "task{name}" FROM tm.task AS t JOIN tm.employee AS e ON (t.assigned_to = e.empl_code) ORDER BY e.empl_code, t.proj_id, t.task_no; employee{empl_code} | . | task{proj_id} | task{task_no} | . --------------------+ . +---------------+---------------+ ADAM | . | MEYERS | 7 | . ADAM | . | SSMall | 2 | . ARONSON | . | MEYERS | 1 | . ...
The example above returns tasks that have been assigned to the employee aronson; duplicating information about this employee for each row. While this query is similar to /tm:task?assigned_to='aronson', it asserts that exactly one employee with identifier aronson is matched.
SELECT <all columns from employee and task> FROM tm.task AS t JOIN tm.employee AS e ON (t.assigned_to = e.empl_code) WHERE htsql_normalize(e.empl_code) = 'aronson' ORDER BY e.empl_code, t.proj_id, t.task_no employee{empl_code} | . | task{proj_id} | task{task_no} | . --------------------+ . +---------------+---------------+ ARONSON | . | MEYERS | 1 | . ...
/tm:employee?restricted_info.billing_rate>20This request returns employees who have a billing rate of more than 20. The automatic join here is possible since the restricted_info table has a foreign key which refers to the employee table.
SELECT e.* FROM tm.employee AS e JOIN tm.restricted_info AS r ON (r.empl_code = e.empl_code) WHERE r.billing_rate > 20 ORDER BY e.empl_code empl_code | full_name | is_contractor | email ----------+--------------+---------------+------------------- ARONSON | Mary Aronson | FALSE | mary2@example.com SMITH | Ron Smith | TRUE | john@example.com ...Tables, such as restricted_info, that have an optional one-to-one relationship with a primary table, like employee, are called facets. Facets allow the handling of sparse data sets and cases where particular sets of information must have additional security constraints.
Identifiers may be shortened when the context can be used to fully-qualify them. In this example, the identifier for the requested task is meyers.1.
SELECT <all columns from project and task> FROM op.project AS p JOIN tm.task AS t ON (t.proj_id = p.proj_id) WHERE htsql_normalize(p.proj_id) = 'meyers' AND htsql_normalize(t.task_no) = '1' ORDER BY p.proj_id, t.proj_id, t.task_no project{proj_id} | . | task{proj_id} | task{task_no} | . -----------------+ . +---------------+---------------+ . MEYERS | . | MEYERS | 1 | .
/op:project[meyers]/employee[aronson]/task/insert()?
task_no=4&name='Clean Up'When more than one table is provided for an insert() statement, un-ambiguous links to the rows identified in the context are assumed. In this case, the new task will be part of the meyers project and will be assigned to aronson.
INSERT INTO tm.task (proj_id, assigned_to, task_no, name) VALUES ((SELECT proj_id FROM op.project WHERE htsql_normalize(proj_id) = 'meyers'), (SELECT empl_code FROM tm.employee WHERE htsql_normalize(empl_code) = 'aronson'), '4', 'Clean Up')In HTSQL, whitespace between tokens (but not within single or double quotes) is not significant. To enhance readability, we broke the request above onto two lines.
/tm:employee{full_name}/task{project.name,task_no}?status='done'This request returns tasks that are 'done' together with detail regarding the assigned employee's full name, the project's name, and the task number. That tasks that are not assigned are not returned with this request.
SELECT e.full_name AS "employee{full_name}", p.name AS "task{project.name}", t.task_no AS "task{task_no}" FROM tm.task AS t JOIN tm.employee AS e ON (t.assigned_to = e.empl_code) JOIN op.project AS p ON (p.proj_id = t.proj_id) WHERE htsql_normalize(t.status) = 'done' ORDER BY e.empl_code, t.proj_id, t.task_no employee{full_name} | task{project.name} | task{task_no} --------------------+---------------------+-------------- Mary Aronson | Meyers' Residence | 1 ...
To suppress the automatic joins, a double-slash (//) may be used; the result is a cross product. In this example, the usual join using the foreign key assigned_to is explicitly suppressed to return all permutations of employee and task.
SELECT <all columns from employee and task> FROM tm.task AS t, tm.employee AS e ORDER BY e.empl_code, t.proj_id, t.task_no; employee{empl_code} | . | task{proj_id} | task{task_no} | . --------------------+ . +---------------+---------------+ ADAM | . | MEYERS | 1 | . ADAM | . | MEYERS | 2 | . .. ARONSON | . | MEYERS | 1 | . ...
/tm:$a:=project//$b=:project?a.proj_id[0]==b.proj_id[0]When using more than one copy of a table, a table alias created by $var:= is required. Once aliased, subsequent usage of the table is then referenced with var. This contrived example returns pairs of projects which have the same first letter in their project identifier.
SELECT a.proj_id AS "a{proj_id}", a.name AS "a{name}", a.description AS "a{description}", b.proj_id AS "a{proj_id}", b.name AS "a{name}", b.description AS "a{description}" FROM op.project AS a, op.project AS b WHERE SUBSTRING(a.proj_id FROM 1 FOR 1) = SUBSTRING(b.proj_id FROM 1 FOR 1) ORDER BY a.proj_id, b.proj_id; a{proj_id} | . | b{proj_id} | . -----------+ . +------------+ . MEYERS | . | MEYERS | . MEYERS | . | MIVDA | . MIVDA | . | MEYERS | . ...In HTSQL, an outer join is indicated with a plus ((+)) surrounded by parenthesis preceding the table name. The above request, for example, returns tasks even if they have not been assigned an employee. In these cases, the employee columns are NULL.
SELECT <all columns from employee and task> FROM tm.task AS t LEFT OUTER JOIN tm.employee AS e ON (t.assigned_to = e.empl_code) ORDER BY e.empl_code, t.proj_id, t.task_no; employee{empl_code} | . | task{proj_id} | task{task_no} | . --------------------+ . +---------------+---------------+ ADAM | . | SSMall | 2 | . ARONSON | . | MEYERS | 1 | . ... (NULL) | . | MEYERS | 3 | . ...
/tm:employee{full_name};is_contractor/(+)task{project.name,task_no}This request returns each contractor in the employee table and their associated tasks, if any. Filters on a particular context are indicated with a semi-colon (;), the context's parameters. In this case, the is_contractor filter applies to the employee table.
SELECT e.full_name AS "employee{full_name}", p.name AS "task{project.name}", t.task_no AS "task{task_no}" FROM tm.employee e LEFT OUTER JOIN tm.task AS t ON (t.assigned_to = e.empl_code) LEFT OUTER JOIN op.project AS p ON (p.proj_id = t.proj_id) WHERE e.is_contractor IS TRUE ORDER BY e.empl_code, t.proj_id, t.task_no employee{full_name} | task{project.name} | task{task_no} --------------------+--------------------+--------------- Ron Smith | Meyer's Residence | 2 Alfred Smith | (NULL) | (NULL) ...
This set of examples illustrates the large difference in readability between HTSQL URIs and the corresponding SQL. While SQL may be more expressive, for common needs, HTSQL is more succinct and understandable. The "path" based metaphor of URIs, together with the parameter syntax using the semi-colon, allows relatively common joins to be easily specified.
TOC |
HTSQL supports database transactions and record locking. Besides explicit row locking, bulk updates and optimistic locking are possible. Optimistic locking is accomplished by returning a row version; and then updating a row only when this version matches. Pessimistic locking requires either a session mechanism as provided by the application or its transaction is limited to the scope of the current HTTP connection (using "Connection: Keep-Alive" for HTTP/1.0 clients).
/tm:task{id(),tag(),idtag(),name}The tag() function returns, for each row, a string value that can be used to test when an update has occurred. In a subsequent request, if the value of tag() changes, then the row has been updated. The return value of tag() is usually a row version number or a timestamp that is changed on each update. The idtag() function returns a row's identifier and tag as a single value, separated with an at-sign (@).
SELECT (t.proj_id || '.' || t.task_no) AS "id()", <tag> AS "tag()", (t.proj_id || '.' || t.task_no || '@' || <tag>) AS "idtag()", t.name AS "name" FROM tm.task AS t ORDER BY t.proj_id, t.task_no;; id() | tag() | idtag() | name ---------+-------+------------+------------------- MEYERS.1 | 3 | MEYERS.1@3 | Purchase Materials MEYERS.2 | 7 | MEYERS.2@7 | Strip Wall Paint ...Since the notion of a tag is not specified by SQL92, its implementation may be dependent upon the specific database, schema, and configuration used. The only constraint placed upon tag() is that all future versions of updated rows must return a different value. If the HTSQL processor is unable to fulfill these requirements, it must respond with "501 Not Implemented".
This request attempts to return version 7 of the task identified by meyers.2. As above, if the task does not exist, then a "404 Not Found" is returned. If the tag() of the requested row has since changed, a "301 Moved Permanently" is issued, giving the location of the current version.
301 Moved Permanently /tm:task[meyers.2@8]
/op:project[meyers@2]/update()?name:='Changed%20Name'If an update or delete operation is applied to an object that has the incorrect version, then the update request would fail with a "409 Conflict". This message will include a reference to the most recent version, as well as a body with the change history (if available) since the version requested.
409 Conflict /op:project[albe@3]
The interaction() operation takes a POST body of type "text/htsql", each line being an HTSQL request. The requests are executed together as part of a single transaction; if any request fails with a 4xx or 5xx result code, then the entire transaction is rolled back. Following is an example request body which inserts a project and three tasks as a single transaction.
/op:project[waterfall]/insert()?name:='Waterfall Example' /op:project[waterfall]/task[1]/insert()?name:='Feasibility' /tm:task[waterfall.2]/insert()?name:='Analysis Work' /tm:task[$(project).3]/insert()?name:='Design Phase'Within text/htsql document, variables like $(table) resolves to the identifier for the most recent insert or merge statement of the given table. Hence, $(project) in this example is waterfall.
/op:project[waterfall]/task/insert()Bulk inserts can also be performed using a CSV file in the POST body. The first line of the file gives the columns, and remaining lines correspond to each row being insert. In this case, the project, waterfall is obtained from the parent context and doesn't need to be repeated in the CSV file.
task_no,status,name 4,planned,Implementation 5,planned,Testing 6,planned,DeploymentA similar file can be provided for update() or merge(), however, these actions require at least one id() or idtag() column to uniquely locate the row being affected.
This starts a transaction. An active transaction can be finished with either commit() or rollback(). The transaction is automatically abandoned if the current HTTP connection or the active session ends.
BEGIN TRANSACTION 204 No Content
/op:project/select(lock='update nowait')This select parameter places a row lock on the returned rows. If a lock cannot be obtained, it immediately returns a "408 Request Timeout" error. This command must be preceded with a begin().
SELECT * FROM op.project FOR UPDATE NO WAIT ORDER BY proj_id 408 Request TimeoutLocking for update is assumed when either lock="wait" or lock="nowait". An explicit lock is dropped after an explicit commit() or rollback().
HTSQL provides for standard optimistic locking, bulk transaction operations, as well as standard transactions with explicit locking.
TOC |
Normally, when a web browser makes a request, it provides a list of result formats it will accept. HTSQL provides mechanisms to override this content negotiation and explicitly request a particular result format. When a table (with optional locator and selector) is followed by an extension, the output format is determined via typical association to well-known MIME types. If the extension is not known, it is assumed to be text/ext where ext is the extension provided in the URI.
Further, a path segment starting with the resource indicator (~) means that the request is not to be processed by HTSQL, but instead should be delegated to an underlying application.
/op:project[meyers]{id(),name}.csvThe result format can be specified as an argument to select or other operations. In this case, the standard comma-separated variable format is chosen. The output format can also be provided as an argument to the select(format='text/csv') command.
id(),name MEYERS,Meyers' ResidenceIn the CSV format, results exactly match the SQL query (and are appropriately de-normalized). The double-quote is a delimiter used if a column value contains a comma; two adjacent double-quotes are used to escape a double-quote occurring in the query results.
/op:project{name}/employee{}/task{task_no,status}.xmlIn the eXtensible Markup Language ("XML") format, result elements are named after the table they represent. For each path context, a hierarchical relation is established, duplicating intermediate nodes (such as "aronson" above) as required. If a given table or column is not a valid element or attribute name, there are work-arounds using the htsql namespace.
<?xml version="1.0" encoding="utf-8"?> <htsql:result xmlns:htsql="http://htsql.org/2006/" htsql:schema="tm"> <project htsql:id="meyers@3" name="Meyer's Residence"> <employee htsql:id="aronson@11"> <task htsql:id="meyers.1@1" task_no=" 01" status="done" /> <task htsql:id="meyers.23@1" task_no=" 23" status="review" /> <!-- ... --> </employee> <employee htsql:id="adam@5"> <task htsql:id="meyers.11@4" task_no=" 11" status="done" /> <!-- ... --> </employee> <!-- ... --> </project> <project htsql:id="ssmall@9" name="South Square Mall"> <employee htsql:id="aronson@11"> <task htsql:id="ssmall.13@1" task_no=" 01" status="started" /> <!-- ... --> </employee> <!-- ... --> </project> <!-- ... --> </htsql:result>
/op:project{name}/employee{}/task{task_no,status}.yamlThe YAML format for HTSQL consists of two sections: a context and an assembly. For each table mentioned in the request, the context lists each row returned for that table. The assembly then represents the drill-down relationships between these rows. For large results, the content returned by YAML is much smaller than the equivalent XML, since each row occurs only once in the context rather than being duplicated for each occurrence in the hierarchical assembly.
%YAML 1.1 --- schema: tm context: project: - !project &1 =: meyers@3 name: Meyer's Residence - !project &2 =: ssmall@9 name: South Square Mall #... employee: - !employee &3 =: aronson@1 - !employee &4 =: adam@5 #... task: - !task &5 =: meyers.1@1 task_no: 01 status: done project: *1 employee: *3 - !task &6 =: meyers.23@1 task_no: 23 status: review project: *1 employee: *3 # ... - !task &7 =: meyers.11@4 task_no: 23 status: done project: *1 employee: *4 # ... - !task &8 =: ssmall.13@1 task_no: 13 status: started. project: *2 employee: *3 #... assembly: [*1: [*3: [*5, *6], *4: [*7]], *2: [*3: [*8]]] ...
/op:project{name}/task{task_no,status}/~some+resourceWhenever a path segment starts with a tilde ~, it indicates a user resource. Once found, the entire URI is not processed by HTSQL, but is instead passed on to the application. The application can then choose to return a resource specific to the path, or what ever it wishes. Note that relative paths work as expected, even for "static" resources. For example, a relative ./select() reference in the resource above would produce a list of tasks.
(application defined resource for "some+resource")
/op:project{name}/task{task_no,status}.html~bingFollowing a request for XML or HTML output, a style-sheet can be requested as shown above. If the style-sheet lacks an extension (such as "xsl"), then "css" is assumed.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <HTML> <HEAD> <LINK type="text/css" rel="stylesheet" href="/op:project{name}/task{task_no,status}/~bing.css"> </HEAD> <BODY> <TABLE> <COLGROUP ID="project"> <COL CLASS="project project_name" ID="project:name"> </COLGROUP> <COLGROUP ID="task"> <COL CLASS="task task_task_no" ID="task:task_no"> <COL CLASS="task task_status" ID="task:status"> </COLGROUP> <THEAD> <TR> <TH COLSPAN="1" SCOPE="colgroup" CLASS="project">Project</TH> <TH COLSPAN="2" SCOPE="colgroup" CLASS="task">Task</TH> </TR> <TR> <TH>Name</TH> <TH>Task No</TH> <TH>Status</TH> </TR> </THEAD> <TBODY> <TR ID="task:meyers.1"> <TD ROWSPAN="2" SCOPE="rowgroup" VALIGN="top"> Meyers' Residence</TD> <TD> 01</TD> <TD>done</TD> </TR> <TR ID="task:meyers.2"> <TD> 02</TD> <TD>done</TD> </TR> <!-- ... --> <TR ID="task:ssmall.1"> <TD>South Square Mall</TD> <TD> 01</TD> <TD>review</TD> </TR> <!-- ... --> </TBODY> </TABLE> </BODY> </HTML>
The varied output formats and customizable resources, together with stylesheet linkage, create a flexible mechanism for constructing user interfaces.
TOC |
Standard HTML form encoding presents several challenges to HTSQL. While it is possible to provide almost any name (such as one corresponding to a specifier) for an INPUT control, user-provided values are always percent-encoded instead of being single-quoted as HTSQL requires. Further, HTML only supports a flat expression structure, conflating the ampersand's meaning of conjunction with simply implying that additional form elements are provided. HTSQL defines a literal value syntax and variable substitution to provide direct support for HTML forms without requiring client-side processing.
/op:project?name@~south+squareAny operator may be preceded by the at-sign (@) to indicate that the right-hand side is a percent-encoded value. A literal value provided in this syntax terminates at the next ampersand character or at the end of the request string. This request is equivalent to /op:project?name~'south square'.
SELECT * FROM op.project WHERE LOWER(name) LIKE '%south square%' ORDER BY proj_id proj_id | name | description ---------+--------------------+----------------------------- SSMall | South Square Mall | two new store fronts ...This literal value syntax permits values to be included according to standard web usage without requiring cumbersome single-quotes. In this syntax, the plus sign is used to encode the space (%20) and reserved characters must be percent-encoded.
/tm:task?status@=done&status@=reviewWhen using the literal value syntax, multiple items joined by an ampersand are treated as a simple alternation. This is an ugly exception to the normal grammar, however it is needed to meet the expectations of standard HTML form usage. The request above is equivalent to /tm:task?status='done','review'.
SELECT * FROM tm.task WHERE htsql_normalize(status) in ('done','review') ORDER BY proj_id, task_no proj_id | task_no | assigned_to | status | name --------+-----------------------+---------+------------------- MEYERS | 1 | ARONSON | done | Purchase Materials ...Since the comma is a reserved character, this syntax also allows expressions such as /tm:task?status@=done,review with an identical interpretation.
/tm:task?status='done',other&$other@:=reviewIn HTSQL, variables are declared with a dollar-sign. This example uses variable substitution to return tasks that are either done or ready for review. The variable other is first referenced as part of the filter on the status column. Following the ampersand, this variable is defined with the value "review", using the assignment operator.
SELECT * FROM tm.task WHERE htsql_normalize(status) in ('done','review') ORDER BY proj_id, task_no proj_id | task_no | assigned_to | status | name --------+-----------------------+---------+------------------- MEYERS | 1 | ARONSON | done | Purchase Materials ...This query could be submitted via a HTML form with a hidden input status having value 'done',other, and another input named $other@: with user-provided value review. After variable resolution, this request is equivalent to /tm:task?status='done','review'.
POST /tm:task?assigned_to=='ARONSON'
with URL-encoded post body status@=review&status@=doneWhen POST is used with a "multipart/form-data" or "application/x-www-form-urlencoded" MIME types, key/value pairs are integrated as if they used an ampersand.
SELECT * FROM tm.task WHERE htsql_normalize(status) in ('done','review') AND assigned_to = 'ARONSON' ORDER BY proj_id, task_no proj_id | task_no | assigned_to | status | name --------+-----------------------+---------+------------------- MEYERS | 1 | ARONSON | done | Purchase Materials ...The request above is equivalent to /tm:task?assigned_to=='ARONSON'&status='done','review'.
/person?organization.is_active&private_info.tax_ident~'3'This request returns people who are members of an active organization, and who's tax identifier contains '3'. Tables such as private_info are called facets, since the join cardnality from the driving table person to the target is _0/1-to-1_.
SELECT p.* FROM op.person AS p LEFT OUTER JOIN op.organizationn AS o ON (p.org_id = o.org_id) LEFT OUTER JOIN hr.private_info AS f ON (p._ppl_seq = f._ppl_id) WHERE o.is_active IS TRUE AND LOWER(f.tax_ident) LIKE '%3%' ORDER BY p.org_id, p.nickname person --------------------------------------------------------------- org_id | nickname | full_name | email -------+----------+----------------+--------------------------- acorn | hideo | WATANABE Hideo | hideo.watanabe@example.com meyers | jim | Jim Meyers | jim.meyers@example.comWith the default HTSQL meta-data configurator, when only one foreign key reference exists to a table, the name of the referenced table (organizatione) can be used as well as the referencing column (client), i.e. /op:project?organization.is_active.
/tm:employee?count(task;status='done')>4This request returns employees who have been assigned more than 4 completed tasks. The parameter filter (;) limits the related tasks, and the count() aggregate function converts this correlated sub-query into a scalar value for comparison.
SELECT e.empl_code, e.full_name, e.is_contractor, e.email FROM tm.employee AS e WHERE (SELECT count(*) FROM tm.task AS t WHERE t.assigned_to = e.empl_code AND htsql_normalize(t.status) = 'done') > 4 ORDER BY e.empl_codeSELECT e.empl_code, e.full_name, e.is_contractor, e.email FROM tm.employee AS e WHERE (SELECT count(*) FROM tm.task AS t WHERE t.assigned_to = e.empl_code AND htsql_normalize(t.status) = 'done') > 4 ORDER BY e.empl_code
By merging POST arguments with a URI using literal-value syntax and with clever use of variable substitution, it should be possible to send just about any HTSQL query using standard HTML form submission.
TOC |
TOC |
[ISO9075-1992] | International Standards Organization, “Database Language SQL, 1992,” ISO/EIC 9075:1992, July 1992. |
[RFC2234] | Crocker, D. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” RFC 2234, November 1997 (TXT). |
[RFC2616] | Felding, R., Gettys, J., Mongul, J., Frystyk, H., Mastiner, L., Leach, P., and T. Berners-Lee, “Hypertext Transfer Protocol -- HTTP/1.1,” RFC 2616, June 1999 (TXT). |
[RFC3986] | Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifiers (URI): Generic Syntax,” RFC 3986, January 2005 (TXT). |
TOC |
[CSV] | “,” 2004. |
[HTML] | “,” 2004. |
[HTML4] | “,” 2004. |
[ISO8601] | “,” 2004. |
[ISO9075-1999] | International Standards Organization, “Database Language SQL, 1999,” ISO/EIC 9075:1999, September 1999. |
[ISO9075-2003] | International Standards Organization, “Database Language SQL, 2003,” ISO/EIC 9075:2003, August 2003. |
[JSON] | “,” 2004. |
[PGSQL] | “,” 2004. |
[POSIX_1003.2] | “,” 2004. |
[REC-XML] | “,” 2004. |
[REST] | Fielding, R., “Architectural Styles and the Design of Network-based Software Architectures.,” 2000. |
[RFC1738] | “,” 2004. |
[RFC1808] | “,” 2004. |
[RFC2086] | “,” 2004. |
[RFC2396] | “,” 2004. |
[RFC2396BIS] | “,” 2004. |
[RFC2732] | “,” 2004. |
[SQL98] | “,” 2004. |
[UNICODE] | “,” 2004. |
[XML] | “,” 2004. |
[XSL] | “,” 2004. |
[YAML] | “,” 2004. |
TOC |
TOC |
-- -- SAMPLE SCHEMA, SQL-92 CONSTRUCTS ONLY -- DROP SCHEMA tm CASCADE; CREATE SCHEMA tm; -- PostgreSQL, not SQL99, Syntax CREATE OR REPLACE FUNCTION htsql_normalize(text) RETURNS text AS 'SELECT COALESCE(NULLIF( TRANSLATE(LOWER( TRIM(LEADING ''0'' FROM TRIM(BOTH '' '' FROM CAST($1 AS TEXT) ))), ''- '',''__''), ''''),''0''); ' LANGUAGE SQL IMMUTABLE STRICT; CREATE TABLE tm.project ( proj_id VARCHAR(16), name VARCHAR(64) NOT NULL, description VARCHAR(2000), CONSTRAINT project_pk PRIMARY KEY (proj_id) ); CREATE TABLE tm.employee ( empl_code VARCHAR(16), full_name VARCHAR(64) NOT NULL, is_contractor BOOLEAN, email VARCHAR(64), CONSTRAINT employee_pk PRIMARY KEY (empl_code) ); CREATE DOMAIN tm.status AS VARCHAR(8); ALTER DOMAIN tm.status ADD CONSTRAINT status_check CHECK (VALUE IN ('done','review','planned')); CREATE TABLE tm.task ( proj_id VARCHAR(16) REFERENCES tm.project(proj_id), task_no INTEGER, assigned_to VARCHAR(16) REFERENCES tm.employee(empl_code), status tm.status, name VARCHAR(64) NOT NULL, CONSTRAINT task_pk PRIMARY KEY (proj_id, task_no) ); CREATE TABLE tm.restricted_info ( empl_code VARCHAR(16) REFERENCES tm.employee(empl_code), billing_rate INTEGER NOT NULL, tax_id VARCHAR(16) UNIQUE, birth_date DATE, CONSTRAINT restricted_info_pk PRIMARY KEY (empl_code) ); INSERT INTO tm.project VALUES ('MEYERS', 'Meyer''s Residence', 'insulation and winterizing'); INSERT INTO tm.project VALUES ('SSMall', 'South Square Mall', 'two new store fronts'); INSERT INTO tm.project VALUES ('THOM-LLP', 'Tom Thompson, LLP.', 'fix up room for new associate'); INSERT INTO tm.employee VALUES ('ADAM', 'Adam O''Brian', FALSE, 'adam@example.com'); INSERT INTO tm.employee VALUES ('ARONSON', 'Mary Aronson', FALSE, 'mary2@example.com'); INSERT INTO tm.employee VALUES ('SMITH', 'Ron Smith', TRUE, 'john@example.com'); INSERT INTO tm.employee VALUES ('SMITH-A', 'Alfred Smith', TRUE, NULL); INSERT INTO tm.task VALUES ('MEYERS',1, 'ARONSON','done','Purchase Materials'); INSERT INTO tm.task VALUES ('MEYERS',2, 'SMITH','review','Strip Wall Paint'); INSERT INTO tm.task VALUES ('MEYERS',3, NULL,'planned','Remove Refuse'); INSERT INTO tm.task VALUES ('SSMall',1, 'ADAM','review','Install Slider Door'); INSERT INTO tm.restricted_info VALUES ('ARONSON', 26, '222-22-1492', '03-01-1961'); INSERT INTO tm.restricted_info VALUES ('SMITH', 22, '444-44-4444', '08-15-1965');
TOC |
C | |
command | |
begin transaction | |
delete | |
insert | |
interaction | |
merge | |
parse | |
query | |
select | |
update | |
content negotiation | |
context 1, 2 | |
cross products 1, 2 | |
locators in | |
parameters | |
relative identifiers | |
relative inserts | |
E | |
error | |
301 Moved Permanently | |
408 Request Timeout | |
409 Conflict | |
417 Expectation Failed | |
501 Not Implemented | |
F | |
facet | |
filter | |
preview 1, 2 | |
function | |
id() | |
tag() | |
I | |
implicit conversion | |
L | |
locator | |
locking | |
explicit | |
optimistic | |
M | |
meta data | |
candidate key | |
foreign key | |
O | |
operator | |
assignment | |
conjunction | |
distinct-from | |
equality | |
implies | |
negation | |
regex | |
operator precedence | |
output format | |
Comma Separated Variable | |
eXtensible Markup Language | |
HyperText Markup Language | |
YAML Ain't Markup Language | |
P | |
parameter filter | |
percent encoding | |
preview | |
R | |
resources | |
application defined | |
style sheets | |
S | |
selector | |
column pivot | |
functions in | |
nested | |
wild | |
specifier | |
facet | |
plural 1, 2 | |
singular | |
transitive |
TOC |
Clark C. Evans | |
Prometheus Research, LLC. | |
315 Whitney Ave. | |
New Haven, CT 06511 | |
US | |
Phone: | +1 734 418 8644 |
Email: | info@clarkevans.com |
URI: | http://clarkevans.com |
Kirill Simonov | |
Prometheus Research, LLC. | |
18A, Tamanskiy pr. | |
Donetsk, CT 83048 | |
UA | |
Phone: | +380.622.583571 |
Email: | xi@resolvent.net |
URI: | http://resolvent.net |