HTSQL -- a Query Language for Accidental Programmers

Presenter:Clark C. Evans, Prometheus Research
Date: November 3rd, 2010
Location:PGWest 2010, San Francisco, CA

HTSQL ("Hyper Text Structured Query Language") is a web-based, high-level query language for relational databases.

Today's Talk

Part I

  • Why is querying so hard?
  • Who am I? What is HTSQL?
  • HTSQL in 10 Minutes
  • HTSQL Applications
  • Q&A

Part II

  • Walk-Through
  • Final Q&A

Quite simply, I only need ...

Business user:

"I *just* need a list of schools and the number
of departments and programs for each?"

Developer:

"Got it! This one is easy -- give me 15 minutes."

Crickets:

*chrip*  *chirp*

it's easy, no?

A junior SQL developer might produce an obvious answer:

SELECT school.name,
       count(department),
       count(program)
FROM school
LEFT OUTER JOIN department
ON (department.school = school.code)
LEFT OUTER JOIN program
ON (program.school = school.code)
GROUP BY school.name

... that (luckily) also happens to be obviously incorrect.

Perhaps it's not so trivial?

SELECT school.name,
       coalesce(correlated_department.count,0),
       coalesce(correlated_program.count,0),
FROM school as school
LEFT OUTER JOIN (
   SELECT school, COUNT(1) as count
   FROM department GROUP BY school
) AS correlated_department
ON (correlated_department.school = school.code)
LEFT OUTER JOIN (
   SELECT school, COUNT(1) as count
   FROM program GROUP BY school
) AS correlated_program
ON (correlated_program.school = school.code)

But... it should be easy

This business user's request:

"I *just* need a list of schools and the number
of departments and programs for each?"

really is trivial. How about an alternative (plot):

/school{name, count(department), count(program)}

This is HTSQL -- a new query language designed from the bottom up to be usable by accidental programmers.

Who am I?

What is HTSQL?

HTSQL is...

img/what_is_it.png

HTSQL Processing Model

img/flow.jpeg

a bit more technical

Why did we make it?

img/triangle.png

We were dealing with users having extremely complex data -- traditional techniques for communication wasn't going to work.

A Shared Language

img/sharing.jpg

Our users, analysts, and developers now use a shared query language, while it's not a perfect understanding, it is much better.

Evolution of HTSQL

Self-Service Reporting

Developer Productivity

HTSQL in 10 Minutes

Let's assume we have a data model, with schools, departments, programs and courses. Here it is:

     +-------------+       +--------+
/---m| DEPARTMENT  |>-----o| SCHOOL |m----\
|.   +-------------+  .    +--------+    .|
| .                  .                  . |
|   department   department    a school   |
|   offers       may be part   has one or |
|   courses      of school     programs   |
|                                         |
|    +-------------+       +---------+    |
\---<| COURSE      |       | PROGRAM |>---/
     +-------------+       +---------+

#1 - Selecting Data

An example (E1):

/course{department as 'Dept Code'+, number as 'No.',
        credits-, title}

#2 - Linking Data

An example (E2):

/course{department{school.name, name}, * }

#3 - Filtering Data

For example (E3):

/course{department, number, title}?
   credits<3&department.school='ns'

#4 - Aggregating Data

An example (E4):

/school{code, avg(department.count(course))}

#5 - Projections & Segments

An example (E5):

/program{degree^, count()}/student

#6 - Pluggable Formatters

An example (E6):

/school/:txt

#7 - Table Expressions

An example (E7):

/department.sort(school).limit(10,5)

Canned .htsql Queries

This permits standard HTML forms:

<form action="/canned.htsql">
  Enter an parameter value:
  <input type="text" name="arg" />
</form>

JQuery / Data Binding

changes list box makes table auto-update:

<select id="school"
   data-source="/school{code, name"} />

<table id="department"
   data-source="/department?school=$school" />

Why HTSQL 2.0?

Temporary 2.0 Limitations

Systematic Considerations

How do I get it?

There are a few syntax changes (mostly with regard to commands) that we'll be changing in HTSQL 2.0.

Q&A

Please visit our community site, http://htsql.org, our commercial site http://htsql.com, or chat with Clark or Kyrylo here in Ann Arbor. We are best found at #htsql on freenode. We have regular Python releases, with source code at http://bitbucket.org/prometheus/htsql

Generous support for HTSQL was provided by Prometheus Research, LLC and The Simons Foundation. This material is also based upon work supported by the National Science Foundation under Grant #0944460.