duckdb-proxy

duckdb-proxy is a proxy for accessing DuckDB over HTTP. It exposes very simple and powerful endpoints that allow arbitrary execution of SQL statements and returns JSON results.

See a live demo.

Because of DuckDB's flexibility, (i.e. it's ability to directly query files on the filesystem), this proxy should not exposed directly to untrusted users.

Usage

Download duckdb-proxy from the project release page.

duckdb-proxy is configured via command line arguments

# can be run without any arguments, will open/create a "db.duckdb" file
$ ./duckdb-proxy

# can be given a path to an existing DuckDB database
# (the DB is created if it doesn't exist, but the directory must already exist)
$ ./duckdb-proxy db/mydatabase.duckdb

# open the database in readonly (DB must exist in this case)
$ ./duckdb-proxy --readonly db/mydatabase.duckdb

You can quickly test the proxy:

$ curl http://localhost:8012/api/1/exec \
	 -d '{"sql": "select version()"}'
{
 "cols": ["version()"],
 "types": ["varchar"],
 "rows": [
	 ["v0.8.1"]
 ]

Configuration

--help

Print help message and exit

--port PORT

The port for the HTTP server to listen on. default: 8012

--address ADDRESS

The address for the HTTP server to listen on. default: 127.0.0.1

--readonly

Opens the database in readonly mode (using the DuckDB access_mode=read_only configuration). When set, consider also setting --with_wrap.

--with_wrap

When set, duckdb-proxy will wrap the provided SQL in a CTE: with _dproxy as ($SQL) select * from _proxy. This will help limit the type of statements that duckdb-proxy will allow. When set, consider also setting --readonly.

--max_limit N

When set, --with-wrap is automatically enabled, and the wrapping SQL will include a limit N, ensure that no more than N rows are returned.

--external_access

Allows the database to access external state (using the DuckDB enable_external_access=BOOL configuration

--pool_size COUNT

The number of connections to the DB duckdb-proxy should keep alive. default: 50

--max_params COUNT

For a given call to /api/1/exec the maximum number of value that can be passed into params. default: no limit

--cors_origin ORIGIN

When set, duckdb-proxy will respond to CORS request using the specified origin (as well as any other CORS headers needed for duckdb-proxy to work). default none

--max_request_size BYTES

Will return an HTTP error if the request body is > BYTES. default: 65536

--log_level info | warn | error | fatal | none

The log level to write to use (duckdb-proxy is not overly verbose). default: info

--log_http

Enable logging HTTP requests (1 log per non-404 request)

Supported Types

duckdb-proxy is written in Zig and uses the DuckDB C API. The C API is missing support for binding and reading a few types. These types can still be used in the SQL query itself, but cannot be passed-in as parameters and/or returned as part of a result.

API

exec

Executes the specified SQL, returning the result as JSON.

POST /api/1/exec
nametypereqdesc
sqlstring

The SQL to execute

paramsarray

The parameters to bind to the SQL statement. Must have a value for each placeholder in the SQL statement.

curl "http://localhost:8012/api/1/exec" -d '{
  "sql": "select name,value from duckdb_settings() where name like $1",
  "params": ["debug%"]
}'
}
nametypedesc
colsstring array

List of column names.

rowsarray of arrays

The result as an array of arrays.

In the case of an insert, update or delete without a returning clause, a single row with the column Count will be returned, indicating the number of rows inserted, updated or deleted.

{
 "cols": ["name","value"],
 "rows": [
   ["debug_checkpoint_abort","none"],
   ["debug_force_external","false"],
   ["debug_force_no_cross_product","false"],
   ["debug_asof_iejoin","false"],
   ["debug_window_mode","NULL"]
 ]
}

As an alternative, POST /api/1/exec/$ID can also be used, where $ID is any abitrary value. This can be useful when --log_http is set in order to generate more meaningful access log (since the log's path attribute will contain the URL, including the $ID) .

Errors and Codes

duckdb-proxy tries to provide developer-friendly error and validation messages. Every error response has an integer code field which identifies the error. Every error response also has a err field which is a description of the error, in English.

For example, a request to an invalid route would return a response with a 404 status code, as well as body with a code and err field:

$ curl http://127.0.0.1:8012/hello

{"code": 2, "err": "not found"}

Validation errors follow the same pattern, with each validation type having a distinct code. Validation errors always have a code: 11 and a validation field that contains a list of errors. Each error contains a field, code, err and optional data.

For example, the error when a required field is missing looks like:

$ curl "http://localhost:8012/api/1/exec" -d '{}'

{
  "code": 11,
  "err": "validation error",
  "validation": [
    {"field": "sql", "code": 1, "err": "is required"}
  ]
}

The field value is the full path name. For arrays, the element index is part of the field:

$ curl "http://localhost:8012/api/1/exec" -d '{
    "sql": "select $1::int",
    "params": ["hi"]
}'

{
  "code": 11,
  "err": "validation error",
  "validation": [
    {"field": "params.0", "code":4, "err": "must be an int"}
  ]
}

Validation codes 100 indicates that there was a problem preparing or executing the SQL statement. The err field contains the error provided by DuckDB:

$ curl "http://localhost:8012/api/1/exec" -d '{"sql": "select * from x"}'

{
  "code": 11,
  "err": "validation error",
  "validation": [
    {"field": "sql", "code": 100, "err": "Catalog Error: Table with ..."}
  ]
}