Assets

Assets is a proxy server for static assets (e.g. css, js, fonts) that can also resize images (using vipsthumbnail). It is a Go executable. It is open source, easy to install, reliable, secure and performant.

Usage

At its simplest, an upstream server is configured, and Assets proxies requests to the upstream and caches the results. Given this simple configuration:

{
  "upstreams": {
    "gobl": {
      "base_url": "https://www.goblgobl.com/docs/"
    }
  }
}

Requests made to http://127.0.0.1:5300 (Assets' default listening address) can be proxied to goblgobl.com by setting the up querystring parameter to our configured upstream key (i.e. gobl):

$ curl -s "http://127.0.0.1:5300/v1/css/main.css?up=gobl" | head -n 4
* {
  margin: 0;
  padding: 0;
}

Note that the v1 portion of the request path is part of Assets' own structure and is not part of the upstream request. Thus, the above request (/v1/css/main.css) translates to an upstream request to https://www.goblgobl.com/docs/css/main.css.
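The path translation described above can be sketched in a few lines of Go. This is only an illustration of the documented mapping, not Assets' actual code; the upstreamURL helper is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// upstreamURL maps an Assets request path to an upstream URL.
// The leading /v1/ segment belongs to Assets and is dropped; the rest of
// the path is appended to the upstream's base_url.
// Hypothetical helper for illustration only.
func upstreamURL(baseURL, requestPath string) (string, bool) {
	const prefix = "/v1/"
	if !strings.HasPrefix(requestPath, prefix) {
		return "", false // not a versioned Assets path
	}
	rest := strings.TrimPrefix(requestPath, prefix)
	return strings.TrimSuffix(baseURL, "/") + "/" + rest, true
}

func main() {
	u, ok := upstreamURL("https://www.goblgobl.com/docs/", "/v1/css/main.css")
	fmt.Println(u, ok) // https://www.goblgobl.com/docs/css/main.css true
}
```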

While proxying static assets with aggressive caching can be valuable on its own, the main value of Assets is to generate thumbnails of images. The first step is to extend our configuration with one or more transform definitions:

{
  "upstreams": {
    "gobl": {
      "base_url": "https://www.goblgobl.com/docs/",
      "transforms": {
        "thumb_100x150": ["--size", "100x150", "-m", "attention"]
      }
    }
  }
}

The key of the transform is specified in the xform querystring parameter. The value is an array of parameters which will be passed to vipsthumbnail as-is.

$ curl -s \
  "http://127.0.0.1:5300/v1/favicon.png?up=gobl&xform=thumb_100x150" |
 identify -format '%w %h' -
100 150
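For clients written in Go, the request URL used in the curl example can be assembled with the standard net/url package. This is a minimal sketch; the assetURL helper and its parameters are hypothetical names, and only the up and xform querystring parameters come from the text above.

```go
package main

import (
	"fmt"
	"net/url"
)

// assetURL builds a URL for an Assets server listening on host.
// upstream is the configured upstream key; transform is optional and, when
// set, names a configured transform.
// Hypothetical helper for illustration only.
func assetURL(host, path, upstream, transform string) string {
	q := url.Values{}
	q.Set("up", upstream)
	if transform != "" {
		q.Set("xform", transform)
	}
	u := url.URL{
		Scheme:   "http",
		Host:     host,
		Path:     "/v1/" + path,
		RawQuery: q.Encode(), // Encode sorts parameters by key
	}
	return u.String()
}

func main() {
	fmt.Println(assetURL("127.0.0.1:5300", "css/main.css", "gobl", ""))
	fmt.Println(assetURL("127.0.0.1:5300", "favicon.png", "gobl", "thumb_100x150"))
}
```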

Some thumbnail generating proxies allow the client to specify transformation properties (e.g. by passing parameters like width=100&height=150 in the querystring). Such APIs can often be exploited to DOS the server. Assets avoids this by only accepting the name of a pre-configured transform.
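The idea of accepting only named transforms can be sketched as a simple lookup: the client supplies a key, and the server either finds a pre-approved argument list or rejects the request. This is an illustration under assumptions, not Assets' implementation; the transforms map and the transformArgs helper are hypothetical names, and the arguments are the ones from the configuration above.

```go
package main

import (
	"errors"
	"fmt"
)

var errUnknownTransform = errors.New("unknown transform")

// transforms mirrors the "transforms" object from the example configuration.
var transforms = map[string][]string{
	"thumb_100x150": {"--size", "100x150", "-m", "attention"},
}

// transformArgs returns the pre-configured vipsthumbnail arguments for name.
// Clients can only pick a key; they never supply sizes or flags themselves.
func transformArgs(name string) ([]string, error) {
	args, ok := transforms[name]
	if !ok {
		return nil, errUnknownTransform
	}
	return args, nil
}

func main() {
	args, err := transformArgs("thumb_100x150")
	fmt.Println(args, err)

	// An arbitrary client-supplied size is not a configured key, so it is rejected.
	_, err = transformArgs("9999x9999")
	fmt.Println(err)
}
```

Because the set of possible outputs is fixed by the configuration, a client cannot force the server to generate an unbounded number of distinct thumbnails.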

Running

Assets is meant to be both easy to run and easy to compile.

You can grab the latest pre-compiled binary, create a simple config.json and run it.

To compile the project, you'll need a recent version of Go (1.18+). Download the source and run:

wget https://github.com/goblgobl/assets/archive/refs/heads/master.tar.gz
tar -xzvf master.tar.gz
cd assets-master
make build

This generates an executable named assets.

Authentication

Assets provides no authentication or authorization. It is intended to sit in front of a public origin and thus be public itself.

Configuration

By default, the configuration will be loaded from the config.json file. This can be changed by specifying the -config PATH argument on startup.

At a minimum, one upstream with a base_url must be configured:

{
  "upstreams": {
    "gobl": {
      "base_url": "https://www.goblgobl.com/docs/"
    }
  }
}

A more complete configuration looks like:

{
  "instance_id": 0,
  "vipsthumbnail": "/opt/vips/vipsthumbnail",
  "cache_root": "/opts/assets/cache/",

  "http": {
    "listen": "127.0.0.1:5200"
  },

  "upstreams": {
    "gobl": {
      "base_url": "https://www.goblgobl.com/docs/",
      "transforms": {
        "thumb_100x150": ["--size", "100x150"]
      },
      "caching": [
        {"status": 0, "ttl": 300},
        {"status": 404, "ttl": -60},
        {"status": 200, "ttl": 3600}
      ],
      "buffers": {
        "count": 100,
        "min": 131072,
        "max": 1048576
      }
    }
  },

  "log": {
    "level": "info",
    "requests": true,
    "pool_size": 100,
    "format": "kv",
    "kv": {
      "max_size": 4096
    }
  }
}
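Since the configuration must be strict JSON (a trailing comma, for instance, is a syntax error), it can be useful to sanity-check a config.json before deploying it. The following is a rough sketch of such a check in Go; the Config and Upstream types and the parseConfig helper are hypothetical and cover only a subset of the fields, not Assets' real configuration loader.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Upstream models a subset of one entry under "upstreams" (hypothetical type).
type Upstream struct {
	BaseURL    string              `json:"base_url"`
	Transforms map[string][]string `json:"transforms"`
}

// Config models a subset of config.json (hypothetical type).
type Config struct {
	Upstreams map[string]Upstream `json:"upstreams"`
}

// parseConfig decodes the JSON and enforces the two documented requirements:
// at least one upstream, and a base_url on every upstream.
func parseConfig(data []byte) (*Config, error) {
	var c Config
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, err
	}
	if len(c.Upstreams) == 0 {
		return nil, errors.New("at least one upstream is required")
	}
	for name, up := range c.Upstreams {
		if up.BaseURL == "" {
			return nil, fmt.Errorf("upstreams.%s.base_url is required", name)
		}
	}
	return &c, nil
}

func main() {
	cfg, err := parseConfig([]byte(`{"upstreams": {"gobl": {"base_url": "https://www.goblgobl.com/docs/"}}}`))
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println(cfg.Upstreams["gobl"].BaseURL)
}
```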

Common Configuration Settings

log.level

The log level to use. Can be one of INFO, WARN, ERROR, FATAL or NONE. Defaults to INFO.

log.requests

HTTP requests are logged regardless of the configured log.level. Set this value to false to disable request logging. Defaults to true.

http.listen

The address:port that the HTTP server should listen on. Defaults to 127.0.0.1:5300.

vipsthumbnail

The full path to the vipsthumbnail executable. When not specified, the program will attempt to find it (either in the current working directory, or using the OS's path lookup convention).

cache_root

The root folder where cached content should be stored. The folder will be created if it does not already exist. Defaults to a folder named cache in the working directory.

upstreams.NAME.base_url

Assets must be configured with at least 1 upstream server, and this configuration must include a base_url field. Given a base_url of https://www.goblgobl.com/docs/, a request made to http://assets.host/v1/main.css would look for the resource at https://www.goblgobl.com/docs/main.css. Note that the v1 is stripped out, as it is part of Assets' own versioning.

This setting is required and there is no default.

upstreams.NAME.transforms

Defines supported image transformation arguments. For example, if "thumbs_100x150": ["--size", "100x150"] is specified, clients can request an image using the xform=thumbs_100x150 query parameter to have the server generate a 100x150 thumbnail. The string array is passed to vipsthumbnail as-is. While it's tempting to have clients specify their desired transformation arguments, this can lead to DOS attacks.

upstreams.NAME.buffers.max

Defines the maximum size of a non-image resource that will be accepted from an upstream. This setting does not pre-allocate memory (unlike min), so it's generally safe to set quite large. As images are saved directly to disk, no internal buffer is used and thus no limit is enforced. Defaults to 5242880 (5MB).
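One common way to enforce a size limit like this in Go, without allocating the full limit up front, is to read through io.LimitReader and check whether the body ran past the limit. The sketch below shows that general technique only; it is not taken from Assets' source, and readLimited, tryBody and errTooLarge are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

var errTooLarge = errors.New("upstream response exceeds max size")

// readLimited reads a body of at most limit bytes from r.
// It reads one byte past the limit so that oversized bodies can be detected
// rather than silently truncated.
func readLimited(r io.Reader, limit int64) ([]byte, error) {
	data, err := io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > limit {
		return nil, errTooLarge
	}
	return data, nil
}

// tryBody runs readLimited over an in-memory string that stands in for an
// upstream response body.
func tryBody(body string, limit int64) error {
	_, err := readLimited(strings.NewReader(body), limit)
	return err
}

func main() {
	fmt.Println(tryBody("small", 8))                  // <nil>
	fmt.Println(tryBody("this body is too large", 8)) // upstream response exceeds max size
}
```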

Advanced Configuration Settings

instance_id

This value should only be set when multiple instances of Assets are deployed. It defaults to 0, which is fine when a single instance is used.

Every request is assigned a RequestId, which is included in any logs generated with respect to the request, as well as placed in the RequestId header of the response. While the RequestId isn't guaranteed to be unique, in deployments using multiple instances, giving each instance a unique id (from 0-255) will greatly reduce duplicates.

upstreams.NAME.caching

By default, Assets will locally cache upstream responses (or transformed images) based on the max-age value of the upstream's Cache-Control header. This behavior can be tweaked based on the status code by providing an array of {"status": CODE, "ttl": INT} values. The ttl is in seconds. A negative ttl will ignore the upstream Cache-Control header and always cache for the absolute value of ttl seconds (e.g. -60 caches for 60 seconds). A positive ttl will use the Cache-Control value and fall back to the provided ttl if the Cache-Control header is missing or invalid. The special status of 0 is used as a fallback. The default is:

{"status": 0, "ttl": 300},
{"status": 404, "ttl": -60},
{"status": 200, "ttl": 3600}

This means that 404s are always cached locally for 60 seconds. 200s are cached based on the Cache-Control header. If this header is missing or invalid, 200s are cached for 1 hour. All other responses are cached based on the Cache-Control header, or for 5 minutes if the header is missing or invalid.
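The resolution rules for the caching setting can be expressed as a short Go sketch. It is an illustration of the documented behavior, not Assets' source: the rule type and the maxAge and cacheTTL helpers are hypothetical, and the Cache-Control parsing is deliberately simplified to a plain max-age=N directive.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// rule mirrors one {"status": CODE, "ttl": INT} entry (hypothetical type).
type rule struct {
	Status int
	TTL    int
}

// defaultRules mirrors the documented default caching configuration.
var defaultRules = []rule{{0, 300}, {404, -60}, {200, 3600}}

// maxAge extracts the max-age seconds from a Cache-Control header value.
func maxAge(cacheControl string) (int, bool) {
	for _, part := range strings.Split(cacheControl, ",") {
		part = strings.ToLower(strings.TrimSpace(part))
		if strings.HasPrefix(part, "max-age=") {
			n, err := strconv.Atoi(part[len("max-age="):])
			if err != nil || n < 0 {
				return 0, false // treated as an invalid header
			}
			return n, true
		}
	}
	return 0, false
}

// cacheTTL returns how many seconds a response should be cached.
func cacheTTL(rules []rule, status int, cacheControl string) int {
	ttl := 0
	for _, r := range rules {
		if r.Status == status {
			ttl = r.TTL
			break
		}
		if r.Status == 0 {
			ttl = r.TTL // fallback, kept unless an exact match is found
		}
	}
	if ttl < 0 {
		return -ttl // negative ttl: ignore Cache-Control entirely
	}
	if n, ok := maxAge(cacheControl); ok {
		return n // positive ttl: prefer the upstream's max-age
	}
	return ttl
}

func main() {
	fmt.Println(cacheTTL(defaultRules, 404, "max-age=9999")) // 60
	fmt.Println(cacheTTL(defaultRules, 200, "max-age=120"))  // 120
	fmt.Println(cacheTTL(defaultRules, 200, ""))             // 3600
	fmt.Println(cacheTTL(defaultRules, 500, ""))             // 300
}
```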

upstreams.NAME.buffers.count

Every upstream keeps a pool of buffers for reading and processing responses. The count parameter defines how many buffers each upstream should keep cached. min * count bytes are allocated on startup. This defaults to 10.

upstreams.NAME.buffers.min

Every upstream keeps a pool of buffers for reading and processing responses. The min parameter defines the minimum size of each buffer. This cannot be less than 255. min * count bytes are allocated on startup. This defaults to 131072 (or 128KB).

How and when buffers are used is an implementation detail. However, they are currently mostly used when reading responses from upstream servers for non-images. Thus, given long cache values, they are not critical for high performance.

log.pool_size

Assets pre-allocates a pool of loggers, which helps reduce the amount of memory that is allocated and must be garbage collected during runtime. The amount of pre-allocated memory depends on the pool_size and the configured maximum log size.

For best performance, at the cost of memory, this should be set to the maximum number of concurrent requests the system will handle. It defaults to 100.

The log pool will not grow or shrink and is non-blocking. If more loggers are requested than the pool can handle, loggers will be dynamically allocated, but will not be added back to the pool.
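A fixed-size, non-blocking pool with this behavior is commonly built in Go on top of a buffered channel. The sketch below shows that general pattern under assumptions; the logger and pool types are hypothetical and are not Assets' actual logging code.

```go
package main

import "fmt"

// logger is a stand-in for a pre-allocated log buffer (hypothetical type).
type logger struct {
	buf    []byte
	pooled bool // true if the logger belongs to the fixed pool
}

// pool holds a fixed number of pre-allocated loggers.
type pool struct {
	free    chan *logger
	maxSize int
}

// newPool pre-allocates count loggers, each with maxSize bytes of capacity.
func newPool(count, maxSize int) *pool {
	p := &pool{free: make(chan *logger, count), maxSize: maxSize}
	for i := 0; i < count; i++ {
		p.free <- &logger{buf: make([]byte, 0, maxSize), pooled: true}
	}
	return p
}

// checkout never blocks: when the pool is empty, a logger is allocated
// dynamically instead.
func (p *pool) checkout() *logger {
	select {
	case l := <-p.free:
		return l
	default:
		return &logger{buf: make([]byte, 0, p.maxSize)}
	}
}

// release returns pooled loggers to the pool. Dynamically allocated loggers
// are dropped and left to the garbage collector, so the pool never grows.
func (p *pool) release(l *logger) {
	if !l.pooled {
		return
	}
	l.buf = l.buf[:0]
	p.free <- l
}

func main() {
	p := newPool(2, 4096)
	a, b, c := p.checkout(), p.checkout(), p.checkout()
	fmt.Println(a.pooled, b.pooled, c.pooled) // true true false
	p.release(a)
	p.release(b)
	p.release(c)
	fmt.Println(len(p.free)) // 2
}
```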

log.format

The format of the generated log messages. Currently, the only supported value, and the default, is kv for a key=value type log output.

log.kv.max_size

The maximum size of an individual log message. Any additional data will be discarded. Defaults to 4096. With a default pool_size of 100, the total memory pre-allocated for logging is 100 * 4096 bytes (0.4096 megabytes).

Errors and Codes

Assets tries to provide developer-friendly error and validation messages. Every error response has an integer code field which identifies the error. Every error response also has a string error field which is a description of the error, in English. While basic and aimed at guiding developers, the error field will never contain sensitive data and can be shown to end-users (although, again, it's rather basic, might be a little technical, and is always in English).

For example, a request to an invalid route would return a response with a 404 status code, as well as a body with code and error fields:

$ curl http://127.0.0.1:5300/invalid

{
  "code": 202001,
  "error": "not found"
}
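A Go client might decode such an error body as follows. This is a minimal sketch; the apiError type and the parseError helper are hypothetical names, while the code, error and error_id field names are the ones described in this section.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiError models an Assets error response body (hypothetical type).
type apiError struct {
	Code    int    `json:"code"`               // integer code identifying the error
	Message string `json:"error"`              // English description of the error
	ErrorID string `json:"error_id,omitempty"` // only present for some errors
}

// parseError decodes an error response body.
func parseError(body []byte) (apiError, error) {
	var e apiError
	err := json.Unmarshal(body, &e)
	return e, err
}

func main() {
	e, err := parseError([]byte(`{"code": 202001, "error": "not found"}`))
	if err != nil {
		fmt.Println("could not decode error body:", err)
		return
	}
	fmt.Println(e.Code, e.Message) // 202001 not found
}
```

Branching on the integer code rather than on the error text keeps client logic stable even if the wording of a message changes.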

Error Codes

Each error code is listed below, followed by its description.

2001

A generic internal server error. This is the least specific and thus least useful error. The response will have a UUID in the Error-Id header and the same value will be in the error_id field of the response. Assuming ERROR level (or lower) logging is enabled, a log containing the eid=$ID attribute will contain more data. Because this is an unexpected error, including more details in the HTTP response could result in sensitive data being leaked, thus only a reference to the logged error is provided.

2002

A response could not be serialized to JSON. Like the 2001 error, please see the error_id and corresponding log entry. This error is almost certainly the result of a bug. We hope that you'll report it.

2003

The request payload was not valid JSON.

2004

The request contained invalid data. See the validation section for details on validation errors.

202001

An HTTP 404 that specifically relates to the URL path being unknown (as opposed to, say, an endpoint returning a 404 because some ID wasn't valid).

202002

up querystring parameter is missing.

202003

up querystring parameter is invalid (does not match a configured upstream name).

202004

xform querystring parameter is invalid (does not match a configured transformation for the specified upstream).

202005

Upstream resource not found. For protection against denial of service (DOS) attacks, Assets will not cache the entire response of a 404. Instead, it returns this simple error.