An HTML to PDF/image conversion service available over an HTTP API and powered by wkhtmltopdf and wkhtmltoimage.
Sanaa provides an HTTP API around wkhtmltoimage and wkhtmltopdf. There’s been no attempt to modify those two binaries. It’s BYO-wkhtmltoX.
It translates options passed in as JSON into flags, runs the command, fetches the command output and the generated file, and translates those results into a JSON response. By that point the generated file should have been uploaded to an S3 bucket, and the API response should contain a signed link to it.
The current implementation assumes that Sanaa’s role is purely to render what you ask it to and provide you with a means to fetch it. This emphasis on a single-responsibility made for a simple design. So, it’s left up to you to use the result as you wish.
That pretty much is it 💪 … in a nutshell! 🥜🐚
Sanaa is a single Go binary. All you need to do is download the binary for your platform from the releases page to any location in your $PATH and you’re good to go.
If using Docker 🐳, there’s the kingori/sanaa image on Docker Hub. Find docker-compose and kubernetes examples in the examples/ folder.
Just make sure that the wkhtmltoimage and wkhtmltopdf binaries are available in your $PATH for sanaa to be able to autodetect them. Fetch downloads from here.
Most configuration is done via flags, see sanaa --help. But there’s AWS-specific configuration, mostly secrets, that doesn’t seem appropriate to set via flags.
For example, Sanaa requires AWS credentials with the right permissions. The worker requires upload access to the S3 bucket it uses to store the results of rendering, and the server requires access to generate signed URLs for downloading from the same bucket.
These credentials will be sourced automatically from the following locations (in order of priority, first at the top):
Environment Credentials - via environment variables:
export AWS_ACCESS_KEY_ID=SOME_KEY
export AWS_SECRET_ACCESS_KEY=SOME_SECRET
export AWS_REGION=us-east-1
Shared Credentials file - via ~/.aws/credentials:
[default]
aws_access_key_id = <SOME_KEY>
aws_secret_access_key = <SOME_SECRET>
aws_region = us-east-1
EC2 Instance Role Credentials - credentials are assigned to the application if it’s running on an EC2 instance that’s been given an EC2 Instance Role. This removes the need to manage credential files in production.
Start the server (that will receive requests):
$ sanaa server --verbose
INFO[0000] starting the server
INFO[0000] request TTL set to 86400 seconds
INFO[0000] listening on http://0.0.0.0:8080
Start the worker (that will process requests):
$ sanaa worker --s3-bucket=example-bucket-name --verbose
INFO[0000] starting the worker
INFO[0000] using wkhtmltoimage 0.12.4 (with patched qt)
INFO[0001] using wkhtmltopdf 0.12.4 (with patched qt)
INFO[0001] concurrency set to 2
INFO[0001] maximum retries set to 1
INFO[0001] registering 'convert' queue
INFO[0001] waiting to pick up jobs placed on any registered queue
For images (see reference), make a POST request to /render/image:
POST /render/image HTTP/1.1
Content-Type: application/json
Host: 127.0.0.1:8080
Connection: close
Content-Length: 172
{
"target": {
"format": "png",
"height": 1080,
"width": 1920
},
"source": {
"url": "https://en.wikipedia.org/wiki/Kenya"
}
}
For PDFs (see reference), make a POST request to /render/pdf:
POST /render/pdf HTTP/1.1
Content-Type: application/json
Host: 127.0.0.1:8080
Connection: close
Content-Length: 127
{
"target": {
"page_size": "A4"
},
"source": {
"url": "https://en.wikipedia.org/wiki/Kenya"
}
}
If a render request was successful, expect a 201 Created HTTP response indicating that the server has acknowledged the request:
HTTP/1.1 201 Created
Content-Type: application/json
Date: Tue, 06 Feb 2018 05:19:09 GMT
Content-Length: 176
Connection: close
{
"uuid": "640882bd-9441-48fb-8686-27286f399004",
"created_at": "2018-02-06T05:19:09Z",
"started_at": "",
"ended_at": "",
"expires_in": 86400,
"file_url": "",
"status": "pending",
"logs": [
""
]
}
In case of failure, expect an appropriate response as well. For example:
400 Bad Request - if unable to unmarshal the request JSON, or if you’ve requested a render type apart from the supported types i.e. image or pdf.
500 Internal Server Error - if unable to enqueue the job for the workers to pick up e.g. if redis is down.
Each render request that has been enqueued is assigned a UUID (found in the uuid attribute of the response to a render request). Pass the UUID to the /status/{uuid} endpoint via GET to get an update on the status of the conversion job:
GET /status/4c815816-1bfe-4790-b8d1-ee06c98b7d6d HTTP/1.1
Content-Type: application/json
Host: 127.0.0.1:8080
Connection: close
The status endpoint will return a 200 OK HTTP response with details of the conversion job. An example of one that’s succeeded:
HTTP/1.1 200 OK
Content-Type: application/json
Date: Sat, 24 Feb 2018 00:41:31 GMT
Connection: close
Transfer-Encoding: chunked
{
"uuid": "21835d4a-5dfc-41a4-a798-21980baa43c9",
"created_at": "2018-02-24T00:40:32Z",
"started_at": "2018-02-24T00:40:36Z",
"ended_at": "2018-02-24T00:40:57Z",
"expires_in": 86400,
"file_url": "https://s3.amazonaws.com/example-bucket-name/21835d4a-5dfc-41a4-a798-21980baa43c9/file.png?signed-url-signature",
"status": "succeeded",
"logs": [
"Loading page (1/2)",
"...",
"Rendering (2/2)",
"...",
"Done"
]
}
Notably, several fields reflect the state of the conversion job:
status is set to succeeded,
started_at has been set to the time the processing started,
ended_at has been set to the time the processing ended. It should be empty if the job is still in the processing state,
logs has been populated with output from the processing.
In case of failure, expect to receive responses that communicate the problem. For example:
404 Not Found - may happen if your render request has expired (based on TTL) or if there’s no job found matching the UUID set.
400 Bad Request - if your identifier is not a valid UUID.
500 Internal Server Error - if the server is unable to fulfill your request e.g. if redis is down.
The /render/{type} and /status/{uuid} endpoints return an object representing either an error or a conversion job.
For errors, the response body is simple and self-explanatory. It includes the uuid of the request and a message explaining the error. For example, if you send a bad JSON body during an image render request, the response would be something like this:
HTTP/1.1 500 Internal Server Error
Content-Type: application/json
Date: Tue, 06 Feb 2018 07:26:44 GMT
Content-Length: 91
Connection: close
{
"uuid": "536d3847-64b8-497a-8d8a-ac541dfa9c9e",
"message": "Unable to unmarshal json to image type"
}
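On the client side, an error body like the one above can be mapped onto a Go error value. This is only a sketch: `APIError` and `parseError` are hypothetical names, and the only assumption about the API is that the object carries the `uuid` and `message` attributes shown.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// APIError mirrors the error object returned by the API.
type APIError struct {
	UUID    string `json:"uuid"`
	Message string `json:"message"`
}

// Error makes APIError usable wherever a Go error is expected.
func (e APIError) Error() string {
	return fmt.Sprintf("request %s: %s", e.UUID, e.Message)
}

// parseError decodes the body of an error response.
func parseError(body []byte) (APIError, error) {
	var apiErr APIError
	err := json.Unmarshal(body, &apiErr)
	return apiErr, err
}

func main() {
	body := []byte(`{"uuid":"536d3847-64b8-497a-8d8a-ac541dfa9c9e","message":"Unable to unmarshal json to image type"}`)

	apiErr, err := parseError(body)
	if err != nil {
		fmt.Println("unexpected body:", err)
		return
	}
	fmt.Println(apiErr)
}
```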
For render requests, the returned object represents a conversion job which has the following attributes:
Attribute | Description |
---|---|
uuid | Unique identifier of the request |
created_at | When the request was initiated |
started_at | When the request was picked by a worker for processing |
ended_at | When a worker completed processing the request after picking it up |
expires_in | How long to persist the request and any of its data |
file_url | URL to fetch the artefact generated by the request after processing |
status | Status of the job i.e. pending, processing, failed, succeeded |
logs | Output of processing by the worker, useful when debugging |
Timestamp fields are RFC3339 and always in UTC.
The server component has two health endpoints available:
/health/live - liveness endpoint, indicates that the server is up.
/health/ready - readiness endpoint, indicates that the server is ready to receive requests.
Pass the ?full=1 query parameter to expose the details of the check in the JSON response. These are omitted by default for performance.
Also note that both endpoints return the appropriate response conveying the health of the service. To demonstrate this, make a request to the readiness endpoint:
GET /health/ready?full=1 HTTP/1.1
Host: 127.0.0.1:8080
Connection: close
If redis is up, you should get a 200 OK HTTP response:
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Date: Thu, 22 Feb 2018 19:54:41 GMT
Content-Length: 37
Connection: close
{
"redis-tcp-connection": "OK"
}
If redis is down, you should get a 503 Service Unavailable HTTP response:
HTTP/1.1 503 Service Unavailable
Content-Type: application/json; charset=utf-8
Date: Thu, 22 Feb 2018 20:00:27 GMT
Content-Length: 87
Connection: close
{
"redis-tcp-connection": "dial tcp 127.0.0.1:6379: connect: connection refused"
}
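A monitoring script could interpret these two responses along the following lines. This is a sketch under the assumption that the `?full=1` body is always a flat JSON object of check names to results, as in the two examples; `parseHealth` is a hypothetical helper and the URL assumes a locally running server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// parseHealth interprets a health response: the service is considered
// healthy only on 200 OK, and the body lists the individual checks.
func parseHealth(status int, body []byte) (bool, map[string]string, error) {
	checks := map[string]string{}
	if err := json.Unmarshal(body, &checks); err != nil {
		return false, nil, err
	}
	return status == http.StatusOK, checks, nil
}

func main() {
	resp, err := http.Get("http://127.0.0.1:8080/health/ready?full=1")
	if err != nil {
		fmt.Println("server unreachable:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("unable to read response:", err)
		return
	}

	ready, checks, err := parseHealth(resp.StatusCode, body)
	if err != nil {
		fmt.Println("unexpected body:", err)
		return
	}
	fmt.Println("ready:", ready)
	for name, result := range checks {
		fmt.Printf("%s: %s\n", name, result)
	}
}
```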
For normal usage the above instructions should do. The instructions below are only necessary if you intend to work on the source code (find contributing guidelines here, plan here and the milestones here).
go get -v github.com/itskingori/sanaa.
make tools.
make dependencies (they’ll be placed in ./vendor). Requires golang/dep package manager.
make build.
./bin/sanaa help as a basic test.
make dependencies.
make lint and test using make test.
The home page is built using Jekyll (a fun and easy to use static site generator) and it is hosted on GitHub Pages. The code is in the docs/ folder if you want to have a peek.
Tag the release (git tag) and push the tags to remote (git push --tags).
x.y.z will be marked as final releases.
x.y.z-* will be marked as pre-releases.
kingori/sanaa:latest and kingori/sanaa:x.y.z.
What does Sanaa mean?
It’s the Swahili word for “art” or more specifically a “work of beauty”. I’m Kenyan so my bias to Swahili is obvious. 🤷
How Can I Help?
Write tests (or show me how to). I’m fairly new to Go and I’m of the opinion that writing tests for a service like this is non-trivial. So far testing has been manual, but I plan to read up on it and write some when I get time (as an exercise in continuous learning).
Give feedback. Feel free to raise an issue or comment on the open issues.
King’ori J. Maina © 2018. The GNU Affero General Public License v3.0 bundled therein essentially says that if you make a derivative work of this and distribute it to others under certain circumstances, then you have to provide the source code under this license. This still applies if you run the modified program on a server and let other users communicate with it there.