Co-authored by Carl Sverre from SingleStore
What is a User-Defined Function?
A User-Defined Function (UDF) encapsulates business logic in such a way that it can be safely run within another service's infrastructure. Historically, general purpose UDFs have been difficult to do due to the lack of performant sandboxing solutions. Previous attempts at building UDFs include:
Writing code in a custom domain specific programming language like PL/pgSQL
Sending events to webhooks (HTTP)
Providing an entire Virtual Machine IaaS just to safely run plugins
All of these solutions to UDFs have flaws, but until recently these were your choices. Not anymore! With server-side WebAssembly (Wasm), we now have a performant and secure sandbox solution that allows us to run untrusted third-party UDFs directly in our applications. Even better, customers are able to compile code from tons of different languages!
There is a growing ecosystem of services, databases, and libraries which can run Wasm UDFs. Some examples are:
In this article, we are going to focus on the benefits of using Wasm based UDFs to extend your database or SaaS application.
How does a UDF work in a database?
The concept of extending databases with UDFs is not new. Databases have long supported user defined code written in custom languages and compiled by the database itself. For example, in PostgreSQL the following operation would define a function called increment:
CREATE OR REPLACE FUNCTION increment(i integer) RETURNS integer AS $$
BEGIN
RETURN i + 1;
END;
$$ LANGUAGE plpgsql;
Once you define this function you could call it like any other database builtin. For example:
postgres=# select increment(5);
increment
-----------
6
(1 row)
In order to do this, PostgreSQL compiles the function and stores it using its own custom bytecode representation. When you later reference it in an operation or query, PostgreSQL runs the function during query execution as if it was any other builtin operation.
Unfortunately, this approach has some drawbacks. The first is that the database needs to support the language you want to use. Continuing with the same example, while PostgreSQL can be extended by many different languages, it requires system admin access to add support for each one. The second issue is sandboxing. While custom languages like PL/pgSQL can be reasonably trusted, more sophisticated languages like Python are much more difficult to secure.
To address these drawbacks, databases have started to support Wasm powered UDFs. Because users may compile Wasm modules from many different source languages, it's no longer necessary for the database to provide specific language support. In addition, as Wasm is naturally sandboxed due to its capability security model, user provided untrusted code is safe to run. Finally, Wasm was built with performance in mind allowing for very low overhead in comparison to other extensibility runtimes.
How does a UDF work in a SaaS application?
A capability that’s become available in recent years is embedding UDFs within applications, specifically SaaS (Software as a Service) apps. These differ from database UDFs in that custom functions are added to the business logic of an application rather than embedded in a database query. For example, if you have a service that notifies you in Slack when a new sale is made on your Ecommerce store, you could add a UDF that changes the behaviour of that notification:
type SaleNotification struct{}
type Sale struct {
Customer string
Amount float64
}
type Notification struct {
Channel string
}
func (h SaleNotification) Run(sale Sale) (Notification, error) {
if sale.Amount > 10000.00 {
return Notification{
Channel: "enterprise",
}
}
return Notification{
Channel: "retail",
}
}
In this example, the notification is sent to the enterprise
channel if the sale amount is more than $10,000, otherwise it’s sent to the retail
channel. This UDF would be triggered by the server-side logic of the notification service, and the output of the UDF would be used to modify how the notification is sent to your Slack account. By allowing arbitrary logic, the application is made much more flexible than something configured using a UI.
The other major difference between database UDFs and SaaS UDFs is that the former would generally be written by an application developer to make their database queries more efficient or easy to understand, whereas the latter would be written by the end-user of an application. This is important to note, as the capabilities given to these UDFs will vary significantly based on the trust given to the author. I trust myself (perhaps foolishly) to write code that is safe to run on my own database, but I wouldn’t give the same amount of trust to a user of my application that I’ve never met. The WebAssembly sandbox plays the same role here by ensuring that any UDF (whether running in the database or within my SaaS app) is only given access to exactly the resources that the application developer intends, and nothing more.
Why use one or the other?
When reaching for a UDF system as a tool to solve a problem, ask yourself this: “am I modifying my own system’s behaviour, or is someone else modifying my system?”. If the answer is that you want to modify the behaviour of your own system, a database UDF may be a good option. If you want to grant someone else the ability to modify how your software behaves, then a SaaS UDF is the way to go.
So what’s next?
If the capabilities of UDFs seem intriguing to you, we suggest you give them a try. SingleStore is a database that provides Wasm UDFs as a built-in feature, making it incredibly easy to experiment with the power of database UDFs. Suborbital similarly provides everything you need to add UDFs to your SaaS application, and can be set up in just a few minutes. We’d love to hear about your use-cases for UDFs, and to hear how you’re using them to make your software more flexible. You can tweet at Connor and Carl to show off what you’ve done!
Resources