Data sharing in OpenResty

While writing Lua code that runs inside OpenResty, you will inevitably run into situations that require sharing data between pieces of Lua code or between NGINX workers. You may also need to share data between Lua and C code. In this post I would like to summarize and compare the common ways to share data inside OpenResty and discuss when you should use each one.

Share using ngx.ctx

ngx.ctx is designed for sharing data between different phases within the same request. It is very fast, and can store arbitrary Lua objects inside.

There is really no magic to it. ngx.ctx is simply an ordinary Lua table that ngx_lua creates lazily the first time you access it. ngx_lua then associates the table with the underlying request object, so accessing ngx.ctx in a later phase simply returns the same table.

Since ngx.ctx is always associated with the underlying request object, it is destroyed once the request ends. Note that timers created with the ngx.timer.* APIs run inside a completely different request context and do not inherit the ngx.ctx of the request that created them. ngx.thread.spawn, however, creates new coroutines that run within the same request context as their creator, so they share the same ngx.ctx table.
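To make the rules above concrete, here is a minimal sketch of a module whose functions are meant to be called from the rewrite_by_lua_block and content_by_lua_block of the same location. The module name handlers.lua, the field start_time and the helper report are all made up for illustration. It assumes that data needed by a timer is passed explicitly as timer arguments, since the timer cannot see the request's ngx.ctx.

```lua
-- inside handlers.lua (hypothetical module name)
local ngx = ngx

local _M = {}

-- Meant to be called from rewrite_by_lua_block.
function _M.rewrite()
    -- The first access in this request creates the ngx.ctx table.
    ngx.ctx.start_time = ngx.now()
end

-- Timer callback: runs in its own context, so it gets data via arguments.
local function report(premature, uri, elapsed)
    if premature then
        return -- the worker is shutting down
    end
    -- ngx.ctx here is not the table of the request that created the timer.
    ngx.log(ngx.INFO, uri, " took ", elapsed, " seconds")
end

-- Meant to be called from content_by_lua_block of the same location.
function _M.content()
    local ctx = ngx.ctx -- same table that _M.rewrite() wrote to
    local elapsed = ngx.now() - (ctx.start_time or ngx.now())
    ngx.say("elapsed so far: ", elapsed)

    local ok, err = ngx.timer.at(0, report, ngx.var.uri, elapsed)
    if not ok then
        ngx.log(ngx.ERR, "failed to create timer: ", err)
    end
end

return _M
```

Copying the values into the timer arguments, rather than passing the ngx.ctx table itself, keeps the timer independent of the lifetime of the request.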

A big warning:

Do not write code like this:

-- inside foo.lua
local ctx = ngx.ctx -- WRONG!! never cache ngx.ctx at module level

function handle_request()
    -- save something inside ctx
end

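A module level local is initialized only once, when the module is first loaded, so ctx would keep pointing at the table of whichever request triggered the load, and every later request in the worker would share it. Instead, cache the ngx namespace and access ngx.ctx on demand, or cache it in a function level local variable only.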

This is fine:

-- inside foo.lua
local ngx = ngx

function handle_request()
    local ctx = ngx.ctx
    -- save something inside ctx
end

Summary

Use ngx.ctx to share data across different phases of the same request.

Share using NGINX variables

An obvious way to share data is to use NGINX variables. In fact, they are considered the only official way to share data between C modules as part of NGINX's design.

OpenResty has pretty good support for NGINX variables in general, but variable access tends to be a lot slower than ngx.ctx because it requires hash table lookups and memory allocations. It should really only be used when no alternative exists.

Also, you usually do not want to write the same variable many times within the same request from Lua, because each write allocates new memory from the request's memory pool and that memory cannot be released until the request finishes.

Another point worth mentioning is that variables inside NGINX can only be either nil or arbitrary byte strings. Complex Lua types are not supported.
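A small sketch of what the string-only restriction means in practice. The variable name $request_score and the module name score.lua are hypothetical, and the sketch assumes the variable has been declared in nginx.conf beforehand (for example with set $request_score "";), because writing to an undeclared variable from Lua raises an error.

```lua
-- inside score.lua (hypothetical module name)
-- Assumes nginx.conf declares the variable first:  set $request_score "";
local ngx = ngx

local _M = {}

function _M.save(score)
    -- Variables hold byte strings only, so convert explicitly.
    -- Each write allocates from the request pool, so write once.
    ngx.var.request_score = tostring(score)
end

function _M.load()
    -- Read once into a local instead of touching ngx.var repeatedly.
    local raw = ngx.var.request_score
    return tonumber(raw) -- nil if the variable is empty or not numeric
end

return _M
```

Anything more complex than a string or number would have to be serialized first, which is usually a sign that ngx.ctx is the better fit.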

Summary

Use NGINX variables to share data between C modules, and between Lua code and C modules that only support variables.

Share using module level globals

This method is one of the more interesting ones. It relies on the fact that OpenResty uses the same global lua_State for all requests inside the same worker and module imports are cached within the same global lua_State.

The "Data Sharing within an Nginx Worker" section of the lua-nginx-module documentation goes through this in great detail, so I won't show you how to do it here. But I do want to mention some caveats when using this approach:

  1. lua-nginx-module and stream-lua-nginx-module each use their own separate global lua_State, so importing the same .lua file inside different subsystems will not cause data to be shared across them.
  2. Be very careful with concurrent requests accessing the same module level globals. Consider the following code:
-- in file module.lua
local _M = {}

local foo = 0

function _M.incr()
    foo = foo + 1
    return foo
end

return _M

# in nginx.conf
location = /test {
    content_by_lua_block {
        local incr = require("module").incr

        ngx.say(incr())
        ngx.sleep(10) -- this will yield
        ngx.say(incr())
    }
}

If only one /test request is running in the worker, the two outputs will be two consecutive numbers. But if more than one request is running at the same time, those numbers may jump by more than 1, because the module level foo variable can be modified by other Lua code that runs while ngx.sleep yields.
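The interleaving can be reproduced outside OpenResty with plain Lua coroutines. In this sketch, which is a simulation and not OpenResty code, each coroutine stands in for one request and coroutine.yield() stands in for the yielding ngx.sleep call. The names handle_request, req_a and req_b are made up for illustration.

```lua
-- Plain Lua simulation: coroutines stand in for concurrent requests,
-- coroutine.yield() stands in for a yielding call such as ngx.sleep().
local foo = 0 -- module level state shared by every simulated request

local function incr()
    foo = foo + 1
    return foo
end

-- Simulated request handler: two increments separated by a yield.
local function handle_request(results)
    results[#results + 1] = incr()
    coroutine.yield() -- other requests may run here
    results[#results + 1] = incr()
end

local out_a, out_b = {}, {}
local req_a = coroutine.create(handle_request)
local req_b = coroutine.create(handle_request)

coroutine.resume(req_a, out_a) -- request A sees 1, then yields
coroutine.resume(req_b, out_b) -- request B sees 2, then yields
coroutine.resume(req_a)        -- request A resumes and sees 3, not 2
coroutine.resume(req_b)        -- request B resumes and sees 4

print(out_a[1], out_a[2]) -- 1 and 3
print(out_b[1], out_b[2]) -- 2 and 4
```

Each increment on its own is still safe, because nothing can interrupt Lua code between two yields. The surprise only comes from assuming the state is unchanged across a yielding call.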

Summary

Use module level globals very carefully, especially in combination with yielding API calls. When used correctly, they are a very efficient way to share data within the same subsystem of the same worker.

Share using shared dictionary

The shared dictionary APIs are provided by OpenResty to allow Lua code to share data between workers.

Shared dictionaries are generally very fast, but each one requires a zone size that is fixed in the configuration and cannot be changed at runtime.
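As a rough illustration, the snippet below could run inside a content_by_lua_block. It assumes a zone declared as lua_shared_dict counters 1m; in nginx.conf. The zone name and the keys are hypothetical. Because the zone cannot grow, it is worth checking what set() reports once the zone fills up.

```lua
-- Assumes nginx.conf contains:  lua_shared_dict counters 1m;
local counters = ngx.shared.counters

-- Atomic across all workers; the third argument initializes a missing key.
local hits, incr_err = counters:incr("hits", 1, 0)
if not hits then
    ngx.log(ngx.ERR, "incr failed: ", incr_err)
end

-- The zone has a fixed size, so inspect all three return values of set().
local ok, set_err, forcible = counters:set("last_uri", ngx.var.uri)
if not ok then
    ngx.log(ngx.ERR, "set failed: ", set_err) -- e.g. "no memory"
elseif forcible then
    ngx.log(ngx.WARN, "zone is full: other valid items were evicted")
end
```

Unlike the module level counter shown earlier, the value returned by incr reflects requests handled by every worker, not just the current one.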

Shared dictionaries do not support arbitrary Lua objects. To store one, you have to serialize it into a native data type that shared dictionaries support (e.g. number, string, boolean).
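One common way to do the serialization is JSON. The sketch below uses the cjson.safe module that ships with OpenResty, and assumes a zone declared as lua_shared_dict my_cache 10m;. The zone name, the module name table_cache.lua and the function names are hypothetical. Any other serializer that produces a string would work the same way.

```lua
-- inside table_cache.lua (hypothetical module name)
-- Assumes nginx.conf contains:  lua_shared_dict my_cache 10m;
local cjson = require("cjson.safe") -- returns nil, err instead of throwing
local ngx = ngx

local _M = {}

-- Serialize a Lua table to a JSON string and store it.
function _M.set_table(key, tbl)
    local str, err = cjson.encode(tbl)
    if not str then
        return nil, "encode failed: " .. tostring(err)
    end
    -- set() returns ok, err, forcible
    return ngx.shared.my_cache:set(key, str)
end

-- Fetch the JSON string and turn it back into a Lua table.
function _M.get_table(key)
    local str, err = ngx.shared.my_cache:get(key)
    if not str then
        return nil, err -- err is nil on a plain cache miss
    end
    return cjson.decode(str)
end

return _M
```

Every call to get_table builds a fresh table, so values that cannot be expressed in JSON, such as functions or metatables, do not survive the round trip.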

Summary

Use shared dictionaries when you have to share data between Lua code running in different NGINX workers.