Several suggestions on performance tuning:
Try running your Lua scripts with luajit's -jv option to see what is aborting JIT compilation, then work around the NYI (not yet implemented) items shown in the output.
Try running your Lua scripts with the builtin low-overhead profiler, for example, with the -jp=vf20 option. See http://repo.or.cz/w/luajit-2.0.git/blob_plain/refs/heads/v2.1:/doc/ext_profiler.html for details. (Make sure your Lua scripts run long enough, on the order of seconds, because the profiler is based on sampling.)
Try generating a C-land on-CPU flame graph for the luajit process running your scripts: https://github.com/agentzh/nginx-systemtap-toolkit#sample-bt This graph shows the CPU time distribution at the VM level, which is a lower level than the Lua code level that LuaJIT's builtin profiler works on.
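
For the first two suggestions, the invocations might look roughly like the sketch below. It assumes a luajit binary on your PATH, and a v2.1 build for the -jp option (which is the branch the profiler documentation above belongs to). The file name jit_demo.lua and the busy loop inside it are hypothetical stand-ins for your own script, not anything from your code.

```shell
#!/bin/sh
# Hypothetical demo script; the path and the workload are placeholders.
script="${TMPDIR:-/tmp}/jit_demo.lua"

cat > "$script" <<'EOF'
-- Keep a hot loop busy for about two seconds so that a sampling
-- profiler has enough samples to work with.
local function work(n)
  local acc = 0
  for i = 1, n do
    acc = acc + (i % 7) * 0.5
  end
  return acc
end

local t0 = os.clock()
while os.clock() - t0 < 2 do
  work(1e6)
end
EOF

if command -v luajit >/dev/null 2>&1; then
  # Print JIT trace events, including the reasons for trace aborts.
  luajit -jv "$script"
  # Run under the builtin sampling profiler (needs a v2.1 build).
  luajit -jp=vf20 "$script"
else
  echo "luajit not found in PATH; skipping the runs" >&2
fi
```

The exact output depends on your LuaJIT version and on the script, so none is shown here. With your own script, look for the abort lines in the -jv output first, since those usually point at the NYI items to work around.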
Best regards, -agentzh