How Narwhal Works
This document explains how to use bin/narwhal through its command line options, environment variables, and configuration files, then descends into the exact, maddening details of how it bootstraps and configures itself.
Glossary
-
module: a JavaScript file that gets its own local scope and certain free variables so that it may export and import APIs.
-
library: a directory that contains additional top-level modules.
-
package: a downloadable and installable component that may include a library of additional modules, as well as executables, source code, or other resources.
-
sandbox: a system of module instances. Sandboxes are not necessarily secure in our parlance, but are the finest security boundary Narwhal can support. All modules in a sandbox are mutually vulnerable to each other and to their containing sandbox. By injecting frozen modules into a sandbox, or through dependency injection using the system variable, it will eventually be possible to construct secure sandboxes. In a secure sandbox, monkey patching globals will not be possible, and strict mode will be enforced. However, all secure sandboxes will be able to share the same primordial objects, particularly Array, so managed communication among sandboxes will be possible.
-
sea: a sea for Narwhal is like a virtual environment. For simplicity, the directory schemas of a package, a sea, and Narwhal itself are all the same. They all have their own configuration and libraries, but Narwhal always searches for packages and modules in the current sea before searching the main, or system, Narwhal installation.
Command Line Options
-
-e -c --command COMMAND: evaluate command (final option)
-
-r --require MODULE: pre-load a module
-
-m --module MAIN: run a module as a script (final option)
-
-I --include LIB: add a library path to loader in the position of highest precedence
-
-p --package PACKAGEPREFIXES: add a package prefix directory
-
-d --debug: set debug mode, system.debug = true
-
-P --no-packages: do not load packages automatically
-
-v --verbose: verbose mode: trace ‘require’ calls.
-
-l --log LEVEL: set the log level (critical, error, warn, info, debug)
-
-: --path DELIMITER: prints an augmented PATH with all package bins/
-
-V --version: print Narwhal version number and exit.
Environment Variables
-
NARWHAL_DEFAULT_ENGINE may be set in narwhal.conf to an engine name like rhino, v8, or xulrunner. Use tusk engines for a complete list and consult the README in that engine directory for details about its function and readiness for use.
-
NARWHAL_ENGINE may be set at the command line, but is otherwise set to NARWHAL_DEFAULT_ENGINE by bin/narwhal and exposed in JavaScript as system.engine. This is the name of the JavaScript engine in use.
-
NARWHAL_HOME is the path to the narwhal directory and is available in JavaScript as system.prefix.
-
NARWHAL_ENGINE_HOME is the path to the narwhal engine directory, where bootstrap.js may be found, and is set by bin/narwhal.
-
NARWHAL_PATH and JS_PATH can be used to add high priority library directories to the module search path. These values are accessible in most sandboxes as the require.loader.paths variable, and may be editable in place with methods like shift, unshift, and splice. Replacing require.loader.paths with a new Array may not have any effect. In secure sandboxes, paths are not available.
-
NARWHAL_DEBUG is an informational variable that can also be set with the -d and --debug command line options, and accessed or changed from within a JavaScript module as system.debug. NARWHAL_DEBUG gets coerced to a Number, and the options stack, so js -ddd -e 'print(system.debug)' will print 3.
-
NARWHAL_VERBOSE instructs the module loader to report when modules have started and finished loading. This environment variable must be used to catalog modules that are loaded in the bootstrapping process. Otherwise, you can use the -v and --verbose options for the same effect for modules that are loaded after the command line arguments have been parsed, which happens before packages are loaded.
-
NARWHAL_DEBUGGER starts Narwhal with a debugger GUI if one is available for the engine. For the Rhino-Java engine, this activates the Rhino Java AWT-based debugger.
-
SEA is an environment variable set by sea that notifies narwhal to search the given virtual environment for packages first. This function can be approximated by using the -p or --package options to the narwhal or js command, and is inspectable from within a module as the variable system.packagePrefixes[0].
-
SEALVL (sea level) is an informational environment variable provided by the sea command, analogous to SHLVL (shell level), that gives the number of instances of sea the present shell is running in.
-
NARWHAL_JS_VERSION refers to the JavaScript version, which defaults to "170" for “1.7.0”, and is used by Rhino on Java to determine the valid JavaScript syntax.
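The stacking behavior described for NARWHAL_DEBUG can be illustrated with a small sketch. The debugLevel helper below is hypothetical and not part of Narwhal; in particular, adding the environment value and the command line flags together is an assumption, since the text only says that the value is coerced to a Number and that the options stack.

```javascript
// Hypothetical helper, not Narwhal's parser: derives a debug level from the
// NARWHAL_DEBUG environment variable plus stacked -d / --debug options.
function debugLevel(env, args) {
    // Coerce the environment value to a Number; missing or non-numeric is 0.
    var level = Number(env.NARWHAL_DEBUG) || 0;
    args.forEach(function (arg) {
        if (arg === "--debug") {
            level += 1;
        } else if (/^-d+$/.test(arg)) {
            // "-ddd" counts as three occurrences of -d.
            level += arg.length - 1;
        }
    });
    return level;
}

console.log(debugLevel({}, ["-ddd"])); // 3
```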
Configuration Files
-
narwhal.conf may be provided to configure site-specific or virtual-environment (sea) specific environment variables like NARWHAL_DEFAULT_ENGINE. You can also opt to specify NARWHAL_ENGINE, but that obviates the possibility of allowing the user to override the narwhal engine at the command line. narwhal.conf follows the BSD convention of using shell scripts as configuration files, so you may use any bash syntax in this file. A narwhal.conf.template exists for illustration.
-
package.json describes the Narwhal package. Narwhal itself is laid out as a package, so it might be used as a standard library package for other engines that might host module systems independently. package.json names the package, its metadata, and its dependencies. package.json should not be edited.
-
local.json may be created to override the values provided in package.json for site-specific configurations. A local.json.template exists to illustrate how this might be used to tell Narwhal that the parent directory contains packages, as this is a common development scenario.
-
sources.json contains data for Tusk on where to find package.json files and package.zip archives so that it can create a catalog of all installable packages, their descriptions, and dependencies. This file should not be edited unless the intention is to update the defaults provided for everyone.
-
.tusk/sources.json may be created for site-specific package sources and overrides the normal sources.json.
-
catalog.json is meant to be maintained as a centrally managed catalog that may be downloaded from Github to .tusk/catalog.json using tusk update.
-
.tusk/catalog.json is where tusk looks for information about packages that can be downloaded and installed. It may be downloaded with tusk update or built from sources.json or .tusk/sources.json using tusk create-catalog.
Bootstrapping Narwhal
Narwhal launches in stages. On UNIX-like systems, Narwhal starts with a bash script, then runs an engine-specific bash script, an engine-specific JavaScript file, and finally the common JavaScript.
-
bin/narwhal: At this stage, Narwhal uses only environment variables for configuration. This script discovers its own location on the file system and sources narwhal.conf as a shell script to load any system-level configuration variables like NARWHAL_DEFAULT_ENGINE. From there, it discerns and exports the NARWHAL_ENGINE and NARWHAL_ENGINE_HOME environment variables. It then executes the engine-specific script, $NARWHAL_ENGINE_HOME/bin/narwhal-$NARWHAL_ENGINE.
-
engines/{engine}/bin/narwhal-{engine}: This bash script performs some engine-specific configuration, like augmenting the Java CLASSPATH for the Rhino engine, and executes the engine-specific bootstrap JavaScript using that engine's JavaScript interpreter.
Some engines, like k7, require the JavaScript engine to be on the PATH. The Rhino engine just expects Java to be on the PATH, and uses the js.jar included in the repository.
-
engines/{engine}/bootstrap.js: This engine-specific JavaScript uses whatever minimal mechanisms the JavaScript engine provides for reading files and environment variables to read and evaluate narwhal.js. narwhal.js evaluates to a function expression that accepts a zygotic system Object, to be replaced later by loading the system module proper. bootstrap.js provides a system object with global, evalGlobal, engine, an engines Array, print, fs.read, fs.isFile, prefix, packagePrefixes, and optionally evaluate, debug, or verbose.
-
global is the global Object. This is passed explicitly in anticipation of times when it will be much harder to grab this object in engines where its name varies (like window, or this) and where it will be unsafe to assume that this defaults to global for functions called anonymously.
-
evalGlobal is a function that calls eval in a scope where no global variables are masked by local variables, but var declarations are localized. This is passed explicitly in anticipation of situations down the line where it will be harder to call eval in a pristine scope chain.
-
engine is a synonym for the NARWHAL_ENGINE environment variable, the name of the engine. This variable is informational.
-
prefix is a synonym for the NARWHAL_HOME environment variable, the path leading to the narwhal package containing bin/narwhal.
-
packagePrefixes is a prioritized Array of all of the package directories to search for packages when that time comes. The first package prefix should be the SEA environment variable, if it exists and has a path. This is the first place that the packages module will look for packages to load. The last package prefix is simply the prefix, NARWHAL_HOME. The SEA prefix appears first so that virtual environments can load their own package versions.
-
engines is an Array of engine names, used to extend the module search path at various stages to include engine-specific libraries. There will usually be more than one engine in this Array. For Rhino, it is ['rhino', 'default']. The default engine contains many “catch-all” modules that, while being engine-specific, are also general enough to be shared among almost all engines. Other engines are likely to share dynamically linked C modules in a “c” engine, and the “rhino” engine itself is useful for the “helma” engine.
-
print is a temporary shortcut for writing a line to a logging console or standard output, favoring the latter if it is available.
-
fs is a primitive duck-type of the file module, which will be loaded later. The module loader uses read and isFile to load the initial modules.
-
evaluate is a module evaluator. If the engine does not provide an evaluator, the sandbox module has a suitable default, but some engines provide their own. For example, the “secure” engine injects a safe, hermetic evaluator. evaluate accepts a module as a String, and optionally a file name and line number for debugging purposes. evaluate returns a module factory Function that accepts require, exports, module, system, and print, the module-specific free variables for getting the exported APIs of other modules, providing their own exports, reading their metadata, and conveniently accessing the system module and print function respectively.
-
debug is informational, may be used anywhere, is read from the NARWHAL_DEBUG environment variable, and may be set later by the -d or --debug command options.
-
verbose instructs the module loader to log when modules start and finish loading, is read from the NARWHAL_VERBOSE environment variable, and may be set later by the -v or --verbose command options. To log the coming and going of modules as they occur before the packages and program modules get loaded, you must use the environment variable.
-
-
narwhal.js: This is the common script that creates a module loader, makes the global scope consistent across engines, finishes the system module, parses command line arguments, loads packages, executes the desired program, and finally calls the unload event for cleanup or running a daemon event loop.
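The contract of the evaluate function described in the list above can be sketched in a few lines. This is a minimal illustration built on the standard Function constructor, not the evaluator that any particular Narwhal engine ships; the file name and line number are accepted but ignored here, and the module text and the doubleExports name are made up for the example.

```javascript
// Hypothetical stand-in for an engine-provided evaluator. It takes module
// text and returns a module factory with the five free variables named in
// the text. fileName and lineNo exist for debugging and are unused here.
function evaluate(text, fileName, lineNo) {
    return new Function("require", "exports", "module", "system", "print", text);
}

var factory = evaluate(
    "exports.double = function (n) { return n * 2; };",
    "double.js",
    1
);

// Instantiate the module by calling its factory with fresh free variables.
var doubleExports = {};
factory(
    function () { throw new Error("require is not wired up in this sketch"); },
    doubleExports,
    {id: "double"},
    {},
    function (line) { console.log(line); }
);

console.log(doubleExports.double(21)); // 42
```

Because the factory only sees what it is handed, an embedder can inject a different require, system, or print for each sandbox without touching the module text.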
When Narwhal is embedded, the recommended practice is to load the bootstrap.js engine script directly, skipping the shell script phases.
Some engines, like Helma or GPSEE, may provide their own module loader implementation. In that case, they may bypass all of this bootstrapping business and simply include Narwhal as if it were a mere package.
No system has been constructed for Windows systems yet.
Narwhal Script
The narwhal.js script is the next layer of blubber.
-
sandbox module (loaded manually from lib/sandbox.js), provides the means to construct a require function so all other modules can be loaded.
-
global module, monkey patches the transitive globals so that every engine receives the same ServerJS and EcmaScript 5 global object, or as near to that as possible.
-
system module, including the file and logger modules, which is provided for convenience as a free variable in all modules.
-
narwhal module parses arguments.
-
packages module loads packages.
-
packages-engine loads jars for Java/Rhino.
-
run command
-
unload module sends an unload signal to any observers, usually for cleanup or to kick off event loops.
Sandbox Module
The sandbox module provides a basic module Loader for module files on disk, a MultiLoader for pluggable module factory loaders (for things like Objective-J modules and dynamically linked C modules), and a Sandbox for creating and memoizing module instances from the module factories. The sandbox module is also useful for creating new sandboxes from within the main sandbox, for example to build cheap module system reloaders that instantiate fresh modules but only go to disk when the underlying module text has changed.
Global Module
The global module is engine-specific, and there is a sharable version in the default engine. The purpose of the global module is to load modules like “json”, “string”, “array”, and “binary”, which monkey patch the globals if necessary to bring every engine up to speed with EcmaScript 5 and the ServerJS standard.
System Module
The system module provides the ServerJS System module standard, for standard IO streams, arguments, and environment variables. The system module goes beyond spec by being a free variable available in all modules, and by providing print, fs, and log variables (at the time of this writing). print is a late-bound alias for system.stdout.print, which is to say that replacing system.stdout will cause print to redirect to the new output stream. fs is an alias for the file module, while log is a Logger instance from the logger module that prints time-stamped log messages to system.stderr.
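What "late-bound" means for print can be shown with a stand-in object. The system object below is a plain object invented for this sketch, not Narwhal's system module, and console.log stands in for a real output stream.

```javascript
// Stand-in for the system module; console.log plays the role of stdout.
var system = {
    stdout: {
        print: function (line) { console.log(line); }
    }
};

// Late-bound alias: system.stdout is looked up on every call rather than
// captured once, so replacing the stream redirects later output.
system.print = function () {
    return system.stdout.print.apply(system.stdout, arguments);
};

var captured = [];
system.print("goes to the original stream");

system.stdout = {
    print: function (line) { captured.push(line); }
};
system.print("goes to the replacement stream");

console.log(captured.length); // 1
```

An early-bound alias such as system.print = system.stdout.print would keep writing to the original stream after the replacement.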
Narwhal Module
The Narwhal module contains the command line parser declarations for Narwhal, and an Easter egg.
Packages Module
The packages module analyzes and installs packages, such that their libraries are available in the module search path, and also installs some engine-specific package components like Java archives at run-time. The package loader uses a five pass algorithm:
-
find and read package.json for every accessible package, collating them into a catalog. This involves a breadth first topological search of the packages/ directory of each package in the system.packagePrefixes Array. This guarantees that the packages installed in the Sea (virtual environment) can override the versions installed with the system.
-
verify that the catalog is internally consistent, dropping any package that depends on another package that is not installed.
-
sort the libraries from packages so that libraries that “depend” on other packages get higher precedence in the module search path.
-
“analyze” the packages in order. This involves finding the library directories in each package, including engine-specific libraries for all of the system.engines, and performing engine-specific analysis like finding the Java archives (jars) installed in each package.
-
“synthesize” a configuration from the analysis. This involves setting the module search path, and performing engine-specific synthesis, like installing a Java class loader for the Java archives, and creating a new, global Packages object.
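The first two passes can be sketched on in-memory data. Everything below is illustrative: buildCatalog and verifyCatalog are hypothetical names, the descriptors are made up, the representation of dependencies as an Array of package names is an assumption, and dropping packages transitively is one plausible reading of "internally consistent" rather than confirmed behavior.

```javascript
// Pass 1 (sketch): collate descriptors into a catalog. The input is one
// object per package prefix, in system.packagePrefixes order (sea first).
function buildCatalog(prefixes) {
    var catalog = {};
    prefixes.forEach(function (packages) {
        Object.keys(packages).forEach(function (name) {
            // The first prefix to provide a name wins, so sea packages
            // shadow the versions installed with the system.
            if (!Object.prototype.hasOwnProperty.call(catalog, name)) {
                catalog[name] = packages[name];
            }
        });
    });
    return catalog;
}

// Pass 2 (sketch): drop packages with missing dependencies, repeating
// until stable because a removal can invalidate dependent packages.
function verifyCatalog(catalog) {
    var changed = true;
    while (changed) {
        changed = false;
        Object.keys(catalog).forEach(function (name) {
            var dependencies = catalog[name].dependencies || [];
            var satisfied = dependencies.every(function (dependency) {
                return Object.prototype.hasOwnProperty.call(catalog, dependency);
            });
            if (!satisfied) {
                delete catalog[name];
                changed = true;
            }
        });
    }
    return catalog;
}

var sea = {jack: {version: "0.2", dependencies: ["narwhal"]}};
var home = {
    narwhal: {version: "0.1"},
    jack: {version: "0.1", dependencies: ["narwhal"]},
    orphan: {dependencies: ["missing"]},
    child: {dependencies: ["orphan"]}
};

var catalog = verifyCatalog(buildCatalog([sea, home]));
console.log(Object.keys(catalog).sort().join(", ")); // jack, narwhal
console.log(catalog.jack.version); // 0.2
```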
Much of the weight of code in the packages module concerns both using the conventional locations for libraries and whatnot and handling overridden configuration values, gracefully accepting both single Strings and Arrays of multiple options for all directories. For example, packages assumes that each package has a lib directory. However, the package may provide a package.json that states that lib has been put somewhere else, like {"lib": "lib/js"}, or even multiple locations like {"lib": ["lib/js", "usr/lib/js"]}. This applies to “packages” and “jars” as well.
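The String-or-Array handling described above comes down to a small normalization step. The configuredPaths helper below is a hypothetical illustration of that idea, not a function from the packages module.

```javascript
// Hypothetical helper: returns the configured directories for a key as an
// Array, falling back to the conventional location when the key is absent.
function configuredPaths(descriptor, key, conventional) {
    var value = descriptor[key];
    if (value === undefined) {
        return [conventional];
    }
    // Accept either a single String or an Array of Strings.
    return Array.isArray(value) ? value.slice() : [value];
}

console.log(configuredPaths({}, "lib", "lib"));
console.log(configuredPaths({"lib": "lib/js"}, "lib", "lib"));
console.log(configuredPaths({"lib": ["lib/js", "usr/lib/js"]}, "lib", "lib"));
```

Normalizing once at the edge lets the later analysis passes treat every directory setting as an Array.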
Unload Module
When the program is finished, Narwhal checks whether the “unload” module has been used. If so, it calls the “send” function exported by that module, so that any observers attached with the “when” method get called in first on first off order. This is handy for modules like “reactor” that initiate an event loop.
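The observer pattern described above can be sketched in a few lines. The when and send names come from the text; the bodies are an assumed minimal implementation, not the source of the real unload module.

```javascript
// Minimal sketch of an unload-style module: observers are registered with
// when and called by send in the order they were attached.
var observers = [];

function when(observer) {
    observers.push(observer);
}

function send() {
    observers.forEach(function (observer) {
        observer();
    });
}

var order = [];
when(function () { order.push("first"); });
when(function () { order.push("second"); });

// Narwhal would call send once the main program has finished.
send();
console.log(order.join(","));
```

An event loop module would register its run function with when, so the loop starts only after the main program has finished setting up.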