RumbleDB parameters

The parameters that can be used on the command line as well as on the planned HTTP server are shown below.

RumbleDB runs in three modes: run, serve and repl. You select the mode by passing a verb as the first parameter. For example:

   spark-submit run file.jq -o output-dir -P 1
   spark-submit run -q '1+1'
   spark-submit serve -p 8001
   spark-submit repl -c 10

The previous parameters (--shell, --query-path, --server) still work for backward compatibility, but we recommend switching to the new verb-based format.
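The correspondence between the verbs and the older flags can be sketched as a dry run that only prints command lines. RUMBLE is a hypothetical variable for the launcher prefix: it defaults to spark-submit as written in the examples above, and in a typical Spark setup it would also carry the path to the RumbleDB jar. The pairings are read off the Shortcut column of the parameter table and are an assumption, not something checked against the implementation.

```shell
#!/bin/sh
# Hypothetical launcher prefix; override it to include the path to your RumbleDB jar.
RUMBLE="${RUMBLE:-spark-submit}"

# Verb-based form (recommended).
new_run="$RUMBLE run file.jq -o output-dir"
new_serve="$RUMBLE serve -p 8001"
new_repl="$RUMBLE repl -c 10"

# Assumed legacy equivalents, spelled with the long flag names from the table.
old_run="$RUMBLE --query-path file.jq --output-path output-dir"
old_serve="$RUMBLE --server yes --port 8001"
old_repl="$RUMBLE --shell yes --materialization-cap 10"

# Print each pair; nothing is executed.
printf 'new: %s\nold: %s\n\n' "$new_run" "$old_run"
printf 'new: %s\nold: %s\n\n' "$new_serve" "$old_serve"
printf 'new: %s\nold: %s\n' "$new_repl" "$old_repl"
```

Printing the commands rather than running them keeps the sketch usable on a machine without Spark; copy a printed line into a terminal to try it.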

| Shell parameter | Shortcut | HTTP parameter | Example values | Semantics |
| --- | --- | --- | --- | --- |
| --shell | repl | N/A | yes, no | yes runs the interactive shell; no executes a query specified with --query-path. |
| --shell-filter | N/A | N/A | jq . | Post-processes the output of JSONiq queries on the shell with the specified command (which reads the RumbleDB output via stdin). |
| --query | -q | query | 1+1 | A JSONiq query provided directly as a string. |
| --query-path | (any text without -- or - is recognized as a query path) | query-path | file:///folder/file.jq | A JSONiq query file to read from (from any file system, even the Web!). |
| --output-path | -o | output-path | file:///folder/output | Where to write the output. If the output is large, a sharded directory is created; otherwise a single file is created. |
| --output-format | -f | N/A | json, csv, avro, parquet, or any other format supported by Spark | The format to use for the output. Formats other than json can only be produced if the query outputs a highly structured sequence of objects (if it does not, you can nest your query in an annotate() call to specify a schema). |
| --output-format-option:foo | N/A | N/A | bar | Options that further specify the output format (for example, the separator character for CSV or the compression format). |
| --overwrite | -O (meaning --overwrite yes) | overwrite | yes, no | Whether to overwrite --output-path. With no, an error is thrown if the output file or folder exists. |
| --materialization-cap | -c | materialization-cap | 200 | A cap on the maximum number of items to materialize for large sequences within a query or for output on screen (used to be called --result-size). |
| --number-of-output-partitions | -P | N/A | ad hoc | How many partitions to create in the output, i.e., the number of files that will be created in the output path directory. |
| --log-path | N/A | log-path | file:///folder/log.txt | Where to write log information. |
| --print-iterator-tree | N/A | N/A | yes, no | For debugging purposes, prints out the expression tree and the runtime iterator tree. |
| --show-error-info | -v (meaning --show-error-info yes) | show-error-info | yes, no | For debugging purposes. If you want to report a bug, you can use this to get the full exception stack. With no, only a short message is shown in case of error. |
| --static-typing | -t (meaning --static-typing yes) | static-typing | yes, no | Activates static type analysis, which annotates the expression tree with inferred types at compile time and enables more optimizations (experimental). Deactivated by default. |
| --server | serve | N/A | yes, no | yes runs RumbleDB as a server on port 8001. Run queries with http://localhost:8001/jsoniq?query-path=/folder/foo.json |
| --port | -p | N/A | 8001 (default) | Sets the port of the RumbleDB HTTP server. |
| --host | -h | N/A | localhost (default) | Sets the host of the RumbleDB HTTP server. |
| --variable:foo | N/A | variable:foo | bar | --variable:foo bar initializes the global variable $foo to "bar". The query must contain the corresponding global variable declaration, e.g., "declare variable $foo external;" |
| --context-item | -I | context-item | bar | Initializes the global context item $$ to "bar". The query must contain the corresponding context item declaration, e.g., "declare context item external;" |
| --context-item-input | -i | context-item-input | - | Reads the context item value from the standard input. |
| --context-item-input-format | N/A | context-item-input-format | text or json | Sets the input format to use for parsing the standard input (as text or as a serialized JSON value). |
| --dates-with-timezone | N/A | dates-with-timezone | yes or no | Activates timezone support for the type xs:date (deactivated by default). |
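The parameters for external variables, the context item and the output format are easier to follow in a worked example. The sketch below writes two small query files and prints the matching command lines; it only executes them when EXECUTE=yes is set. RUMBLE, EXECUTE, run_or_print and the file names are hypothetical names introduced for illustration. The compression option is an assumption: it is the name Spark's CSV writer uses, and it is assumed here to be passed through unchanged. file.jq is assumed to return a sequence of objects structured enough for CSV output.

```shell
#!/bin/sh
# Hypothetical launcher prefix; override it to include the path to your RumbleDB jar.
RUMBLE="${RUMBLE:-spark-submit}"

# Print a command line; run it only when EXECUTE=yes (hypothetical switch).
run_or_print() {
    echo "$*"
    if [ "${EXECUTE:-no}" = yes ]; then
        "$@"
    fi
}

workdir=$(mktemp -d)

# 1. External variable: the query must declare $foo as external.
cat > "$workdir/variable.jq" <<'EOF'
declare variable $foo external;
$foo
EOF
# $RUMBLE is left unquoted on purpose so that it splits into separate words.
run_or_print $RUMBLE run "$workdir/variable.jq" --variable:foo bar

# 2. Context item: the query must declare it as external; the value is
#    read from standard input and parsed as JSON.
cat > "$workdir/context.jq" <<'EOF'
declare context item external;
$$.a
EOF
echo '{ "a" : 1 }' |
    run_or_print $RUMBLE run "$workdir/context.jq" \
        --context-item-input - --context-item-input-format json

# 3. Structured output: CSV in a single partition, overwriting any earlier
#    output. "compression" is an assumed pass-through Spark option.
run_or_print $RUMBLE run file.jq -o output-dir -f csv \
    --output-format-option:compression gzip -O -P 1
```

Run it as is to inspect the three command lines, then rerun with EXECUTE=yes once RUMBLE points at a working installation.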