http://zorba.io/modules/json-csv

This module provides an API for parsing and serializing CSV (comma-separated values) files. See RFC 4180, "Common Format and MIME Type for Comma-Separated Values (CSV) Files."

Function Summary

parse ($csv as string) as object()*

Parses a CSV (comma-separated values) string using the default options.

parse ($csv as string, $options as object()) as object()* external

Parses a CSV (comma-separated values) string using the given options.

serialize ($obj as object()*) as string*

Serializes a sequence of JSON objects as CSV (comma-separated values) using the default options.

serialize ($obj as object()*, $options as object()) as string* external

Serializes a sequence of JSON objects as CSV (comma-separated values) using the given options.

Functions

parse#1

declare function csv:parse($csv as string) as object()*
Parses a CSV (comma-separated values) string using the default options. A newline (U+000A), optionally preceded by a carriage return (U+000D), terminates each line (also called a "record").

Quoted values are always treated as strings. Unless the cast-unquoted-values option is false, the parser attempts to cast each unquoted value to another type, trying integer, decimal, double, and boolean in that order. If every cast fails, the value is treated as a string. Header field names are always treated as strings, even if unquoted.

In addition to the usual boolean values true and false, T and Y are also treated as true, and F and N are also treated as false.
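The casting order described above can be modelled outside Zorba. The following Python sketch is illustrative only and is not the module's implementation: the function name cast_value, the regular expressions (loosely modelled on XML Schema lexical forms), and the case-sensitive matching of T, Y, F, and N are all assumptions.

```python
import re
from decimal import Decimal

# Assumed lexical forms for each type; the module's real rules may differ.
_INTEGER = re.compile(r"[+-]?\d+")
_DECIMAL = re.compile(r"[+-]?(\d+\.\d*|\.\d+)")
_DOUBLE = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)[eE][+-]?\d+")
# Boolean spellings named in the text (matched case-sensitively here).
_BOOLEANS = {"true": True, "T": True, "Y": True,
             "false": False, "F": False, "N": False}


def cast_value(text, quoted=False, cast_unquoted_values=True):
    """Model of the documented order: integer, decimal, double, boolean."""
    if quoted or not cast_unquoted_values:
        return text                      # quoted values stay strings
    if _INTEGER.fullmatch(text):
        return int(text)
    if _DECIMAL.fullmatch(text):
        return Decimal(text)
    if _DOUBLE.fullmatch(text):
        return float(text)
    if text in _BOOLEANS:
        return _BOOLEANS[text]
    return text                          # every cast failed: keep the string
```

Under these assumptions, an unquoted 42 becomes an integer, 3.14 a decimal, 1e3 a double, and Y a boolean, while a quoted "42" remains a string.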

The default options are:

cast-unquoted-values
Whether to attempt to cast unquoted values to integer, decimal, double, or boolean; default: true.
extra-name
The field name for extra values, if any; default: none (error csv:EXTRA_VALUE is raised).
field-names
A JSON array of strings denoting field names; default: none. The first CSV line is assumed to be a header line and the field names are taken from this line.
missing-value
What should happen when a missing value is detected; default: "null". A "missing" value is one of:
  • Two consecutive separator characters.
  • A separator character as either the first or last character on a line.
  • Fewer values than the number of field names.
When a missing value is detected, the value is set to null.
quote-char
The single ASCII character that may be used to quote values; default: " (U+0022).
quote-escape
The single ASCII character used to escape quote-char; default: same as quote-char. This means that an escaped quote is doubled as "".
separator
The single ASCII character used to separate values; default: , (U+002C).
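
To make the roles of separator, quote-char, and quote-escape concrete, here is a small Python sketch that splits one record into fields. It is a model of the documented behavior, not Zorba's parser: the name split_record and the (text, was_quoted) return shape are hypothetical, and malformed input (for example, an unterminated quote) is not handled.

```python
def split_record(line, separator=",", quote_char='"', quote_escape=None):
    """Split one CSV record into (text, was_quoted) pairs."""
    if quote_escape is None:
        quote_escape = quote_char        # default: same as quote-char
    fields, buf = [], []
    quoted = False                       # did the current field use quotes?
    in_quotes = False
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if in_quotes:
            # An escape character followed by quote-char is a literal quote.
            if ch == quote_escape and i + 1 < n and line[i + 1] == quote_char:
                buf.append(quote_char)
                i += 2
                continue
            if ch == quote_char:
                in_quotes = False        # closing quote
            else:
                buf.append(ch)
        elif ch == quote_char:
            in_quotes = True
            quoted = True
        elif ch == separator:
            fields.append(("".join(buf), quoted))
            buf, quoted = [], False
        else:
            buf.append(ch)
        i += 1
    fields.append(("".join(buf), quoted))
    return fields
```

In this model an empty, unquoted field such as the middle one in 1,,3 comes back as ("", False), which corresponds to what the missing-value option calls a missing value.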

Parameters

csv as string
The CSV string to parse.

Returns

object()*
a sequence of zero or more JSON objects where each key is a field name and each value is a parsed value.

parse#2

declare function csv:parse($csv as string, $options as object()) as object()* external
Parses a CSV (comma-separated values) string using the given options. A newline (U+000A), optionally preceded by a carriage return (U+000D), terminates each line (also called a "record").

Quoted values are always treated as strings. Unless the cast-unquoted-values option is false, the parser attempts to cast each unquoted value to another type, trying integer, decimal, double, and boolean in that order. If every cast fails, the value is treated as a string. Header field names are always treated as strings, even if unquoted.

In addition to the usual boolean values true and false, T and Y are also treated as true, and F and N are also treated as false.

Parameters

csv as string
The CSV string to parse.
options as object()
The options to use:
cast-unquoted-values
Whether to attempt to cast unquoted values to integer, decimal, double, or boolean; default: true.
extra-name
The field name for extra values, if any; default: none (error csv:EXTRA_VALUE is raised). If this option is given and a line contains one or more extra values (that is, values that have no corresponding field names), then the extra values are assigned as the values for fields having extra-name as their names.

If extra-name contains a # (U+0023), then the # is substituted with the field number (where field numbers start at 1). If extra-name does not contain a #, then the field number is appended.

field-names
A JSON array of strings denoting field names; default: none. If this option is given, then the first CSV line is assumed not to be a header line; if omitted, then the first CSV line is assumed to be a header line and the field names are taken from this line.
missing-value
What should happen when a missing value is detected; default: "null". A "missing" value is one of:
  • Two consecutive separator characters.
  • A separator character as either the first or last character on a line.
  • Fewer values than the number of field names.
When a missing value is detected, the value of this option determines what happens:
"error"
Error csv:MISSING_VALUE is raised.
"omit"
Both the value and its key are omitted from the result object.
"null"
The value is set to null.
quote-char
The single ASCII character that may be used to quote values; default: " (U+0022).
quote-escape
The single ASCII character used to escape quote-char; default: same as quote-char. If quote-escape equals quote-char, quote-char must be doubled to escape it. Otherwise, quote-escape is placed before quote-char to escape it. For example, with a quote-char of " (U+0022) and a quote-escape of \ (U+005C), quotes are escaped as \".
separator
The single ASCII character used to separate values; default: , (U+002C).
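
The interaction of the extra-name and missing-value options can be sketched as follows. This Python model is illustrative and not taken from the module's source: build_record, extra_field_name, the MISSING sentinel, and CsvError (a stand-in for the csv:EXTRA_VALUE and csv:MISSING_VALUE errors) are hypothetical names. It also assumes that "field number" means the value's position within the record and that every # in extra-name is replaced.

```python
MISSING = object()   # sentinel marking a missing value in a parsed record


class CsvError(Exception):
    """Stand-in for the csv:EXTRA_VALUE and csv:MISSING_VALUE errors."""


def extra_field_name(extra_name, field_number):
    # Substitute '#' with the field number, or append the number if no '#'.
    if "#" in extra_name:
        return extra_name.replace("#", str(field_number))
    return extra_name + str(field_number)


def _store(record, key, value, missing_value):
    if value is not MISSING:
        record[key] = value
    elif missing_value == "error":
        raise CsvError("MISSING_VALUE")
    elif missing_value == "null":
        record[key] = None               # JSON null
    # "omit": neither the key nor the value is added


def build_record(field_names, values, missing_value="null", extra_name=None):
    """Map one record's values onto field names (1-based field numbers)."""
    record = {}
    # Fewer values than field names: the remaining fields are missing.
    padded = list(values) + [MISSING] * (len(field_names) - len(values))
    for number, value in enumerate(padded, start=1):
        if number <= len(field_names):
            key = field_names[number - 1]
        elif extra_name is None:
            raise CsvError("EXTRA_VALUE")
        else:
            key = extra_field_name(extra_name, number)
        _store(record, key, value, missing_value)
    return record
```

With field names id, name, and city, a fourth value would be stored under extra_4 when extra-name is "extra_#" and under extra4 when extra-name is "extra", given the assumptions stated above.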

Returns

object()*
a sequence of zero or more JSON objects where each key is a field name and each value is a parsed value.

serialize#1

declare function csv:serialize($obj as object()*) as string*
Serializes a sequence of JSON objects as CSV (comma-separated values) using the default options. The default options are:
field-names
A JSON array of strings denoting field names; default: none. The field names are taken from the first JSON object and the order of the fields is implementation dependent.
serialize-boolean-as
What strings to serialize true and false as; default: true and false.
serialize-header
Whether a header line is included; default: true. The first string result is the header line, consisting of the names of the objects' keys.
serialize-null-as
What string to serialize JSON null values as; default: null.
quote-char
The single ASCII character that may be used to quote values; default: " (U+0022).
quote-escape
The single ASCII character used to escape quote-char; default: same as quote-char. This means that quote-char is doubled to escape it.
separator
The single ASCII character used to separate values; default: , (U+002C).
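
The quote-escape rule for serialization can be illustrated with a short Python sketch. The helper name quote_value is hypothetical, and the sketch always wraps the value in quotes; whether the module quotes every value or only those that need it is not stated here, so that part is an assumption.

```python
def quote_value(text, quote_char='"', quote_escape=None):
    """Wrap text in quote_char, escaping any embedded quote_char."""
    if quote_escape is None:
        quote_escape = quote_char        # default: quote-char is doubled
    escaped = text.replace(quote_char, quote_escape + quote_char)
    return quote_char + escaped + quote_char


# Default options: each embedded quote is doubled.
print(quote_value('say "hi"'))
# A backslash as quote-escape: each embedded quote is preceded by it.
print(quote_value('say "hi"', quote_escape="\\"))
```

The first call yields a value in which each embedded quote is doubled; the second yields one in which each embedded quote is preceded by a backslash.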

Parameters

obj as object()*
The sequence of JSON objects to serialize.

Returns

string*
a sequence of strings where each string corresponds to a JSON object, aka, "record."

serialize#2

declare function csv:serialize($obj as object()*, $options as object()) as string* external
Serializes a sequence of JSON objects as CSV (comma-separated values) using the given options.

Parameters

obj as object()*
The sequence of JSON objects to serialize.
options as object()
The options to use:
field-names
A JSON array of strings denoting field names; default: none. If this option is not set, the field names are taken from the first JSON object and the order of the fields is implementation dependent. If this option is set, the fields are serialized in the order in which they appear in the array. In either case, every JSON object must have the same keys as the first object.
serialize-boolean-as
What strings to serialize true and false as; default: true and false. This must be a sub-object with the two keys "true" and "false", e.g.: { "true" : "Y", "false" : "N" }.
serialize-header
Whether a header line is included; default: true. If true, the first string result is the header line, consisting of the names of the objects' keys; if false, the header line is not returned.
serialize-null-as
What string to serialize JSON null values as; default: null.
quote-char
The single ASCII character that may be used to quote values; default: " (U+0022).
quote-escape
The single ASCII character used to escape quote-char; default: same as quote-char. If quote-escape equals quote-char, quote-char must be doubled to escape it. Otherwise, quote-escape is placed before quote-char to escape it. For example, with a quote-char of " (U+0022) and a quote-escape of \ (U+005C), quotes are escaped as \".
separator
The single ASCII character used to separate values; default: , (U+002C).
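
How field-names, serialize-header, serialize-boolean-as, and serialize-null-as combine can be modelled in Python. This is a sketch of the documented options only, not the module's implementation: the function name serialize_records and its keyword arguments are hypothetical, quoting is left out entirely, and returning no strings for empty input is an assumption.

```python
def serialize_records(objects, field_names=None, serialize_header=True,
                      boolean_as=None, null_as="null", separator=","):
    """Return one string per record, preceded by an optional header line."""
    if boolean_as is None:
        boolean_as = {"true": "true", "false": "false"}
    objects = list(objects)
    if not objects:
        return []                        # assumption: nothing to serialize
    if field_names is None:
        # Taken from the first object; the real order is implementation
        # dependent, this sketch simply uses the dict's insertion order.
        field_names = list(objects[0])

    def render(value):
        if value is None:
            return null_as
        if value is True:
            return boolean_as["true"]
        if value is False:
            return boolean_as["false"]
        return str(value)                # quoting is not modelled here

    lines = []
    if serialize_header:
        lines.append(separator.join(field_names))
    for obj in objects:
        lines.append(separator.join(render(obj[name]) for name in field_names))
    return lines
```

For two objects with keys id, ok, and note, passing field names in the order ok, id, note together with a boolean mapping of Y and N and an empty null string would produce records such as Y,1, and N,2,x in this model.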

Returns

string*
a sequence of strings where each string corresponds to a JSON object, aka, "record."