2.2.3. Data Types

2.2.3.1. Address

  • Type: addr
  • Example constants: 192.168.1.1, [2001:db8:85a3:8d3:1319:8a2e:370:7348], [::1]
  • Addresses are passed by value.

The addr type stores IP addresses. It handles IPv4 and IPv6 addresses transparently. Note that IPv6 constants need to be enclosed in brackets. For a given addr instance, family() retrieves the family.

Operators

address == address

Compares two address values, returning True if they are equal.

Methods

family()

Returns the IP family of an address value, as a spicy::AddrFamily.

2.2.3.2. Bool

  • Type: bool
  • Example constants: True, False
  • Booleans are passed by value.

[TODO: Overview]

Operators

bool == bool

Compares two boolean values, returning True if they are equal.

bool && bool

Returns the logical “and” of two booleans.

bool || bool

Returns the logical “or” of two booleans.

! bool

Negates a boolean value.

Methods

None defined.

2.2.3.3. Bytes

[TODO: Overview]

Operators

bytes == bytes

Compares two bytes values.

bytes + bytes

Concatenates two bytes values.

bytes += bytes

Appends a bytes value to another one.

|bytes|

Returns the length of the bytes instance.

Methods

begin()

Returns an iterator pointing to the initial element.

decode(charset: enum)

Interprets the bytes as a binary string encoded with the given character set, and converts it into a UTF-8 string.

end()

Returns an iterator pointing one beyond the last element.

join(l: list)

Renders the elements of l into textual form and joins them into a single bytes object using the given one as separator.

lower()

Returns a lower-case version.

match(r: regexp, n: [ int ])

Matches the bytes object against the regular expression r. Returns the matching part or, if n is given, the corresponding subgroup within r.
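To illustrate the two return modes of match(), the following Python sketch models them with the standard re module. It is not Spicy code: the helper name match_bytes, Python's regular-expression syntax, anchoring the match at the start of the input, and returning None when nothing matches are all assumptions made for the example, since the text above does not specify them.

```python
import re

def match_bytes(data: bytes, pattern: bytes, n: int = 0):
    """Model of bytes.match(r, n): the matched part, or subgroup n if given."""
    m = re.match(pattern, data)   # assumption: the match is anchored at the start
    if m is None:
        return None               # assumption: behavior on a failed match is unspecified
    return m.group(n)             # group 0 is the whole match

# Whole match versus the second subgroup of the same expression.
whole = match_bytes(b"GET /index.html", rb"([A-Z]+) (\S+)")
path = match_bytes(b"GET /index.html", rb"([A-Z]+) (\S+)", 2)
```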

split(sep: [ bytes ])

Splits at each occurrence of sep, returning a vector of bytes representing each piece excluding the separators. If sep is omitted, the default is to split at any sequence of white-space.

split1(sep: [ bytes ])

Splits at the first occurrence of sep, returning a pair of bytes representing everything before and afterwards, respectively. If sep is omitted, the default is to split at any sequence of white-space.
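The difference between split() and split1() can be sketched in Python, whose bytes.split() has a similar shape. This is a model only, not Spicy code. How Spicy treats edge cases (adjacent separators, or a separator that never occurs) is not stated above, so the handling of those cases here is an assumption.

```python
from typing import List, Optional, Tuple

def split(data: bytes, sep: Optional[bytes] = None) -> List[bytes]:
    # With sep=None, Python splits at any run of ASCII whitespace.
    return data.split(sep)

def split1(data: bytes, sep: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    parts = data.split(sep, 1)      # split at the first occurrence only
    if len(parts) == 2:
        return parts[0], parts[1]
    # Assumption: if no separator is found, the second element is empty.
    return (parts[0] if parts else b""), b""

pieces = split(b"a,b,c", b",")
header = split1(b"key: value: x", b": ")
```

Here split() yields every piece, while split1() leaves any later separators untouched in the second element of the pair.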

startswith(b: bytes)

Returns true if the bytes object begins with b.

strip(side: [ enum ], chars: [ bytes ])

Strips off leading and/or trailing characters, as indicated by side; if side is not given, both ends are stripped. By default, all whitespace is stripped; alternatively, any characters contained in chars.
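A small Python model of the side and chars parameters follows. It is not Spicy code. The labels of the side enum are not given in the text above, so Side.LEFT, Side.RIGHT and Side.BOTH are hypothetical names, and treating a missing side as "both ends" is an interpretation.

```python
from enum import Enum
from typing import Optional

class Side(Enum):
    # Hypothetical labels; the real enum's labels are not listed in this section.
    LEFT = 1
    RIGHT = 2
    BOTH = 3

def strip(data: bytes, side: Side = Side.BOTH, chars: Optional[bytes] = None) -> bytes:
    # chars=None means "strip whitespace", matching the documented default.
    if side is Side.LEFT:
        return data.lstrip(chars)
    if side is Side.RIGHT:
        return data.rstrip(chars)
    return data.strip(chars)

trimmed = strip(b"  x  ")
right_only = strip(b"--x--", Side.RIGHT, b"-")
```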

to_int(base: [ uint<64> ])

Interprets the bytes as representing an ASCII-encoded number and converts it into a signed integer, using a base of base. If base is not given, the default is 10.

to_int(byte_order: enum)

Interprets the bytes as a binary number encoded with the given byte order, and converts it into a signed integer.

to_time(base: [ uint<64> ])

Interprets the bytes as representing a number of seconds since the epoch in the form of an ASCII-encoded number and converts it into a time value, using a base of base. If base is not given, the default is 10.

to_time(byte_order: enum)

Interprets the bytes as a number of seconds since the epoch, in the form of a binary number encoded with the given byte order, and converts it into a time value.

to_uint(base: [ uint<64> ])

Interprets the bytes as representing an ASCII-encoded number and converts it into an unsigned integer, using a base of base. If base is not given, the default is 10.

to_uint(byte_order: enum)

Interprets the bytes as a binary number encoded with the given byte order, and converts it into an unsigned integer.
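The ASCII and binary overloads of to_int() and to_uint() read the same bytes very differently. The Python sketch below contrasts them; it is a model, not Spicy code. The function names are made up for the example, the byte order is passed as the strings "big" or "little" rather than as a Spicy enum, and reading signed binary values as two's complement is an assumption.

```python
def to_int_ascii(data: bytes, base: int = 10) -> int:
    # b"-42" holds the characters '-', '4', '2'; parse them as text.
    return int(data.decode("ascii"), base)

def to_int_binary(data: bytes, byte_order: str) -> int:
    # The bytes are the number itself; assumption: two's complement for signed values.
    return int.from_bytes(data, byte_order, signed=True)

def to_uint_binary(data: bytes, byte_order: str) -> int:
    return int.from_bytes(data, byte_order, signed=False)

decimal = to_int_ascii(b"-42")
hexadecimal = to_int_ascii(b"ff", 16)
signed_be = to_int_binary(b"\xff\xfe", "big")
unsigned_be = to_uint_binary(b"\xff\xfe", "big")
unsigned_le = to_uint_binary(b"\x01\x02", "little")
```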

upper()

Returns an upper-case version.

2.2.3.4. Double

[TODO: Overview]

Operators

cast<int>(double)

Casts a double into an integer value, truncating any fractional value.

double coerces to bool

Doubles coerce to boolean, returning true if the value is non-zero.

double / double

Divides two doubles.

double == double

Compares two doubles.

double > double

Returns whether the first double is larger than the second.

double < double

Returns whether the first double is smaller than the second.

double - double

Subtracts two doubles.

double mod double

Returns the remainder of dividing two doubles.

double * double

Multiplies two doubles.

double + double

Adds two doubles.

double ** double

Raises a double to a given power.

Methods

None defined.

2.2.3.5. Enum

[TODO: Overview]

Operators

t:type(any)

Converts an integer into an enum.

cast<int>(enum)

Casts an enum into an integer, returning a value that is consistent and unique among all labels of the enum’s type.

enum coerces to bool

Enums coerce to boolean, returning true if the value corresponds to a known label.

enum == enum

Compares two enum values.

Methods

None defined.

2.2.3.6. Function

[TODO: Overview]

Operators

t:function():void(any)

Calls a function.

Methods

None defined.

2.2.3.7. Integer

[TODO: Overview]

Operators

int & int

Computes the bitwise and of two integers.

int | int

Computes the bitwise or of two integers.

int ^ int

Computes the bitwise xor of two integers.

cast<int>(int)

Casts an integer into a different integer type, extending/truncating as needed.

cast<interval>(int)

Casts an unsigned integer into an interval, interpreting the value as seconds.

cast<time>(int)

Casts an unsigned integer into a time, interpreting the value as seconds since the epoch.

int coerces to bool

Integers coerce to boolean, returning true if the value is non-zero.

int coerces to double

Unsigned integers coerce to doubles.

int coerces to int

Integers coerce to other integer types if their signedness matches and the target width is larger or equal.
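The integer-to-integer coercion rule can be written down as a small predicate. The Python sketch below is a model of the rule as stated, not Spicy code. The IntType class and the coerces_to name are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IntType:
    # Hypothetical stand-in for an integer type such as int<32> or uint<16>.
    signed: bool
    width: int

def coerces_to(src: IntType, dst: IntType) -> bool:
    # Same signedness, and the destination must be at least as wide.
    return src.signed == dst.signed and dst.width >= src.width

uint16 = IntType(signed=False, width=16)
uint32 = IntType(signed=False, width=32)
int8 = IntType(signed=True, width=8)

widening_ok = coerces_to(uint16, uint32)
narrowing_ok = coerces_to(uint32, uint16)
mixed_sign_ok = coerces_to(int8, uint32)
```

Conversions that the predicate rejects, such as narrowing or changing signedness, would need the explicit cast<int> operator instead.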

int coerces to interval

Unsigned integers coerce to intervals.

int / int

Divides two integers.

int == int

Compares two integers for equality.

int > int

Returns whether the first integer is larger than the second.

int < int

Returns whether the first integer is smaller than the second.

int - int

Subtracts two integers.

int -= int

Decreases an integer by a given amount.

int mod int

Returns the remainder of dividing two integers.

int * int

Multiplies two integers.

int + int

Adds two integers.

int += int

Increases an integer by a given amount.

int ** int

Raises an integer to a given power.

int << int

Shifts an integer left by a given number of bits.

int >> int

Shifts an integer right by a given number of bits.

Methods

None defined.

2.2.3.8. Interval

[TODO: Overview]

Operators

cast<double>(interval)

Casts an interval into a double.

cast<int>(interval)

Casts an interval into an integer value, truncating any fractional value.

interval coerces to bool

Intervals coerce to boolean, returning true if the value is non-zero.

interval == interval

Compares two intervals.

interval > interval

Returns whether the first interval is larger than the second.

interval < interval

Returns whether the first interval is smaller than the second.

interval - interval

Subtracts two intervals.

interval * int

Multiplies an interval with an integer.

int * interval

Multiplies an integer with an interval.

interval + interval

Adds two intervals.

Methods

nsecs()

Returns the interval as nanoseconds.

2.2.3.9. Iterator

[TODO: Overview]

Operators

*iterator

Returns the element referenced by the iterator.

iterator == iterator

Compares two iterators.

iterator++

Advances the iterator by one element, returning the previous iterator.

++iterator

Advances the iterator by one element, returning the new iterator.

iterator + int

Returns an iterator advanced by a given number of elements.

iterator += int

Advances the iterator by a given number of elements.

Methods

None defined.

2.2.3.10. List

[TODO: Overview]

Operators

list += list

Appends a list value to another one.

|list|

Returns the length of the list.

Methods

push_back(elem: any)

Appends an element to the list.

2.2.3.11. Map

[TODO: Overview]

Operators

delete map<*,*>[any]

Deletes an element from the map. If the element does not exist, there’s no effect.

any in map<*,*>

Returns true if there’s a map element with the given index.

map<*,*>[any]

Returns the map element at the given index.

map<*,*>[any]=any

Assigns an element to the given index of the map. Any already existing element will be overwritten.

|map<*,*>|

Returns the number of elements in the map.

Methods

clear()

Removes all elements from the map.

get(index: any, default: [ any ])

Returns the map element at the given index. If the element does not exist, default is returned if given.

2.2.3.12. Set

[TODO: Overview]

Operators

add set[any]

Adds an element to the set.

delete set[any]

Deletes an element from the set. If the element does not exist, there’s no effect.

any in set

Returns true if the element is a member of the set.

|set|

Returns the number of elements in the set.

Methods

clear()

Removes all elements from the set.

2.2.3.13. Sink

[TODO: Overview]

Operators

new sink

Instantiates a new sink.

|sink|

Returns the number of bytes written into the sink so far. If the sink has filters attached, this returns the value after filtering.

Methods

add_filter(t: enum)

Adds an input filter as specified by t (of type Spicy::Filter) to the sink. The filter receives all input written into the sink first and transforms it according to its semantics; the parser attached to the unit then parses the output of the filter. Multiple filters can be added to a sink, in which case they are chained into a pipeline and the data is passed through them in the order in which they were added. Parsing is then carried out on the output of the last filter in the chain. Note that filters must be added before the first data chunk is written into the sink. If data has already been written when a filter is added, behaviour is undefined. Currently, only a set of predefined filters can be used; see Spicy::Filter. One cannot define custom filters in Spicy. Todo: We should probably either enable adding filters later, or catch the case of adding them too late at run-time and abort with an exception.
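The chaining order of filters can be modelled in a few lines of Python. This is a sketch of the pipeline idea only, not Spicy code. The FilteredSink class is hypothetical, base64 and zlib decoding are stand-ins chosen for the example (the contents of Spicy::Filter are not listed here), each filter transforms a whole chunk rather than a stream, and the sketch rejects late filters with an error even though the text says that case is undefined.

```python
import base64
import zlib

class FilteredSink:
    """Model of a sink whose input passes through filters in the order added."""

    def __init__(self, parser):
        self._filters = []
        self._parser = parser       # callable that receives the filtered data
        self._written = False

    def add_filter(self, transform):
        if self._written:
            # Assumption: the source leaves this undefined; the sketch refuses.
            raise RuntimeError("filter added after data was written")
        self._filters.append(transform)

    def write(self, data: bytes):
        self._written = True
        for transform in self._filters:   # first added runs first
            data = transform(data)
        self._parser(data)                # parse the output of the last filter

received = []
sink = FilteredSink(received.append)
sink.add_filter(base64.b64decode)   # stage 1: undo the base64 encoding
sink.add_filter(zlib.decompress)    # stage 2: inflate the result of stage 1
sink.write(base64.b64encode(zlib.compress(b"hello")))
```

Swapping the two add_filter() calls would make the pipeline fail, which is why the order of addition matters.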

close()

Closes a sink by disconnecting all parsing units. Afterwards, the sink’s state is as if it had just been created (so new units can be connected). Note that a sink is automatically closed when the unit it is part of is done parsing. Also note that a previously connected parsing unit cannot be reconnected; trying to do so will still throw a UnitAlreadyConnected exception.

connect(u: unit)

Connects a parsing unit to a sink. All subsequent write() calls will pass their data to this parsing unit. Each unit can only be connected to a single sink. If the unit is already connected, a UnitAlreadyConnected exception is thrown. However, a sink can have more than one unit connected.

connect_mime_type(b: bytes)

Connects a parsing unit to a sink. All subsequent write() calls will pass their data to this parsing unit. Each unit can only be connected to a single sink. If the unit is already connected, a UnitAlreadyConnected exception is thrown. However, a sink can have more than one unit connected.

connect_mime_type(b: string)

Connects a parsing unit to a sink. All subsequent write() calls will pass their data to this parsing unit. Each unit can only be connected to a single sink. If the unit is already connected, a UnitAlreadyConnected exception is thrown. However, a sink can have more than one unit connected.

gap(seq: uint<64>, len: uint<64>)

Reports a gap in the input stream. seq is the sequence number of the first byte missing, len is the length of the gap. seq is relative to the sink’s initial sequence number, which defaults to zero.

sequence()

Returns the current sequence number of the sink’s input stream, which is one beyond all data that has been put in order and delivered so far. The returned value is relative to the sink’s initial sequence number, which defaults to zero.

set_auto_trim(enabled: bool)

Enables or disables auto-trimming. If enabled (which is the default), sink input data is trimmed automatically once it is in order and processed. See trim() for more information about trimming. TODO: Disabling auto-trimming is not yet supported.

set_initial_sequence_number(seq: uint<64>)

Sets the sink’s initial sequence number. All sequence numbers given to other methods are interpreted relative to this one. By default, a sink’s initial sequence number is zero.

set_policy(policy: enum)

Sets a sink’s reassembly policy for ambiguous input. As long as data hasn’t been trimmed, a sink detects overlapping chunks. The policy decides how to handle ambiguous overlaps. The default policy is Spicy::ReassemblyPolicy::First, which resolves ambiguities by taking the data from the chunk that came first. TODO: First is currently the only policy supported.

skip(seq: uint<64>)

Skips ahead in the input stream. seq is the sequence number at which to continue parsing, relative to the sink’s initial sequence number. If there’s still data buffered before that position, it will be ignored and, if auto-skip is on, also immediately deleted. If new data is passed in later before seq, that will likewise be ignored. If the input stream is currently stuck inside a gap, and seq is beyond that gap, the stream will resume processing at seq.

trim(seq: uint<64>)

Deletes all data that’s still buffered internally up to seq. seq is relative to the sink’s initial sequence number, which defaults to zero. If processing the input stream hasn’t reached seq yet, it will also skip ahead to there. Trimming the input stream releases the memory, but means that the sink won’t be able to detect any further data mismatches. Note that by default, auto-trimming is enabled, which means all data is trimmed automatically once it is in order and processed.

try_connect_mime_type(b: bytes)

Connects a parsing unit to a sink. All subsequent write() calls will pass their data to this parsing unit. Each unit can only be connected to a single sink. If the unit is already connected, a UnitAlreadyConnected exception is thrown. However, a sink can have more than one unit connected.

try_connect_mime_type(b: string)

Connects a parsing unit to a sink. All subsequent write() calls will pass their data to this parsing unit. Each unit can only be connected to a single sink. If the unit is already connected, a UnitAlreadyConnected exception is thrown. However, a sink can have more than one unit connected.

write(b: bytes, seq: [ uint<64> ], len: [ uint<64> ])

Passes data on to all connected parsing units. Multiple write() calls act like passing in incremental input: the units parse the chunks as if they were a single stream of data. If data is passed in out of order, it is reassembled before being passed on, according to the sequence number seq provided; seq is interpreted relative to the initial sequence number set with set_initial_sequence_number(), or 0 if not otherwise set. If no sequence number is provided, the data is assumed to represent a chunk to be appended to the current end of the input stream. If len is provided, the data is assumed to represent that many bytes inside the sequence space; if not provided, len defaults to the length of b. If no units are connected, the call does not have any effect. If one parsing unit throws an exception, parsing of subsequent units does not proceed. Note that the order in which the data is passed to the individual units is undefined. Todo: The exception semantics are quite fuzzy. What’s the right strategy here?
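The interplay of write(), sequence numbers, and the First reassembly policy can be made concrete with a toy model. The Python sketch below is not Spicy code and is a deliberate simplification. The SinkModel class is hypothetical, the initial sequence number is fixed at zero, the len parameter, gap(), skip() and explicit trim() are left out, units are plain callables, and "current end of the input stream" is taken to mean the highest position written so far, which is an assumption.

```python
class SinkModel:
    """Toy reassembler: buffers out-of-order bytes, delivers them in order."""

    def __init__(self):
        self._units = []     # connected parsing units, modelled as callables
        self._pending = {}   # position -> byte value, not yet delivered
        self._next = 0       # position of the next byte to deliver
        self._end = 0        # assumption: highest position written so far

    def connect(self, unit):
        self._units.append(unit)

    def sequence(self):
        # One beyond all data that has been put in order and delivered.
        return self._next

    def write(self, data: bytes, seq=None):
        start = self._end if seq is None else seq
        for i, byte in enumerate(data):
            pos = start + i
            # "First" policy: a position that is already filled keeps its data.
            if pos >= self._next and pos not in self._pending:
                self._pending[pos] = byte
        self._end = max(self._end, start + len(data))
        self._deliver()

    def _deliver(self):
        out = bytearray()
        while self._next in self._pending:
            # Popping models auto-trimming: delivered data is released.
            out.append(self._pending.pop(self._next))
            self._next += 1
        if out:
            for unit in self._units:
                unit(bytes(out))

received = []
sink = SinkModel()
sink.connect(received.append)
sink.write(b"world", seq=5)   # arrives early, buffered
sink.write(b"WORLD", seq=5)   # overlaps; the first chunk wins
sink.write(b"hello", seq=0)   # fills the hole, everything is delivered
sink.write(b"!")              # no seq: appended at the current end
```

Nothing reaches the unit until the hole at positions 0 to 4 is filled, after which the buffered data is delivered in stream order.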

2.2.3.14. Time

[TODO: Overview]

Operators

cast<double>(time)

Casts a time into a double.

cast<int>(time)

Casts a time into an integer value, truncating any fractional value.

time coerces to bool

Times coerce to boolean, returning true if the value is non-zero.

time == time

Compares two times.

time > time

Returns whether the first time is larger than the second.

time < time

Returns whether the first time is smaller than the second.

time - time

Subtracts two times.

time - interval

Subtracts an interval from a time.

time + interval

Adds an interval to a time.

interval + time

Adds an interval to a time.

Methods

nsecs()

Returns the time as nanoseconds.

2.2.3.15. Tuple

[TODO: Overview]

Operators

tuple coerces to tuple

Tuples coerce to other tuples if all their elements coerce individually.

tuple == tuple

Compares two tuples for equality.

tuple[int]

Returns the tuple element at a given index.

Methods

None defined.

2.2.3.16. Unit

[TODO: Overview]

Operators

unit.<attr>

Accesses a unit field.

unit.<attr>=any

Assigns a value to a unit field.

unit?.<attr>

Returns true if a unit field is set.

new type

Instantiates a new parse object for a given unit type.

unit.?<attr>

Returns the value of a unit field if it’s set; otherwise throws a Spicy::AttributeNotSet exception.

Methods

add_filter(f: enum)

Adds an input filter of type Spicy::Filter to the unit object. The filter receives all input first and transforms it according to its semantics; the unit then parses the output of the filter. Multiple filters can be added to a parsing unit, in which case they are chained into a pipeline and the data is passed through them in the order in which they were added. The actual unit parsing is then carried out on the output of the last filter in the chain. Note that filters must be added before the first data chunk is passed in. If parsing has already started when a filter is added, behaviour is undefined. Also note that filters can only be added to exported unit types. Currently, only a set of predefined filters can be used; see Spicy::Filter. One cannot define custom filters in Spicy (but one can achieve a similar effect with sinks). Todo: We should probably either enable adding filters later, or catch the case of adding them too late at run-time and abort with an exception.

backtrack()

Aborts parsing at the current position and returns to the most recent &try attribute. Turns into a parse error if there’s no &try.

confirm()

Aborts parsing at the current position and returns to the most recent &try attribute. Turns into a parse error if there’s no &try.

disable(msg: string)

Aborts parsing at the current position and returns to the most recent &try attribute. Turns into a parse error if there’s no &try.

disconnect()

Disconnects the unit from its parent sink. The unit gets signaled a regular end of data, so if it still has input pending, that might be processed before the method returns. If the unit is not connected to a sink, the method does not have any effect.

input()

Returns an iter<bytes> referencing the first byte of the raw data for parsing the unit. This method must only be called while the unit is being parsed, and will throw an UndefinedValue exception otherwise. Note that using this method requires the unit being parsed to fully buffer its input until finished. That may have a performance impact, in particular in terms of memory requirements, since the garbage collector may need to hold on to the input significantly longer.

mime_type()

Returns the MIME type that was specified when the unit was instantiated (e.g., via sink.connect_mime_type()). Returns an empty bytes object if none was specified. This method can only be called for exported types.

offset()

Returns a uint<64> offset of the current parsing position relative to the start of the current parsing unit. This method must only be called while the unit is being parsed, and will throw an UndefinedValue exception otherwise. Note that inside a field hook, the current parsing position will have already moved on to the start of the next field, because the hook is only run after the current field has been fully parsed. On the other hand, if the method is called from an expression evaluated before the parsing of a field starts (such as in a field’s &length attribute), the returned offset will reflect the beginning of that field. Note that using this method requires the unit being parsed to fully buffer its input until finished. That may have a performance impact, in particular in terms of memory requirements, since the garbage collector may need to hold on to the input significantly longer.
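The timing rule for offset() is easy to misread, so here is a toy Python model of it. It is not Spicy code. The parse_fields function, the fixed-length field list and the hook signature are hypothetical, and the model only shows when the position advances relative to the hook.

```python
def parse_fields(data: bytes, fields, hook):
    """Parse fixed-length fields; call hook(name, value, before, after) per field."""
    offset = 0
    for name, length in fields:
        before = offset                       # what a &length expression would see
        value = data[offset:offset + length]
        offset += length                      # position moves on first ...
        hook(name, value, before, offset)     # ... and only then the hook runs

seen = []
parse_fields(
    b"SPCYabc",
    [("magic", 4), ("body", 3)],
    lambda name, value, before, after: seen.append((name, before, after)),
)
```

In the recorded tuples, the second number is the field's own start and the third is what a hook for that field would observe, which is already the start of the following field.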

set_position(b: iterator<bytes>)

Changes the position in the input stream to continue parsing from. The new position is an iter<bytes> from which subsequent parsing will proceed. Note that this changes the position globally: all subsequent fields will be parsed from the new position, including those of a potential higher-level unit this unit is part of. Returns an iter<bytes> with the old position. Note that using this method requires the unit being parsed to fully buffer its input until finished. That may have a performance impact, in particular in terms of memory requirements, since the garbage collector may need to hold on to the input significantly longer.

2.2.3.17. Vector

[TODO: Overview]

Operators

vector[int]

Returns the vector element at a given index.

vector[int]=any

Assigns an element to the given index of the vector.

|vector|

Returns the length of the vector.

Methods

push_back(elem: any)

Appends an element to the vector.

reserve(c: int)

Resizes the vector to reserve a capacity of at least c. This shrinks the vector if c is smaller than the current size.