10. Grammar

Haka allows the user to describe a full protocol grammar. This section describes all constructs available to build rules for the grammar.

haka.grammar.new(name, descr) → grammar
Parameters:
  • name (string) – Name of the grammar.
  • descr (function) – Function that describes the grammar rules.
Returns:
  • grammar (Grammar) – Created grammar object.

Create a new grammar. The name is mainly used to report detailed information to the user.

The descr parameter is a function that describes the grammar rules. Its environment gives access to all grammar primitives listed on this page. The functions and variables available in this scope are shown with the prefix grammar.

The grammar description must export the elements that will be used to parse data. The export() function declares which elements are available for parsing. Only export root elements that will serve as entry points for parsing.
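
As a minimal sketch of how these pieces fit together, the snippet below builds a grammar with a single exported rule and then looks the rule up by name. The rule name header and its fields are hypothetical, and the lookup assumes that an exported rule is named after the variable it was assigned to in the description function. The guard on haka only keeps the snippet loadable outside of a Haka runtime.

```lua
-- Description function: Haka runs it in an environment that exposes
-- the grammar primitives (record, field, number, export, ...).
local function describe()
    -- 'header' is a hypothetical rule made up for this sketch.
    header = record{
        field("version", number(8)),
        field("length", number(16))
    }
    -- Only the root element used as a parsing entry point is exported.
    export(header)
end

-- Build the grammar only when running inside Haka.
if haka then
    local example = haka.grammar.new("example", describe)
    -- Assumption: the exported rule is reachable under the name 'header'.
    local header_rule = example.header
end
```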

grammar extend(...)
Parameters:
  • ... (Grammar) – List of grammars to extend.

Extend your grammar with some other grammar. It will import named rules and exports from the grammar you extend. You can override any of the imported named rules afterwards.

grammar define(...)
Parameters:
  • ... (string) – List of elements to be defined.

Declare some grammar elements so that they can be used recursively afterwards.

grammar export(...)
Parameters:

Export some grammar elements to be able to use them for parsing.

object Grammar

Object holding grammar rules.

<Grammar>[name] → entity
Parameters:
  • name (string) – Rule name.
Returns:

Get a rule for parsing. This rule must have been exported using the function export().

10.1. Elements

A grammar in Haka is made of various elements that are described in this section. Each one has different properties.

object GrammarEntity

Object representing any grammar element. This object has different functions that can be used in the grammar specification.

Some options are available to alter the grammar element. The list of options is specified by each grammar element. In addition, the following options are generic and supported by all elements:

<GrammarEntity>:validate(validator) → entity
Parameters:
  • validator (function) – Validator function.
Returns:

Add a validation function for the element. This function is called when a field is marked invalid by setting it to nil.

validator(result)
Parameters:
  • result – Current parsing result.
  • context – Full parsing context.

<GrammarEntity>:convert(converter, memoize=false) → entity
Parameters:
Returns:

Set a conversion operation to apply to the element data.

<GrammarEntity>:memoize() → entity
Returns:

Activate memoization of the value.

<GrammarEntity>:apply(func) → entity
Parameters:
  • func (function) – Apply function.
Returns:

Attach an apply function to the grammar element. It is possible to call apply multiple times to execute more than one function. This function will be called when the value is parsed:

func(value, result, context)
Parameters:
  • value – Current element value.
  • result – Current parsing result.
  • context (ParseContext) – Full parsing context.

<GrammarEntity>:const(value) → entity
Parameters:
  • value (any or function) – Constant value or function to compute one.
Returns:

Mark this element as constant. Its value will be verified during parsing. The value can be either the value to compare to or a function that computes it. In the latter case, the function has the following signature:

value(result, context) → value
Parameters:
  • result – Current parsing result.
  • context (ParseContext) – Full parsing context.
Returns:
  • value (any) – Constant value to compare to.
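
The generic options can be chained directly on an element. The sketch below shows an apply callback and a computed const value. All field names are hypothetical, and it assumes that result is the result of the enclosing record, so that previously parsed fields are visible to the callbacks.

```lua
-- Callback for apply(): note whether the parsed version is supported.
-- 'supported' is a hypothetical result field used only in this sketch.
local function check_version(value, result, context)
    result.supported = (value == 1 or value == 2)
end

-- Callback for const(): compute the expected value from earlier fields.
local function expected_total(result, context)
    return result.header_size + result.payload_size
end

-- Build the grammar only when running inside Haka.
if haka then
    haka.grammar.new("options_example", function ()
        message = record{
            field("magic", number(16):const(0xCAFE)),
            field("version", number(8):apply(check_version)),
            field("header_size", number(8)),
            field("payload_size", number(8)),
            field("total", number(8):const(expected_total))
        }
        export(message)
    end)
end
```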

10.1.1. Final elements

grammar number(bits, endian='big') → entity
Parameters:
  • bits (number) – Size of the number in bits.
  • endian (string) – Endianness of the raw data: little or big.
Returns:

Usage:

number(8)

Parse a binary number.

grammar token(pattern) → entity
Parameters:
  • pattern (string) – Regular expression pattern for the token.
Returns:

Match a regular expression on the data and return a string.

Note

The regular expression will be surrounded by a non-capturing group: "^(?:"...")".

Usage:

token('%s+')

grammar raw_token(pattern) → entity
Parameters:
  • pattern (string) – Regular expression pattern for the token.
Returns:

Match a regular expression on the data and return the bytes matched.

Note

The regular expression will be surrounded by a non-capturing group: "^(?:"...")".

grammar flag
Type:GrammarEntity

Parse a flag of 1 bit and returns it as a boolean.

grammar bytes() → entity
Returns:

Parse a block of data.

Supported options:

<GrammarElement>:count(count) → entity
Parameters:
  • count (number or function) – Number of bytes.
Returns:

Specify the number of elements in the bytes array.

The count parameter can be a number or a function called to compute the number of elements:

count(result, context) → count
Parameters:
  • result – Current parsing result.
  • context (ParseContext) – Full parsing context.
Returns:
  • count (number) – Number of bytes.

<GrammarElement>:chunked(callback) → entity
Parameters:
  • callback (function) – Callback function.
Returns:

This option delivers each block of data to a callback function as soon as it is received:

callback(result, sub, islast, context)
Parameters:
  • result – Current parsing result.
  • sub – Current data block.
  • islast – True if this data block is the last one.
  • context (ParseContext) – Full parsing context.
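
The two bytes() options can be sketched as follows. The field names are hypothetical, and the chunked callback assumes that the caller passed a table as the user object when starting the parsing.

```lua
-- count(): read the payload size from a previously parsed field.
-- 'length' is a hypothetical field name.
local function payload_size(result, context)
    return result.length
end

-- chunked(): called once for each block of data as it is received.
local function on_chunk(result, sub, islast, context)
    local stats = context.user  -- user object, may be nil
    if stats then
        stats.chunks = (stats.chunks or 0) + 1
        stats.done = islast
    end
end

-- Build the grammar only when running inside Haka.
if haka then
    haka.grammar.new("bytes_example", function ()
        packet = record{
            field("length", number(16)),
            field("payload", bytes():count(payload_size)),
            -- Remaining data is handed to the callback block by block.
            field("trailer", bytes():chunked(on_chunk))
        }
        export(packet)
    end)
end
```
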
grammar padding_align(align) → entity
Returns:

Parse some padding. The padding is used to reach the given bit alignment.

grammar field(name, entity) → entity
Parameters:
  • name (string) – Name of the field in the result.
  • entity (grammar entity) – Entity to name.
Returns:

Create a named entity. Naming an entity makes its data accessible in the result, for instance from a security rule.

Usage:

field("WS", token('%s+'))

grammar verify(verif, msg) → entity
Parameters:
  • verif (function) – Verification function.
  • msg (string) – Error message to report.
Returns:

Verify some property during the parsing. If verif returns false, then an error is reported with msg.

verif(result, context) → is_valid
Parameters:
  • result – Current parsing result.
  • context (ParseContext) – Full parsing context.
Returns:
  • is_valid (boolean) – False if the verification fails.
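
A verification is typically placed right after the fields it depends on. In this sketch the field name length and the minimum size of 4 are assumptions chosen for illustration.

```lua
-- Verification callback: the declared length must cover a 4-byte header.
local function length_is_sane(result, context)
    return result.length >= 4
end

-- Build the grammar only when running inside Haka.
if haka then
    haka.grammar.new("verify_example", function ()
        packet = record{
            field("length", number(16)),
            verify(length_is_sane, "invalid length: must be at least 4"),
            field("payload", bytes())
        }
        export(packet)
    end)
end
```
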
grammar execute(exec) → entity
Parameters:
  • exec (function) – Generic function.
Returns:

Execute a generic function during the parsing. This allows deep customization of the parsing using regular Lua functions.

exec(result, context)
Parameters:
  • result – Current parsing result.
  • context (ParseContext) – Full parsing context.

grammar fail(msg) → entity
Parameters:
  • msg (string) – Error message.
Returns:

Always fail the parsing when reaching this element.

10.1.2. Compounds

grammar record(entities) → entity
Parameters:
  • entities (table of grammar entities) – List of entities for the record.
Returns:

Create a record for a list of sub entities. Each entity is expected to appear one by one in order.

When working on a stream, the data behind the elements is kept, which allows transparent access and modification.

Supported options:

<GrammarEntity>:extra{...} → entity
Returns:

Each named element in the table will be added as an extra field in the result. The table should only contain functions.

The functions are called with the following parameters:

extra(result, context)
Parameters:
  • result – Current parsing result.
  • context (ParseContext) – Full parsing context.

Usage:

record{
    field('type', number(8)),
    bytes()
}

grammar sequence(entities) → entity
Parameters:
  • entities (table of grammar entities) – List of entities for the sequence.
Returns:

Create a sequence for a list of sub entities. Each entity is expected to appear one by one in order.

This element is similar to record(), but the data in a stream will immediately be sent on the network.

Usage:

sequence{
    number(8),
    bytes()
}

grammar union(entities) → entity
Parameters:
  • entities (table of grammar entities) – List of entities for the union.
Returns:

Create a union for a list of sub entities. Each entity will be parsed from the beginning of the union.

haka.grammar.try(cases) → entity
Parameters:
  • cases (table of grammar entities) – List of grammar entities to try.
Returns:

The parser tries each case in order until one of them finishes successfully.

grammar branch(cases, selector) → entity
Parameters:
  • cases (Associative table of named grammar entities) – Branch cases.
  • selector (function) – Function that will select which case to take.
Returns:

Create a branch. The case to take will be given by the selector function:

selector(result, context) → case
Parameters:
  • result – Current parsing result.
  • context (ParseContext) – Full parsing context.
Returns:
  • case – The key of the case to select.

A special case named default is used as the default branch when no matching case is found. If default is set to the string 'continue', parsing continues past the branch when no valid case is found. If default is not set by the user, a parsing error is raised.

Usage:

branch({
        num8  = number(8),
        num16 = number(16),
    }, function (result, context)
        return result.type
    end
)
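
The default case can be sketched like this. The type codes and case names are hypothetical. The selector returns a key that is absent from the cases table for unknown types, so the default case applies and, being set to 'continue', lets the parsing go on.

```lua
-- Hypothetical mapping from a numeric type code to a case name.
local kinds = { [1] = "short", [2] = "long" }

-- Selector: returns the key of the case to take.
local function select_by_type(result, context)
    -- Any key not present in the cases table selects the default case.
    return kinds[result.type] or "unknown"
end

-- Build the grammar only when running inside Haka.
if haka then
    haka.grammar.new("branch_example", function ()
        message = record{
            field("type", number(8)),
            branch({
                short = field("value", number(8)),
                long = field("value", number(16)),
                default = 'continue'
            }, select_by_type)
        }
        export(message)
    end)
end
```
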
grammar optional(entity, present) → entity
Parameters:
  • entity (grammar entity) – Optional grammar entity.
  • present (function) – Function that will select if the entity should be present.
Returns:

Create an optional entity. This element exists if the present function returns true.

present(result, context) → is_present
Parameters:
  • result – Current parsing result.
  • context (ParseContext) – Full parsing context.
Returns:
  • is_present (boolean) – True if the element exists.
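
An optional entity is usually driven by a flag parsed earlier in the same record. The field names below are hypothetical.

```lua
-- Presence callback: the options block exists only if the flag is set.
local function has_options(result, context)
    return result.has_options == true
end

-- Build the grammar only when running inside Haka.
if haka then
    haka.grammar.new("optional_example", function ()
        header = record{
            field("has_options", flag),
            field("reserved", number(7)),
            optional(field("options", number(16)), has_options)
        }
        export(header)
    end)
end
```
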
grammar array(entity) → entity
Parameters:
  • entity (grammar entity) – Entity representing an element of the array.
Returns:

Create an array of a given entity.

Supported options:

<GrammarElement>:count(count) → entity
Parameters:
  • count (number or function) – Number of elements.
Returns:

Specify the number of elements in the array.

The count parameter can be a number or a function called to compute the number of elements:

count(result, context) → count
Parameters:
  • result – Current parsing result.
  • context (ParseContext) – Full parsing context.
Returns:
  • count (number) – Number of elements.

<GrammarElement>:untilcond(check) → entity
Parameters:
  • check (function) – Function to verify a property.
Returns:

check(elem, context) → should_stop
Parameters:
  • elem – Current element of the array.
  • context (ParseContext) – Full parsing context.
Returns:
  • should_stop (boolean) – True when the end of the array is reached.

The check function is evaluated before every element and should return true when the last element has been reached. When called before the first element, the elem parameter is nil.

<GrammarElement>:whilecond(check) → entity
Parameters:
  • check (function) – Function to verify a property.
Returns:

check(elem, context) → should_continue
Parameters:
  • elem – Current element of the array.
  • context (ParseContext) – Full parsing context.
Returns:
  • should_continue (boolean) – False when the end of the array is reached.

The check function is evaluated before every element and should return true if more elements are expected. When called before the first element, the elem parameter is nil.

Usage:

array(number(8))
    :count(10)
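
When the number of elements is not known in advance, untilcond can stop on a terminator instead. This sketch assumes that elem is the result of the previously parsed record, and the field name kind and the terminator value 0 are hypothetical. The nil check covers the call made before the first element.

```lua
-- Stop condition: true once the previously parsed item is the terminator.
local function is_last(elem, context)
    -- elem is nil on the first call, before any element has been parsed.
    return elem ~= nil and elem.kind == 0
end

-- Build the grammar only when running inside Haka.
if haka then
    haka.grammar.new("array_example", function ()
        item = record{
            field("kind", number(8)),
            field("value", number(8))
        }
        list = record{
            field("items", array(item):untilcond(is_last))
        }
        export(list)
    end)
end
```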

10.2. Converters

object Converter

A converter applies some processing to a parsing result value.

<Converter>.get(val)

Compute the converted value from the raw data. This happens when the user tries to get the value of a field for instance.

<Converter>.set(val)

Compute the converted value to store in the raw data. This happens when the user modifies the value of one of the fields.

Note

If set is nil, the field will be read-only.

Usage:

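local my_converter = {
    -- Read-side conversion: replace every "/" with ".".
    -- The parentheses keep only the string and drop gsub's match count.
    get = function (val)
        return (val:gsub("/", "."))
    end,
    -- No setter: the field is read-only.
    set = nil
}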
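
For a writable field, both functions are provided and should be inverses of each other. The following is a sketch of a hypothetical converter for a raw value that stores a year as an offset from 1900.

```lua
-- Hypothetical converter: the raw data holds (year - 1900).
local year_converter = {
    -- Raw value to user-visible value.
    get = function (raw)
        return raw + 1900
    end,
    -- User-visible value back to the raw value to store.
    set = function (val)
        return val - 1900
    end
}
```

Such a table would be attached to an element with the convert option, for instance number(8):convert(year_converter).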

10.2.1. Predefined converters

haka.grammar.converter.mult(val) → converter
Parameters:
  • val (number) – Multiple to apply to the raw value.
Returns:

Create a converter that will apply a multiplication to the raw data.

haka.grammar.converter.bool
Type:Converter 

Convert the raw value into a boolean.

haka.grammar.converter.tonumber(format, base) → converter
Parameters:
  • format (string) – String format to use when converting from number to string.
  • base (number) – Base to use for the conversion.
Returns:

Convert a raw string value into a number.
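
The predefined converters are passed to the convert option like any other converter. The sketch below is a grammar fragment that only runs inside Haka. The rule and field names are hypothetical, and the format "%d" with base 10 is one plausible choice for a decimal token.

```lua
-- Build the grammar only when running inside Haka.
if haka then
    -- Local alias, captured as an upvalue by the description function.
    local converter = haka.grammar.converter

    haka.grammar.new("converter_example", function ()
        header = record{
            -- Raw value counts 32-bit words; expose it in bytes.
            field("header_len", number(8):convert(converter.mult(4))),
            -- Raw number exposed as a boolean.
            field("urgent", number(8):convert(converter.bool)),
            -- Decimal digits exposed as a number.
            field("status", token('[0-9]+'):convert(converter.tonumber("%d", 10)))
        }
        export(header)
    end)
end
```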

10.3. Compiled grammar

object CompiledGrammarEntity

Compiled grammar representation.

<CompiledGrammarEntity>:parse(iter, result=nil, user=nil) → result, error
Parameters:
  • iter (vbuffer_iterator) – Data iterator.
  • result (abstract table) – Object where the parsing result will be stored. If nil, a generic result object will be created.
  • user (table) – User object that will be stored in the parsing context.
Returns:
  • result (table for the result) – The result of the parsing.
  • error (ParseError) – An error if needed.

Parse the data and store all results in the object returned by the function. In case of error, the error description is also returned.
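
A call to parse is usually followed by a check of the second return value. The helper below is a hypothetical wrapper, not part of Haka. It assumes that error is nil when the parsing succeeds and only uses the rule and description fields of ParseError.

```lua
-- Hypothetical helper: parse with an exported rule and turn a ParseError
-- into a readable message.
local function parse_or_report(rule, iter, user)
    -- Passing nil as result lets parse() create a generic result object.
    local result, err = rule:parse(iter, nil, user)
    if err then
        return nil, string.format("parse error in rule '%s': %s",
            tostring(err.rule), tostring(err.description))
    end
    return result
end
```

Here rule would be an exported rule obtained from a grammar object and iter a vbuffer_iterator positioned at the start of the data.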

<CompiledGrammarEntity>:create(iter, result=nil, init={}) → result, error
Parameters:
  • iter (vbuffer_iterator) – Data iterator.
  • result (abstract table) – Object where the parsing result will be stored. If nil, a generic result object will be created.
  • init (table) – Optional initialization table.
Returns:
  • result (table for the result) – The result of the parsing.
  • error (ParseError) – An error if needed.

Initialize the data from an initialization table and return the parsing result. In case of error, the error description is also returned.

10.4. Parsing error

object ParseError

Parsing error description.

<ParseError>.iterator
Type:vbuffer_iterator 

Iterator at the position where the parsing error occurred.

<ParseError>.rule
Type:string

Name of the rule where the error occurred.

<ParseError>.description
Type:string

Full description of the parsing error.

10.5. Parsing context

object ParseContext

Parsing context used in all parsing related functions.

<ParseContext>:result(index)
Parameters:
  • index (number) – Index of the result in the stack.

Get a parsing result from the stack of results. This stack holds all results created during the parsing for records, arrays...

The index can be a normal index (i.e. 1 being the top-level result) or a pseudo index when it is negative. In this case the return value is the result at the position starting from the last element. For instance, -1 is the last result and -2 is the last but one result.
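
The indexing convention can be illustrated in plain Lua. This is only a sketch of the rule described above applied to an ordinary table, not Haka's implementation, and the stack contents are made up.

```lua
-- Resolve a normal or pseudo (negative) index against a stack of results.
local function resolve(stack, index)
    if index < 0 then
        -- -1 maps to the last entry, -2 to the one before it, and so on.
        return stack[#stack + index + 1]
    end
    return stack[index]
end

local stack = { "top-level record", "array", "current element" }

local first = resolve(stack, 1)       -- "top-level record"
local last = resolve(stack, -1)       -- "current element"
local before_last = resolve(stack, -2) -- "array"
```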

<ParseContext>.user

User object.

<ParseContext>:lookahead() → byte
Returns:
  • byte (number) – Next byte.

Return the next byte. This function can be used to resolve grammar ambiguity.

10.6. Example

This is an example of a very simple grammar expressed in Haka:

local grammar = haka.grammar.new("example", function ()
    elem = record{
        field("A", number(32)),
        field("B", number(32))
    }

    block = record{
        field("count", number(32)),
        field("list", array(elem)
            :count(function (self)
                return self.count
            end)
        )
    }

    export(block)
end)