PXF: Proto eXpressive Format
Concrete syntax.
PXF is the human-readable text format in the ProtoWire family. The grammar below is written in ISO/IEC 14977 EBNF and matches the canonical reference parser. Whitespace and comments are insignificant between tokens; comments may appear wherever whitespace may appear.
Document
A PXF document is an optional @type directive followed by zero or more entries.
The directive pins the document to a fully-qualified message type; parsers refuse a document
whose entries do not match that type.
document = [ type_directive ] , { entry } ;
type_directive = '@type' , identifier ;
Entries
Every entry starts with a key. Three operators distinguish the three shapes:
'=' assigns a scalar or list, ':' binds a map entry, and a bare
'{' … '}' opens a nested message block.
entry = key , ( assignment_tail | map_tail | block_tail ) ;
assignment_tail = '=' , value ;
map_tail = ':' , value ;
block_tail = '{' , { entry } , '}' ;
key = identifier | string | integer ;
Values
Values are scalars, lists, or block values. Lists accept comma- or newline-separated elements and may freely mix the two; the comma is consumed if present.
value = string
| integer
| float
| bool
| null
| bytes
| timestamp
| duration
| identifier
| list
| block_value ;
list = '[' , [ value , { [ ',' ] , value } ] , ']' ;
block_value = '{' , { entry } , '}' ;
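The optional-comma rule in the list production can be sketched as a small recursive-descent parser. The Python below is an illustration only, not the reference parser: it handles just integers and nested lists, and the names `tokenize`, `parse_value`, and `parse_list` are hypothetical. It follows the production literally, so a comma must be followed by another value and a trailing comma is rejected.

```python
import re

# Simplified token set for illustration: brackets, commas, decimal integers.
TOKEN = re.compile(r"\[|\]|,|-?[0-9]+")

def tokenize(text):
    """Split text into tokens, skipping whitespace (newlines included)."""
    pos, tokens = 0, []
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(m.group())
        pos = m.end()
    return tokens

def parse_value(tokens, i):
    """Parse one value (an integer or a nested list) starting at tokens[i]."""
    if i >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[i]
    if tok == "[":
        return parse_list(tokens, i)
    if tok in ("]", ","):
        raise ValueError(f"expected a value, got {tok!r}")
    return int(tok), i + 1

def parse_list(tokens, i=0):
    """list = '[' , [ value , { [ ',' ] , value } ] , ']' ;"""
    if i >= len(tokens) or tokens[i] != "[":
        raise ValueError("expected '['")
    i += 1
    items = []
    while i < len(tokens) and tokens[i] != "]":
        if items and tokens[i] == ",":
            i += 1  # the comma is consumed if present
        value, i = parse_value(tokens, i)
        items.append(value)
    if i >= len(tokens):
        raise ValueError("unterminated list")
    return items, i + 1  # step past the closing ']'
```

Calling `parse_list(tokenize("[1, 2\n3 [4 5]]"))` returns the parsed items together with the index of the token after the closing bracket, so commas and newlines can be mixed within one list.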
Identifiers
Identifiers carry enum values, message-type names, and bare keys. They begin with a letter or
underscore and may contain dots, which is useful for fully-qualified type names like
infra.v1.ServerConfig.
identifier = ident_start , { ident_part } ;
ident_start = letter | '_' ;
ident_part = letter | digit | '_' | '.' ;
bool = 'true' | 'false' ;
null = 'null' ;
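As a rough illustration, the identifier production can be approximated with a regular expression. This Python sketch assumes `letter` means ASCII A-Z and a-z and `digit` means 0-9; the productions above do not define those helper classes, so the character ranges are an assumption, and the function name `matches_identifier` is hypothetical.

```python
import re

# Assumption: letter = ASCII letters, digit = 0-9.
# ident_start = letter | '_' ; ident_part = letter | digit | '_' | '.'
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")

def matches_identifier(text: str) -> bool:
    """Return True if the whole string matches the identifier production."""
    return IDENTIFIER.fullmatch(text) is not None
```

Note that `true`, `false`, and `null` also match this pattern; a lexer would need to check for those words before treating a token as an identifier.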
Numbers
Decimal integers and IEEE-754 floats. Floats accept either a decimal point with optional exponent, or an exponent alone.
integer = [ '-' ] , digit , { digit } ;
float = [ '-' ] , digit , { digit } ,
( '.' , { digit } , [ exponent ]
| exponent ) ;
exponent = ( 'e' | 'E' ) , [ '+' | '-' ] , digit , { digit } ;
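The two number productions translate fairly directly into regular expressions. The Python sketch below is illustrative; `classify_number` is a hypothetical helper, and `digit` is assumed to mean 0-9. It tries the float pattern first because every float begins with a valid integer prefix.

```python
import re

# integer = [ '-' ] , digit , { digit } ;
INTEGER = re.compile(r"-?[0-9]+")

# float = [ '-' ] , digit , { digit } , ( '.' , { digit } , [ exponent ] | exponent ) ;
FLOAT = re.compile(r"-?[0-9]+(?:\.[0-9]*(?:[eE][+-]?[0-9]+)?|[eE][+-]?[0-9]+)")

def classify_number(text: str):
    """Return 'float', 'integer', or None for a complete token."""
    if FLOAT.fullmatch(text):
        return "float"
    if INTEGER.fullmatch(text):
        return "integer"
    return None
```

Under these patterns `1.` and `1e5` are floats, while `.5` and `+1` match neither production, because a leading digit is required and only a minus sign is allowed.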
Timestamps & durations
The lexer recognizes a four-digit year followed by - as an RFC 3339 timestamp,
and a digit run followed by a time unit as a Go-style duration. Negative integers and identifiers
that begin with a letter take precedence over those forms.
(* RFC 3339 date-time. e.g. 2024-01-15T10:30:00Z, 2024-01-15T10:30:00.123456789+02:00 *)
timestamp = ?RFC 3339 date-time? ;
(* Go time.ParseDuration. e.g. 30s, 1h30m, 500ms, 1.5h *)
duration = duration_segment , { duration_segment } ;
duration_segment = digit , { digit } ,
[ '.' , digit , { digit } ] , time_unit ;
time_unit = 'ns' | 'us' | 'µs' | 'ms' | 's' | 'm' | 'h' ;
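The dispatch rule described above (a four-digit year followed by `-` means timestamp, digit runs with time units mean duration) might be sketched as follows. This Python is an assumption-laden illustration, not the reference lexer: `classify_token` is a hypothetical name, the timestamp check only inspects the prefix and leaves full RFC 3339 validation to a later step, and the micro unit accepts both U+00B5 and U+03BC as Go's `time.ParseDuration` does.

```python
import re

# Four-digit year followed by '-': hand the token to an RFC 3339 parser.
TIMESTAMP_START = re.compile(r"[0-9]{4}-")

# duration = duration_segment , { duration_segment } ;
# duration_segment = digit , { digit } , [ '.' , digit , { digit } ] , time_unit ;
DURATION = re.compile(
    r"(?:[0-9]+(?:\.[0-9]+)?(?:ns|us|[\u00b5\u03bc]s|ms|s|m|h))+"
)

def classify_token(token: str) -> str:
    """Return 'timestamp', 'duration', or 'other' for a complete token."""
    if TIMESTAMP_START.match(token):
        return "timestamp"
    if DURATION.fullmatch(token):
        return "duration"
    return "other"
```

A token such as `-5` starts with a minus sign and `h1` starts with a letter, so neither reaches the timestamp or duration branches, which is consistent with the precedence rule stated above.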
Strings
Double-quoted simple strings honor C-style escapes plus 2-digit hex, 3-digit octal, and 4/8-digit Unicode escapes. Triple-quoted strings preserve raw content with no escape interpretation; the leading newline is stripped, and the closing line's indent is removed from each preceding line.
string = simple_string | triple_string ;
simple_string = '"' , { string_char | escape_seq } , '"' ;
triple_string = '"""' , ?any text not containing """? , '"""' ;
escape_seq = '\' , ( simple_escape
| hex_escape
| octal_escape
| unicode_4_escape
| unicode_8_escape ) ;
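The triple-quoted indentation rule can be sketched as below. This Python is a minimal illustration with assumptions called out: `dedent_triple` is a hypothetical helper that receives the raw text between the delimiters, line endings are assumed to be LF, and the whitespace-only closing line is dropped together with the newline before it, a detail the description above does not settle.

```python
def dedent_triple(body: str) -> str:
    """Apply the triple-quoted string rules to the raw text between the quotes."""
    # The leading newline is stripped.
    if body.startswith("\n"):
        body = body[1:]
    lines = body.split("\n")
    closing = lines[-1]
    # Assumption: only a whitespace-only closing line defines an indent.
    if closing.strip() != "":
        return body
    indent = closing
    # Remove the closing line's indent from each preceding line that has it.
    stripped = [
        line[len(indent):] if line.startswith(indent) else line
        for line in lines[:-1]
    ]
    return "\n".join(stripped)
```

For example, a body of `"\n    hello\n      world\n    "` yields `hello` and `world` on two lines with the second keeping its two extra spaces. No escape processing happens at any point, so a backslash followed by `n` stays as two characters.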
Bytes
Byte literals carry standard or raw base64 with optional padding. Backslashes inside b"…"
are not interpreted.
bytes = 'b' , '"' , { base64_char } , '"' ;
base64_char = letter | digit | '+' | '/' | '=' ;
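Decoding a byte literal with optional padding might look like the following Python sketch. It reads "raw" as unpadded base64 over the standard alphabet (an assumption borrowed from Go's naming), and `decode_bytes_literal` is a hypothetical helper. It normalizes by dropping any trailing `=` and re-padding to a multiple of four before decoding; the production above permits `=` anywhere, while this sketch only tolerates it at the end.

```python
import base64

def decode_bytes_literal(literal: str) -> bytes:
    """Decode a b"..." literal whose payload is padded or unpadded base64."""
    if len(literal) < 3 or not literal.startswith('b"') or not literal.endswith('"'):
        raise ValueError("not a byte literal")
    payload = literal[2:-1].rstrip("=")
    # Restore padding so the length is a multiple of four.
    payload += "=" * (-len(payload) % 4)
    # validate=True rejects characters outside the standard alphabet.
    return base64.b64decode(payload, validate=True)
```

Both `b"aGVsbG8="` and `b"aGVsbG8"` decode to the same bytes under this approach.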
Comments
Three flavors, freely mixed. Block comments do not nest.
comment = line_comment | block_comment ;
line_comment = ( '#' | '//' ) , { ?any byte except LF? } ;
block_comment = '/*' , { ?any byte? } , '*/' ;
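Since comments may appear wherever whitespace may, a lexer can skip both in one loop between tokens. The Python below is a sketch under that assumption; `skip_trivia` is a hypothetical name, and the function is meant to be called only between tokens, so it does not need to guard against comment markers inside string literals.

```python
def skip_trivia(src: str, pos: int = 0) -> int:
    """Return the index of the next character that is not whitespace or comment."""
    n = len(src)
    while pos < n:
        if src[pos].isspace():
            pos += 1
        elif src[pos] == "#" or src.startswith("//", pos):
            # Line comment: runs up to and including the next LF.
            end = src.find("\n", pos)
            pos = n if end == -1 else end + 1
        elif src.startswith("/*", pos):
            # Block comment: ends at the first '*/', so comments do not nest.
            end = src.find("*/", pos + 2)
            if end == -1:
                raise ValueError("unterminated block comment")
            pos = end + 2
        else:
            break
    return pos
```

Because the block comment ends at the first `*/`, the input `/* a /* b */ c */` leaves `c */` as unconsumed text, which is what "do not nest" implies.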
Full railroad diagram
The full diagram covers every production above and a few lexical helpers (character classes, hex/octal digits).