Packet Encoding Reference#

This document provides a complete specification of all packet and data structure encodings used throughout the hs_bridge software and FPGA Verilog code. Understanding these formats is essential for debugging, extending the system, or implementing compatible software.

Table of Contents#

Host to FPGA Packets
FPGA to Host Packets
HBM Memory Structures
BRAM Memory Structures
PCIe Layer

Host to FPGA Packets#

These are 512-bit command packets created by fpga_compiler.py in hs_bridge and sent to the FPGA via DMA.

Command Packet Format (512 bits)#

┌─────────────────────────────────────────────────────────────┐
│                   512-bit Command Packet                    │
├───────────────┬─────────────────────────────────────────────┤
│ [511:504]     │ Opcode (8 bits)                             │
│ [503:496]     │ Core ID (8 bits)                            │
│ [495:0]       │ Payload (496 bits, opcode-specific)         │
└───────────────┴─────────────────────────────────────────────┘

Field Descriptions:

Opcode [511:504]: 8-bit operation type identifier
Core ID [503:496]: Which FPGA core to target (0-31, though typically only core 0 is used)
Payload [495:0]: Operation-specific data (format varies by opcode)
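
The three header fields can be packed and unpacked with plain integer shifts. The following is a minimal sketch based only on the bit ranges listed above; `build_command` and `split_command` are hypothetical helper names, not functions from hs_bridge.

```python
OPCODE_SHIFT = 504              # opcode occupies bits [511:504]
CORE_ID_SHIFT = 496             # core ID occupies bits [503:496]
PAYLOAD_MASK = (1 << 496) - 1   # payload occupies bits [495:0]


def build_command(opcode: int, core_id: int, payload: int = 0) -> int:
    """Pack opcode, core ID, and payload into one 512-bit integer."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= core_id <= 0xFF:
        raise ValueError("core_id must fit in 8 bits")
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload must fit in 496 bits")
    return (opcode << OPCODE_SHIFT) | (core_id << CORE_ID_SHIFT) | payload


def split_command(packet: int) -> tuple[int, int, int]:
    """Recover (opcode, core_id, payload) from a 512-bit integer."""
    opcode = (packet >> OPCODE_SHIFT) & 0xFF
    core_id = (packet >> CORE_ID_SHIFT) & 0xFF
    return opcode, core_id, packet & PAYLOAD_MASK


# Round trip: an EXECUTE command for core 0 with num_timesteps = 1
packet = build_command(0x01, 0, payload=1 << 480)
print(split_command(packet) == (0x01, 0, 1 << 480))
```

Range checks are included because Python integers are unbounded, so an oversized field would otherwise silently overwrite its neighbor.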

Opcode Definitions#

Opcode (hex)	Opcode (Verilog literal)	Name	Description
`0x00`	`8'h00`	INPUT_SPIKES	Inject external axon spikes into BRAM
`0x01`	`8'h01`	EXECUTE	Run one simulation timestep
`0x02`	`8'h02`	HBM_WRITE	Write data to HBM memory
`0x03`	`8'h03`	HBM_READ	Read data from HBM memory
`0x04`	`8'h04`	URAM_WRITE	Write neuron states to URAM
`0x05`	`8'h05`	URAM_READ	Read neuron states from URAM
`0x06`	`8'h06`	CONFIG_WRITE	Write configuration registers
`0x07`	`8'h07`	CONFIG_READ	Read configuration registers
`0xC8`	`8'hC8`	RESET	Reset FPGA state

Opcode 0x00: INPUT_SPIKES#

Injects external spike events (axon activations) into BRAM for processing.

Payload Format:

[495:480] = Axon ID (16 bits) - which axon is spiking
[479:464] = Spike time (16 bits) - future timestep (optional, usually 0 for immediate)
[463:0]   = Reserved (set to 0)

Example:

# Axon 42 fires at current timestep
opcode = 0x00
core_id = 0x00
axon_id = 42
spike_time = 0

packet = (opcode << 504) | (core_id << 496) | (axon_id << 480) | (spike_time << 464)

What happens:

FPGA’s command_interpreter decodes opcode 0x00
Extracts axon_id from payload
Writes to BRAM address corresponding to axon_id
Sets spike mask bit for this axon
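
As a sketch of the payload layout above, the helper below validates field widths before packing and shows how the receiving side could extract the same fields. `encode_input_spike` and `decode_input_spike` are illustrative names only; the real decoding happens in the FPGA's command_interpreter.

```python
def encode_input_spike(axon_id: int, spike_time: int = 0, core_id: int = 0) -> int:
    """Build an INPUT_SPIKES (opcode 0x00) packet as a 512-bit integer."""
    fields = (("axon_id", axon_id, 16), ("spike_time", spike_time, 16), ("core_id", core_id, 8))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} must fit in {bits} bits")
    return (0x00 << 504) | (core_id << 496) | (axon_id << 480) | (spike_time << 464)


def decode_input_spike(packet: int) -> dict:
    """Extract the INPUT_SPIKES fields from a 512-bit integer."""
    return {
        "opcode": (packet >> 504) & 0xFF,
        "core_id": (packet >> 496) & 0xFF,
        "axon_id": (packet >> 480) & 0xFFFF,
        "spike_time": (packet >> 464) & 0xFFFF,
    }


print(decode_input_spike(encode_input_spike(42)))
```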

Opcode 0x01: EXECUTE#

Runs one simulation timestep (processes all pending spikes, updates neurons, generates output spikes).

Payload Format:

[495:480] = Number of timesteps (16 bits) - typically 1
[479:0]   = Reserved (set to 0)

Example:

# Execute 1 timestep
opcode = 0x01
core_id = 0x00
num_timesteps = 1

packet = (opcode << 504) | (core_id << 496) | (num_timesteps << 480)

What happens:

FPGA triggers execute state machine
Processes all axon spikes (Phase 1: external_events_processor)
Processes all neuron spikes (Phase 2: internal_events_processor)
Increments execRun_ctr (timestep counter)
Returns spike packets to host via output FIFO
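
Since EXECUTE processes all pending spikes, one plausible host-side pattern is to queue the INPUT_SPIKES packets for a timestep and then append a single EXECUTE. The sketch below assumes that ordering; `build_timestep_commands` is a hypothetical helper and is not taken from fpga_compiler.py.

```python
def input_spike_packet(axon_id: int, core_id: int = 0) -> int:
    """INPUT_SPIKES (0x00) with spike_time left at 0 (immediate)."""
    return (0x00 << 504) | (core_id << 496) | (axon_id << 480)


def execute_packet(num_timesteps: int = 1, core_id: int = 0) -> int:
    """EXECUTE (0x01) with the timestep count in bits [495:480]."""
    return (0x01 << 504) | (core_id << 496) | (num_timesteps << 480)


def build_timestep_commands(axon_ids: list[int], core_id: int = 0) -> list[int]:
    """Assumed ordering: all input spikes first, then one EXECUTE."""
    commands = [input_spike_packet(axon_id, core_id) for axon_id in axon_ids]
    commands.append(execute_packet(1, core_id))
    return commands


commands = build_timestep_commands([3, 42, 17])
print([packet >> 504 for packet in commands])  # opcodes in send order
```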

Opcode 0x02: HBM_WRITE#

Writes data directly to HBM memory (for initializing network structure).

Payload Format:

[495:464] = HBM address (32 bits) - byte address in HBM
[463:432] = Length (32 bits) - number of bytes to write
[431:176] = Data (256 bits) - payload data (up to 32 bytes)
[175:0]   = Reserved

Example:

# Write synapse data to HBM
opcode = 0x02
core_id = 0x00
hbm_addr = 0x00008000  # Synapse region start
length = 32  # 32 bytes (8 synapses)
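data_words = [0x00100064, 0x00110064, ...]  # Synapse entries (up to eight 32-bit words)

# Pack the words into one 256-bit integer, word 0 in the least significant bits
data = 0
for i, word in enumerate(data_words):
    data |= word << (32 * i)

packet = (opcode << 504) | (core_id << 496) | (hbm_addr << 464) | (length << 432) | (data << 176)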

What happens:

command_interpreter routes to HBM write controller
Issues AXI write transaction to HBM
Writes data at specified address
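
The 256-bit data field has to be a single integer before it can be shifted into place. The sketch below packs a list of 32-bit words and derives the length field from the word count. It assumes word 0 sits in the least significant bits of the data field, mirroring the synapse row layout; `pack_words` and `encode_hbm_write` are hypothetical names.

```python
def pack_words(words: list[int]) -> int:
    """Pack up to eight 32-bit words; word 0 lands in bits [31:0] (assumed order)."""
    if len(words) > 8:
        raise ValueError("the data field holds at most 8 words (32 bytes)")
    value = 0
    for i, word in enumerate(words):
        value |= (word & 0xFFFFFFFF) << (32 * i)
    return value


def encode_hbm_write(hbm_addr: int, words: list[int], core_id: int = 0) -> int:
    """Build an HBM_WRITE (0x02) packet; length is given in bytes."""
    if not 0 <= hbm_addr <= 0xFFFFFFFF:
        raise ValueError("hbm_addr must fit in 32 bits")
    length = 4 * len(words)
    data = pack_words(words)
    return (0x02 << 504) | (core_id << 496) | (hbm_addr << 464) | (length << 432) | (data << 176)


packet = encode_hbm_write(0x00008000, [0x00100064, 0x00110064])
print(hex((packet >> 176) & 0xFFFFFFFF))  # first data word
```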

Opcode 0x04: URAM_WRITE#

Writes neuron state (membrane potential) to URAM.

Payload Format:

[495:480] = Neuron ID (16 bits) - which neuron to write
[479:444] = Voltage (36 bits) - membrane potential value (signed)
[443:0]   = Reserved

Example:

# Set neuron 100 voltage to 1000
opcode = 0x04
core_id = 0x00
neuron_id = 100
voltage = 1000  # 36-bit signed value

# Mask to 36 bits so that negative voltages are stored in two's complement
packet = (opcode << 504) | (core_id << 496) | (neuron_id << 480) | ((voltage & 0xFFFFFFFFF) << 444)

What happens:

command_interpreter routes to URAM write controller
Calculates URAM bank (neuron_id >> 13) and local address (neuron_id & 0x1FFF)
Performs read-modify-write to update only target neuron (2 neurons per URAM word)
Writes back updated 72-bit URAM word
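
The bank and address arithmetic and the signed 36-bit voltage field can be checked on the host side. This is a minimal sketch using the shifts quoted above; the helper names are hypothetical, and the position of each neuron inside the 72-bit URAM word is not modeled because it is not specified here.

```python
VOLTAGE_BITS = 36
VOLTAGE_MASK = (1 << VOLTAGE_BITS) - 1


def uram_location(neuron_id: int) -> tuple[int, int]:
    """Return (bank, local_address) as neuron_id >> 13 and neuron_id & 0x1FFF."""
    return neuron_id >> 13, neuron_id & 0x1FFF


def encode_voltage(voltage: int) -> int:
    """Convert a signed integer to its 36-bit two's complement field."""
    if not -(1 << 35) <= voltage < (1 << 35):
        raise ValueError("voltage does not fit in 36 signed bits")
    return voltage & VOLTAGE_MASK


def decode_voltage(raw: int) -> int:
    """Sign-extend a 36-bit field back to a Python integer."""
    raw &= VOLTAGE_MASK
    return raw - (1 << VOLTAGE_BITS) if raw & (1 << 35) else raw


def encode_uram_write(neuron_id: int, voltage: int, core_id: int = 0) -> int:
    """Build a URAM_WRITE (0x04) packet."""
    return (0x04 << 504) | (core_id << 496) | (neuron_id << 480) | (encode_voltage(voltage) << 444)


print(uram_location(8197), decode_voltage(encode_voltage(-250)))
```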

Opcode 0x06: CONFIG_WRITE#

Writes to configuration registers (threshold, leak parameters, etc.).

Payload Format:

[495:480] = Register address (16 bits)
[479:416] = Value (64 bits) - configuration value
[415:0]   = Reserved

Register Map:

Address	Name	Description
`0x0000`	THRESHOLD	Spike threshold (36 bits)
`0x0001`	LEAK_ENABLE	Enable voltage leak (1 bit)
`0x0002`	LEAK_SHIFT	Leak divisor (shift amount)
`0x0003`	RESET_VOLTAGE	Voltage after spike

Example:

# Set threshold to 2000
opcode = 0x06
core_id = 0x00
reg_addr = 0x0000  # THRESHOLD register
value = 2000

packet = (opcode << 504) | (core_id << 496) | (reg_addr << 480) | (value << 416)
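
To avoid scattering raw register addresses through host code, the register map can be mirrored as an enumeration. This is a sketch only: `ConfigReg` and `encode_config_write` are hypothetical names, and the value is treated as an unsigned 64-bit field.

```python
from enum import IntEnum


class ConfigReg(IntEnum):
    """Register addresses copied from the register map above."""
    THRESHOLD = 0x0000
    LEAK_ENABLE = 0x0001
    LEAK_SHIFT = 0x0002
    RESET_VOLTAGE = 0x0003


def encode_config_write(reg: ConfigReg, value: int, core_id: int = 0) -> int:
    """Build a CONFIG_WRITE (0x06) packet."""
    if not 0 <= value < (1 << 64):
        raise ValueError("value must fit in 64 bits")
    return (0x06 << 504) | (core_id << 496) | (int(reg) << 480) | (value << 416)


packets = [
    encode_config_write(ConfigReg.THRESHOLD, 2000),
    encode_config_write(ConfigReg.LEAK_ENABLE, 1),
]
print([(p >> 480) & 0xFFFF for p in packets])  # register addresses
```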

FPGA to Host Packets#

These are packets sent from FPGA back to the host, retrieved by fpga_controller.flush_spikes().

Spike Packet Format (512 bits)#

┌─────────────────────────────────────────────────────────────┐
│                   512-bit Spike Packet                      │
├───────────────┬─────────────────────────────────────────────┤
│ [511:496]     │ Tag = 0xEEEE (identifies as spike packet)   │
│ [495:480]     │ Spike count (16 bits) - number of valid     │
│               │ spikes in this packet (0-14)                │
│ [479:32]      │ Spike data: 14 slots × 32 bits each         │
│               │ Each slot: [31:24] = reserved               │
│               │            [23]    = valid bit              │
│               │            [22:6]  = neuron ID (17 bits)    │
│               │            [5:0]   = sub-timestep (6 bits)  │
│ [31:0]        │ Timestep (32 bits) - execRun_ctr value      │
└───────────────┴─────────────────────────────────────────────┘

Field Descriptions:

Tag [511:496]: Always 0xEEEE to identify this as a spike packet
Spike count [495:480]: Number of valid spikes in this packet (1-14)
Spike slots [479:32]: Up to 14 spike entries
- Valid bit [23]: 1 = valid spike, 0 = empty slot
- Neuron ID [22:6]: Which neuron spiked (0-131,071)
- Sub-timestep [5:0]: Fine-grained timing within timestep (usually 0)
Timestep [31:0]: When these spikes occurred (execRun_ctr value)

Example Packet:

Tag: 0xEEEE
Spike count: 3
Spike 0: neuron_id=42,   valid=1, sub_ts=0
Spike 1: neuron_id=1000, valid=1, sub_ts=0
Spike 2: neuron_id=5123, valid=1, sub_ts=0
Spikes 3-13: valid=0 (empty)
Timestep: 1500

Encoded as:

[511:496] = 0xEEEE
[495:480] = 3 (spike count)
[479:448] = 0x00800A80  # Spike 0: valid bit | neuron 42 (0x2A) << 6
[447:416] = 0x0080FA00  # Spike 1: valid bit | neuron 1000 (0x3E8) << 6
[415:384] = 0x008500C0  # Spike 2: valid bit | neuron 5123 (0x1403) << 6
[383:32]  = 0 (empty slots)
[31:0]    = 1500 (timestep)

Python Parsing:

def parse_spike_packet(packet_512bit):
    tag = (packet_512bit >> 496) & 0xFFFF
    if tag != 0xEEEE:
        return None  # Not a spike packet

    spike_count = (packet_512bit >> 480) & 0xFFFF
    timestep = packet_512bit & 0xFFFFFFFF

    spikes = []
    for i in range(14):
        # Slot 0 occupies [479:448]; each later slot sits 32 bits lower
        spike_word = (packet_512bit >> (448 - i*32)) & 0xFFFFFFFF
        valid = (spike_word >> 23) & 0x1
        if valid:
            neuron_id = (spike_word >> 6) & 0x1FFFF
            sub_ts = spike_word & 0x3F
            spikes.append({'neuron_id': neuron_id, 'timestep': timestep, 'sub_ts': sub_ts})

    return spikes
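
For testing a parser without hardware, it helps to synthesize spike packets on the host. The sketch below builds a packet from the field layout above and reads the neuron IDs back. It assumes slot 0 occupies bits [479:448], as in the encoded example; `encode_spike_packet` and `spike_ids` are hypothetical helpers.

```python
SPIKE_TAG = 0xEEEE


def encode_spike_word(neuron_id: int, sub_ts: int = 0) -> int:
    """One 32-bit slot: valid bit [23], neuron ID [22:6], sub-timestep [5:0]."""
    return (1 << 23) | ((neuron_id & 0x1FFFF) << 6) | (sub_ts & 0x3F)


def encode_spike_packet(neuron_ids: list[int], timestep: int) -> int:
    """Build a 512-bit spike packet; slot 0 is placed at [479:448] (assumed)."""
    if len(neuron_ids) > 14:
        raise ValueError("a spike packet holds at most 14 spikes")
    packet = (SPIKE_TAG << 496) | (len(neuron_ids) << 480) | (timestep & 0xFFFFFFFF)
    for i, neuron_id in enumerate(neuron_ids):
        packet |= encode_spike_word(neuron_id) << (448 - 32 * i)
    return packet


def spike_ids(packet: int) -> list[int]:
    """Return neuron IDs of all valid slots, slot 0 first."""
    ids = []
    for i in range(14):
        word = (packet >> (448 - 32 * i)) & 0xFFFFFFFF
        if (word >> 23) & 0x1:
            ids.append((word >> 6) & 0x1FFFF)
    return ids


packet = encode_spike_packet([42, 1000, 5123], timestep=1500)
print(spike_ids(packet), packet & 0xFFFFFFFF)
```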

HBM Memory Structures#

HBM stores the network structure (pointers and synapses). All addresses are byte addresses.

Memory Map#

┌──────────────────────────────────────────────────────────┐
│ HBM Memory Layout (8 GB total, 2 GB used)               │
├────────────────┬─────────────────────────────────────────┤
│ 0x00000000     │ Region 1: Axon Pointers                 │
│ - 0x00003FFF   │ Size: 16 KB (16,384 bytes)              │
│                │ Format: 32-bit pointers × 512 axons     │
├────────────────┼─────────────────────────────────────────┤
│ 0x00004000     │ Region 2: Neuron Pointers               │
│ - 0x00007FFF   │ Size: 512 KB                            │
│                │ Format: 32-bit pointers × 131,072 neurons│
├────────────────┼─────────────────────────────────────────┤
│ 0x00008000     │ Region 3: Synapses                      │
│ - 0x7FFFFFFF   │ Size: ~2 GB (variable, network-dependent)│
│                │ Format: Variable-length synapse lists   │
└────────────────┴─────────────────────────────────────────┘

Pointer Format (32 bits)#

Pointers are stored in Regions 1 and 2, mapping axon/neuron IDs to their synapse lists.

┌───────────────────────────────────────────────────────────┐
│                  32-bit Pointer Entry                     │
├────────────────┬──────────────────────────────────────────┤
│ [31:23]        │ Length (9 bits) - number of synapse rows│
│ [22:0]         │ Start Address (23 bits) - HBM row index │
│                │ (actual byte address = 0x8000 + addr×32)│
└────────────────┴──────────────────────────────────────────┘

Example:

Axon 5 pointer = 0x00801234
  Length = 0x001 (1 row = 8 synapses)
  Start address = 0x001234 (row index)
  Actual HBM address = 0x8000 + (0x1234 × 32) = 0x2C680

Python Encoding:

def encode_pointer(start_row, num_rows):
    """
    start_row: Row index in synapse region (not byte address)
    num_rows: Number of consecutive rows (each row = 8 synapses)
    """
    length = num_rows & 0x1FF  # 9 bits
    address = start_row & 0x7FFFFF  # 23 bits
    pointer = (length << 23) | address
    return pointer

def decode_pointer(pointer):
    length = (pointer >> 23) & 0x1FF
    start_row = pointer & 0x7FFFFF
    byte_address = 0x8000 + (start_row * 32)
    return {'num_rows': length, 'start_row': start_row, 'byte_address': byte_address}

Synapse Format (32 bits)#

Synapses are stored in Region 3, organized as rows of 8 synapses each (256 bits = 32 bytes per row).

┌───────────────────────────────────────────────────────────┐
│                  32-bit Synapse Entry                     │
├────────────────┬──────────────────────────────────────────┤
│ [31:29]        │ OpCode (3 bits)                          │
│                │   000 = Regular synapse                  │
│                │   100 = Output spike (send to host)      │
│                │   101 = Recurrent connection             │
│ [28:16]        │ Target Address (13 bits)                 │
│                │   For synapse: target neuron ID          │
│                │   For output: neuron to monitor          │
│ [15:0]         │ Weight (16 bits, signed fixed-point)     │
│                │   Interpretation: weight / 32768         │
└────────────────┴──────────────────────────────────────────┘

OpCode Details:

OpCode	Binary	Meaning	Target Field	Weight Field
0	`3'b000`	Regular synapse	Neuron ID (13 bits, 0-8191 within group)	Synaptic weight (signed 16-bit)
4	`3'b100`	Output spike	Neuron ID to report	Unused (set to 0)
5	`3'b101`	Recurrent	Global neuron ID (13 bits)	Synaptic weight

Weight Encoding:

Weights are 16-bit signed integers representing fixed-point values:

Range: -32,768 to +32,767
Interpretation: weight_value / 32768.0
Examples:
- 0x7FFF (32767) → +0.9999… ≈ +1.0
- 0x4000 (16384) → +0.5
- 0x0400 (1024) → +0.03125
- 0x0000 (0) → 0.0
- 0xFC00 (-1024) → -0.03125
- 0x8000 (-32768) → -1.0
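
Converting between real-valued weights and the 16-bit field is a scale by 32768 followed by two's complement wrapping. The sketch below is one way to do it; rounding to nearest and saturating at the range limits are assumptions made here, and both function names are hypothetical.

```python
def weight_to_fixed(value: float) -> int:
    """Convert a real weight to the 16-bit field (round to nearest, saturate)."""
    raw = round(value * 32768)
    raw = max(-32768, min(32767, raw))  # +1.0 is not representable, so clamp
    return raw & 0xFFFF


def fixed_to_weight(raw: int) -> float:
    """Convert the 16-bit field back to a real weight."""
    raw &= 0xFFFF
    if raw & 0x8000:  # sign bit set: negative value
        raw -= 0x10000
    return raw / 32768.0


for value in (0.5, 0.03125, -0.03125, -1.0):
    print(value, hex(weight_to_fixed(value)))
```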

Example Synapses:

# Regular synapse: target neuron 42, weight +1000 (≈0.0305)
synapse_1 = (0b000 << 29) | (42 << 16) | 1000
# = 0x002A03E8

# Output spike: report neuron 100
synapse_2 = (0b100 << 29) | (100 << 16) | 0
# = 0x80640000

# Negative weight synapse: target neuron 10, weight -500 (inhibitory)
synapse_3 = (0b000 << 29) | (10 << 16) | ((-500) & 0xFFFF)
# = 0x000AFE0C

Python Encoding:

def encode_synapse(opcode, target, weight):
    """
    opcode: 0=regular, 4=output, 5=recurrent
    target: neuron ID (0-8191; the field is 13 bits, so larger IDs are truncated by the mask)
    weight: signed integer (-32768 to 32767)
    """
    opcode_bits = (opcode & 0x7) << 29
    target_bits = (target & 0x1FFF) << 16
    weight_bits = weight & 0xFFFF
    synapse = opcode_bits | target_bits | weight_bits
    return synapse

def decode_synapse(synapse):
    opcode = (synapse >> 29) & 0x7
    target = (synapse >> 16) & 0x1FFF
    weight = synapse & 0xFFFF
    # Sign extend weight if necessary
    if weight & 0x8000:  # Negative
        weight = weight - 65536
    return {'opcode': opcode, 'target': target, 'weight': weight}

Synapse Row (256 bits = 8 synapses):

Row at HBM address 0x8000:
  [255:224] = Synapse 7
  [223:192] = Synapse 6
  [191:160] = Synapse 5
  [159:128] = Synapse 4
  [127:96]  = Synapse 3
  [95:64]   = Synapse 2
  [63:32]   = Synapse 1
  [31:0]    = Synapse 0
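
A synapse list longer than eight entries spans several rows. The sketch below packs 32-bit synapse words into 256-bit rows using the layout above, with synapse 0 in bits [31:0]. Padding a partial row with zero words is an assumption, and the helper names are hypothetical.

```python
def pack_synapse_row(synapses: list[int]) -> int:
    """Pack up to 8 synapse words into one 256-bit row (missing slots stay 0)."""
    if len(synapses) > 8:
        raise ValueError("a row holds at most 8 synapses")
    row = 0
    for i, synapse in enumerate(synapses):
        row |= (synapse & 0xFFFFFFFF) << (32 * i)
    return row


def unpack_synapse_row(row: int) -> list[int]:
    """Split a 256-bit row into its 8 synapse words, synapse 0 first."""
    return [(row >> (32 * i)) & 0xFFFFFFFF for i in range(8)]


def split_into_rows(synapses: list[int]) -> list[int]:
    """Chunk a synapse list into consecutive rows of 8."""
    return [pack_synapse_row(synapses[i:i + 8]) for i in range(0, len(synapses), 8)]


rows = split_into_rows([0x002A03E8, 0x80640000, 0x000AFE0C])
print(len(rows), [hex(word) for word in unpack_synapse_row(rows[0])[:3]])
```

The number of rows returned is the value that would go into the pointer's length field for this list.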

BRAM Memory Structures#

BRAM stores spike masks for external events (axon spikes).

BRAM Organization#

┌──────────────────────────────────────────────────────────┐
│ BRAM: 32,768 rows × 256 bits per row = 1 MB             │
├────────────────┬─────────────────────────────────────────┤
│ Address        │ Content                                 │
├────────────────┼─────────────────────────────────────────┤
│ 0x0000         │ Axon/Event 0 spike mask                 │
│ 0x0001         │ Axon/Event 1 spike mask                 │
│ ...            │ ...                                     │
│ 0x7FFF         │ Axon/Event 32,767 spike mask            │
└────────────────┴─────────────────────────────────────────┘

Spike Mask Format (256 bits)#

Each row contains a 256-bit bitmask indicating which neuron groups should receive this spike.

┌───────────────────────────────────────────────────────────┐
│              256-bit Spike Mask (one BRAM row)            │
├────────────────┬──────────────────────────────────────────┤
│ [255:240]      │ Group 15 mask (16 bits)                  │
│ [239:224]      │ Group 14 mask (16 bits)                  │
│ ...            │ ...                                      │
│ [31:16]        │ Group 1 mask (16 bits)                   │
│ [15:0]         │ Group 0 mask (16 bits)                   │
└────────────────┴──────────────────────────────────────────┘

Each 16-bit group mask:

Bit 0: First neuron in group should receive spike
Bit 1: Second neuron in group should receive spike
…
Bit 15: 16th neuron in group should receive spike

Note: This is a coarse-grained mask. For fine-grained connectivity, the spike is processed further:

BRAM mask identifies which groups get the spike
For each group, HBM is read to get the full synapse list
Synapse list specifies exact target neurons and weights

Example:

Axon 5 fires, BRAM row 5 contains:
  Group 0 mask: 0x000F (neurons 0-3 in group 0)
  Group 1 mask: 0x0000 (no neurons in group 1)
  Group 2 mask: 0x8000 (neuron 15 in group 2)
  Groups 3-15: 0x0000

This means axon 5 spike should be delivered to:
  - Neurons 0, 1, 2, 3 in group 0
  - Neuron 15 in group 2

Python Encoding:

def encode_bram_mask(group_masks):
    """
    group_masks: list of 16 integers (16-bit masks for each group)
    Returns: 256-bit value
    """
    mask = 0
    for i, group_mask in enumerate(group_masks):
        mask |= (group_mask & 0xFFFF) << (i * 16)
    return mask

def decode_bram_mask(mask_256bit):
    """
    mask_256bit: 256-bit value
    Returns: list of 16 group masks
    """
    group_masks = []
    for i in range(16):
        group_mask = (mask_256bit >> (i * 16)) & 0xFFFF
        group_masks.append(group_mask)
    return group_masks
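
When debugging, it is often easier to read a mask as a list of targets than as 16 hexadecimal words. The following sketch expands a 256-bit mask into (group, neuron-in-group) pairs, following the bit numbering described above; `mask_targets` is a hypothetical helper.

```python
def mask_targets(mask_256bit: int) -> list[tuple[int, int]]:
    """List (group, neuron_in_group) for every bit set in the mask."""
    targets = []
    for group in range(16):
        group_mask = (mask_256bit >> (group * 16)) & 0xFFFF
        for bit in range(16):
            if group_mask & (1 << bit):
                targets.append((group, bit))
    return targets


# The axon 5 example: group 0 mask 0x000F, group 2 mask 0x8000
axon5_mask = 0x000F | (0x8000 << (2 * 16))
print(mask_targets(axon5_mask))
```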

PCIe Layer#

All communication between host and FPGA travels over PCIe using Transaction Layer Packets (TLPs).

PCIe TLP Format#

hs_bridge and the FPGA do NOT directly create PCIe TLPs - the PCIe hardware handles this automatically. However, understanding the format is useful for debugging.

Memory Write TLP (Host → FPGA MMIO):

┌─────────────────────────────────────────────────────────────┐
│                   PCIe Memory Write TLP                     │
├────────────────────┬────────────────────────────────────────┤
│ Header (3 DWords)  │                                        │
│ DW0 [31:29]        │ Fmt = 010 (3-DW header, with data)     │
│ DW0 [28:24]        │ Type = 00000 (memory request)          │
│ DW0 [9:0]          │ Length (10 bits) - DWords to transfer  │
│ DW2 [31:0]         │ Address (32 bits) - FPGA MMIO address  │
├────────────────────┼────────────────────────────────────────┤
│ Data (N DWords)    │ Payload data (up to 4096 bytes)        │
└────────────────────┴────────────────────────────────────────┘

Memory Read TLP (FPGA → Host Memory via DMA):

┌─────────────────────────────────────────────────────────────┐
│                   PCIe Memory Read TLP                      │
├────────────────────┬────────────────────────────────────────┤
│ Header (4 DWords)  │                                        │
│ DW0 [31:29]        │ Fmt = 001 (4-DW header, no data)       │
│ DW0 [28:24]        │ Type = 00000 (memory request)          │
│ DW0 [9:0]          │ Length (10 bits) - DWords requested    │
│ DW2-DW3            │ Address (64 bits) - host DDR4 address  │
└────────────────────┴────────────────────────────────────────┘

Completion TLP (Host → FPGA, returning DMA data):

┌─────────────────────────────────────────────────────────────┐
│                   PCIe Completion TLP                       │
├────────────────────┬────────────────────────────────────────┤
│ Header (3 DWords)  │                                        │
│ DW0 [31:29]        │ Fmt = 010 (3-DW header, with data)     │
│ DW0 [28:24]        │ Type = 01010 (completion)              │
│ DW1 [11:0]         │ Byte count (12 bits)                   │
├────────────────────┼────────────────────────────────────────┤
│ Data (N DWords)    │ Requested data from host memory        │
└────────────────────┴────────────────────────────────────────┘

Key Points:

DWord: 32-bit (4-byte) word
Addressing: Can be 32-bit or 64-bit depending on format
Maximum payload: 4096 bytes (4 KB) per TLP
Ordering: Memory writes are posted (no response), reads require completions
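
When inspecting a PCIe trace, the first header DWord is enough to classify a TLP. The sketch below decodes the Fmt, Type, and Length fields from that DWord, which sit at bits [31:29], [28:24], and [9:0] in the PCIe specification. `parse_tlp_dw0` is a hypothetical debugging helper and is not part of hs_bridge, which never builds TLPs itself.

```python
def parse_tlp_dw0(dw0: int) -> dict:
    """Decode Fmt, Type, and Length from the first DWord of a TLP header."""
    fmt = (dw0 >> 29) & 0x7
    tlp_type = (dw0 >> 24) & 0x1F
    length = dw0 & 0x3FF
    return {
        "fmt": fmt,
        "type": tlp_type,
        "length_dw": length or 1024,         # a Length field of 0 encodes 1024 DWords
        "has_data": bool(fmt & 0b010),       # Fmt bit 1: TLP carries a payload
        "four_dw_header": bool(fmt & 0b001), # Fmt bit 0: 64-bit addressing
    }


# Memory write with a 3-DWord header carrying one 512-bit packet (16 DWords)
dw0 = (0b010 << 29) | (0b00000 << 24) | 16
print(parse_tlp_dw0(dw0))
```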

hs_bridge’s Role:

hs_bridge does NOT create TLPs directly
When hs_bridge writes to an MMIO address, the OS kernel driver and PCIe hardware create the TLP
When FPGA does DMA, the FPGA’s PCIe hard block creates Memory Read TLPs automatically

Summary: Packet Flow#

Host to FPGA Flow:#

1. Python (hs_bridge):
   packet = create_512bit_command(opcode=0x01, ...)

2. Write to system memory (DDR4):
   dma_buffer[0] = packet

3. Tell FPGA via MMIO (creates PCIe Memory Write TLP):
   fpga.write_register(DMA_ADDR_REG, physical_address)

4. FPGA reads via DMA (creates PCIe Memory Read TLP):
   FPGA → PCIe: "Send me data from address X"

5. Host responds (PCIe Completion TLP):
   Host → FPGA: "Here's the 512-bit packet"

6. FPGA decodes:
   Extracts opcode, routes to appropriate module
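
Step 2 requires turning the 512-bit integer into 64 bytes in the DMA buffer. The sketch below uses Python's built-in integer conversion. Little-endian byte order and the 64-byte slot layout are assumptions that should be checked against the actual DMA engine; `write_packet` and `read_packet` are hypothetical helpers, and a `bytearray` stands in for the real DMA buffer.

```python
PACKET_BYTES = 64  # 512 bits


def write_packet(buffer: bytearray, slot: int, packet: int) -> None:
    """Store a 512-bit packet in the given 64-byte slot (assumed little-endian)."""
    start = slot * PACKET_BYTES
    buffer[start:start + PACKET_BYTES] = packet.to_bytes(PACKET_BYTES, "little")


def read_packet(buffer: bytearray, slot: int) -> int:
    """Load the 512-bit packet stored in the given slot."""
    start = slot * PACKET_BYTES
    return int.from_bytes(buffer[start:start + PACKET_BYTES], "little")


dma_buffer = bytearray(4 * PACKET_BYTES)  # stand-in for the real DMA buffer
execute = (0x01 << 504) | (0x00 << 496) | (1 << 480)
write_packet(dma_buffer, 0, execute)
print(read_packet(dma_buffer, 0) == execute)
```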

FPGA to Host Flow:#

1. Neuron spikes:
   URAM threshold check → spike detected

2. Spike collection:
   Spike FIFO gathers spikes from all neuron groups

3. Packet assembly:
   spike_fifo_controller creates 512-bit spike packet

4. Write to output FIFO:
   Buffered in FPGA FIFO

5. DMA to host (creates PCIe Memory Write TLP):
   FPGA → Host memory: Write spike packet to DMA buffer

6. Host retrieves:
   fpga_controller.flush_spikes() reads from DMA buffer

Quick Reference Tables#

Command Opcodes#

Code	Name	Payload
0x00	INPUT_SPIKES	`[495:480]=axon_id, [479:464]=spike_time`
0x01	EXECUTE	`[495:480]=num_timesteps`
0x02	HBM_WRITE	`[495:464]=addr, [463:432]=len, [431:176]=data`
0x04	URAM_WRITE	`[495:480]=neuron_id, [479:444]=voltage`
0x06	CONFIG_WRITE	`[495:480]=reg_addr, [479:416]=value`

Synapse OpCodes#

Code	Binary	Meaning
0	`000`	Regular synapse
4	`100`	Output spike (send to host)
5	`101`	Recurrent connection

Memory Regions#

Region	Base Address	Size	Contents
Axon Ptrs	0x00000000	16 KB	Axon → synapse pointers
Neuron Ptrs	0x00004000	512 KB	Neuron → synapse pointers
Synapses	0x00008000	~2 GB	Synapse lists

This reference should provide all the information needed to encode/decode packets and data structures used throughout the hs_bridge and FPGA implementation.
