# OpenWLAN

**Repository Path**: invy/OpenWLAN

## Basic Information

- **Project Name**: OpenWLAN
- **Description**: AI-powered MATLAB-to-HLS framework for WLAN 802.11 synchronization. 3.88x fewer LUTs than HDL Coder on Zynq-7020.
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-03-14
- **Last Updated**: 2026-03-14

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# OpenWLAN: AI-Powered MATLAB-to-HLS Framework for WLAN Synchronization

[![MATLAB](https://img.shields.io/badge/MATLAB-R2023b+-blue.svg)](https://www.mathworks.com)
[![License](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
[![HLS](https://img.shields.io/badge/HLS-Vitis%202024.2-orange.svg)](https://www.xilinx.com/products/design-tools/vivado/integration/esl-design.html)
[![AI-Powered](https://img.shields.io/badge/AI-Powered-purple.svg)](https://github.com/rockyco/OpenWLAN)

## Overview

OpenWLAN demonstrates automated transformation of WLAN 802.11 time and frequency synchronization algorithms from MATLAB to synthesizable HLS C++. The repository contains two complete implementations targeting the Zynq-7020 FPGA:

- **A2H_Coder** - AI-generated HLS C++ via a 7-phase transformation pipeline (all code AI-generated)
- **HDL Coder** - MathWorks HDL Coder reference implementation (Simulink-based)

Both implementations are functionally equivalent. The AI-generated HLS design uses **3.88x fewer LUTs** and **5.48x fewer flip-flops** at comparable Block RAM, fitting comfortably on the Zynq-7020 where the HDL Coder design saturates it.

## Resource Comparison

Post-implementation utilization on Zynq-7020 (`xc7z020clg400-1`), Vivado 2024.2.2:

| Resource | A2H_Coder (HLS) | HDL Coder | Available | HLS Util% | HDL Util% | Ratio (HDL/HLS) |
|----------|-----------------|-----------|-----------|-----------|-----------|------------------|
| LUT | 8,220 | 31,914 | 53,200 | 15.45% | 59.99% | **3.88x** |
| FF | 14,019 | 76,786 | 106,400 | 13.18% | 72.17% | **5.48x** |
| DSP48E1 | 172 | 112 | 220 | 78.18% | 50.91% | 0.65x |
| Block RAM Tile | 17 | 17.5 | 140 | 12.14% | 12.50% | 1.03x |
| Slice | 4,258 | 13,299 | 13,300 | 32.02% | 99.99% | **3.12x** |

The HLS design trades LUT/FF fabric for DSP48E1 hardened multipliers (172 vs 112) - a deliberate architectural choice that is more power-efficient and frees general-purpose resources. HDL Coder at 99.99% slice utilization has no room for additional logic; the HLS design leaves ~68% of slices available for system integration.

### Block RAM Breakdown

| Component | A2H_Coder (HLS) | HDL Coder | Available |
|-----------|-----------------|-----------|-----------|
| RAMB36E1 | 13 | 16 | 140 |
| RAMB18E1 | 8 | 3 | 280 |
| **Block RAM Tiles** | **17** | **17.5** | **140** |

### Timing

| Metric | A2H_Coder (HLS) | HDL Coder |
|--------|-----------------|-----------|
| Target clock | 100 MHz (10 ns) | 100 MHz (10 ns) |
| Fmax achieved | 117.51 MHz | ~100 MHz |
| WNS | +0.427 ns | Met |
| Timing status | Met | Met |

See [Doc/resource_comparison.md](Doc/resource_comparison.md) for the full analysis including per-module breakdown and primitives summary.

![Resource Comparison](Doc/resource_comparison.png)

## System Architecture

Five streaming HLS modules connected in series, all achieving II=1 (one sample per clock):

![System Architecture](Doc/system_architecture.svg)

| Module | Function | DSP | II | Latency |
|--------|----------|-----|----|---------|
| module0_prefilter | 51-tap FIR lowpass filter | 42 | 1 | 0 |
| module1_packet_detect | L-STF autocorrelation packet detection | 20 | 1 | 0 |
| module2_coarse_cfo | Coarse CFO estimation, frequency correction, search buffer extraction | 16 | 1 | 0 |
| module3_fine_sync | L-LTF 160-tap cross-correlation fine timing | 488* | 1 | 167 |
| module4_fine_cfo_apply | Fine CFO estimation and final frequency correction | 16 | 1 | 0 |

*Module 3 csynth DSP estimate (488) assumes fully parallel processing. Post-implementation DSP is much lower because the FIR IP sample_period parameter is set to 8, time-multiplexing DSP units so only 1/8 are required.

Target: Zynq-7020 at 100 MHz. Co-simulation latency: 33,640 cycles for 26,155 input samples.

## Performance Validation

![WLAN Synchronization Analysis](Doc/wlan_sync_analysis.png)

Both implementations target the same MATLAB algorithm (SNR=30 dB, true CFO=10 kHz, timing offset=25 samples):

| Metric | True Value | A2H_Coder HLS | HDL Coder |
|--------|-----------|---------------|-----------|
| Total CFO correction | 10,000 Hz | 9,977 Hz (err: 0.23%) | 9,695 Hz (err: 3.05%) |
| Coarse CFO estimate | 10,000 Hz | 9,804 Hz | - |
| Packet detection | - | Exact | Exact |
| Fine timing | - | Exact | Synchronized |
| Waveform avg error | - | 3.31e-03 | - |

A2H_Coder uses floating-point HLS C++, validated via Vitis co-simulation (source: `system_top.log`). HDL Coder uses Simulink-defined fixed-point, with 9,695 Hz reported in [MathWorks documentation](https://au.mathworks.com/help/wireless-hdl/ug/wlanhdltimeandfrequencysynchronization.html). The CFO difference reflects both arithmetic representation (float vs fixed-point) and different noise realizations. A2H_Coder accuracy is also validated at each transformation stage: modular separation (exact), flattening (<1e-10), optimization (<1e-03). See [Doc/resource_comparison.md](Doc/resource_comparison.md#algorithm-accuracy) for the full analysis.

## Project Structure

```
OpenWLAN/
├── Synchronization/
│   ├── A2H_Coder/                          # AI-generated implementation
│   │   ├── wlanSync.m                      # Original algorithm
│   │   ├── wlanSync_modular.m              # Modularized version
│   │   ├── wlanSync_tb.m                   # Algorithm testbench
│   │   ├── wlanSync_modular_tb.m           # Modular testbench
│   │   ├── wlanSync_testdata_generator.m   # Test vector generation
│   │   ├── module0_prefilter/              # Per-module MATLAB + HLS
│   │   ├── module1_packet_detect/          #   Each contains:
│   │   ├── module2_coarse_cfo/             #     *_flat.m, *_opt.m (public)
│   │   ├── module3_fine_sync/              #     *.cpp, *.hpp (private IP)
│   │   ├── module4_fine_cfo_apply/         #     Makefile, testbenches
│   │   ├── system_wlanSync_integrated/     # Top-level HLS integration
│   │   ├── test_vectors/                   # Golden reference data
│   │   ├── architecture_context.md         # System parameters & dependencies
│   │   └── module_registry.json            # Module metadata & metrics
│   └── HDL_Coder/                          # MathWorks reference (read-only)
│       ├── wlanhdlTimeAndFrequencySynchronization.slx
│       └── hdl_prj/                        # Generated Verilog + Vivado project
└── Doc/
    ├── resource_comparison.md              # Quantitative comparison
    ├── resource_comparison.png             # Resource comparison figure
    ├── system_architecture.svg             # System architecture diagram
    ├── generate_comparison_figure.py       # Figure generation script
    └── wlan_sync_analysis.png              # Performance visualization
```

The MATLAB source, test vectors, and Makefiles are public. HLS C++ sources (`.cpp`, `.hpp`) are private IP and not included in the repository.

## AI Transformation Pipeline

Each module goes through three MATLAB stages before HLS C++ generation:

**Original** - Uses MATLAB toolbox functions (`wlanPacketDetect`, `filter`, etc.)
**Flattened** - All toolbox calls traced and inlined; explicit loops replace vectorized operations
**Optimized** - Streaming architecture with circular buffers, shift registers, and fixed iteration bounds

Example transformation (module1 packet detection):

```matlab
% Original: toolbox call
[startOffset, Mn] = wlanPacketDetect(filteredWaveform, CBW);

% Flattened: inlined autocorrelation
for pos = 1:(nx - 2*symbolLength + 1)
    correlation_sum = sum(conj(x(pos:pos+15)) .* x(pos+16:pos+31));
    power_sum = sum(abs(x(pos+16:pos+31)).^2);
    Mn(pos) = abs(correlation_sum)^2 / (power_sum^2 + eps);
end

% Optimized: streaming with sliding window (incremental update)
for n = 1:nx
    % Shift register update
    corr_sum = corr_sum + conj(delayed_buf(ptr)) * current_buf(ptr) ...
                        - conj(old_delayed) * old_current;
    power_sum = power_sum + abs(current_buf(ptr))^2 - abs(old_current)^2;
    % Division-free threshold: |corr|^2 > T * power^2
    detected = (abs(corr_sum)^2 > threshold * power_sum^2);
end
```

The 7-phase framework pipeline:
1. **Module separation** - Partition monolithic algorithm into streaming modules
2. **Flattening** - Inline toolbox dependencies, eliminate dynamic features
3. **Optimization** - Streaming architecture, circular buffers, math optimizations
4. **Review** - Algorithmic optimization verification (sliding windows, NCO, division avoidance)
5. **HLS generation** - Translate optimized MATLAB to HLS C++ with pragmas
6. **Fixed-point** - Convert to fixed-point types with bit-width optimization
7. **Integration** - System-level integration, co-simulation, implementation

## Getting Started

### Prerequisites

- **MATLAB R2023b+** with Communications Toolbox and Signal Processing Toolbox
- **Vitis HLS 2024.2** (optional, for HLS synthesis and implementation)

### MATLAB Testing

```bash
cd Synchronization/A2H_Coder
```

```matlab
wlanSync_tb                    % Original algorithm testbench
wlanSync_modular_tb            % Modular implementation testbench
wlanSync_testdata_generator    % Regenerate test vectors
```

Per-module testbenches are in each module directory: `module*_flat_tb.m`, `module*_opt_tb.m`.

### HLS Build (Vitis HLS 2024.2)

From any module directory or `system_wlanSync_integrated/`:

```bash
make csim       # C simulation (functional verification)
make csynth     # C synthesis (generate RTL)
make cosim      # C/RTL co-simulation
make impl       # Vivado implementation (place & route)
make report     # Show synthesis resource/timing report
make clean      # Remove build artifacts
```

For fixed-point mode (Phase 6): append `HLS_CFLAGS=-DPHASE6` to any target.

## References

- [MathWorks WLAN HDL Time and Frequency Synchronization](https://au.mathworks.com/help/wireless-hdl/ug/wlanhdltimeandfrequencysynchronization.html)
- IEEE 802.11 Wireless LAN Standard

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- MathWorks for the original WLAN synchronization example
- University of Technology Sydney for research support