
Resource-Efficient FPGA Pseudorandom Number Generation

Dec 07, 2021


Husrev Cılasun*, Ivy Peng†, Maya Gokhale†

†Lawrence Livermore National Laboratory
*University of Minnesota, Twin Cities

Introduction

- Probability distributions play a critical role in diverse application domains.
  - In simulations, they model physical properties of materials, processes, or behaviors.
  - For instance, molecular dynamics codes often use the Maxwell-Boltzmann distribution to model temperature.
- We introduce a resource-efficient hardware pseudo-random number generator (RNG) and two optimizations:
  - Alias table partitioning: separates a target distribution into multiple sub-ranges and enables local optimizations in each sub-range to improve overall resource utilization.
  - Adaptive threshold resolution: adjusts the bitsize used to represent threshold values to the precision of the underlying partition.
- Our main contributions:
  - An analytic study driven by the dual considerations of improving accuracy and optimizing the hardware mapping.
  - Automated HDL generation with both simulation and synthesis scripts.
  - Diverse use cases: emulating a Gaussian delay profile in the FPGA-based LiME memory system emulator [1], and a random number server for HPC applications.

Methodology

Walker’s Alias Method [2] is an efficient algorithm for FPGA hardware implementation. It generates arbitrary discrete distributions from uniformly generated random numbers. For a target distribution E(·), the method generates and uses a table of real threshold values F(·) and alternative index values A(·), where F(·), A(·), and E(·) have the same length. Each output sample Y is generated as

$$Y = \begin{cases} X, & U \le F(X) \\ A(X), & U > F(X), \end{cases}$$

where U is a real uniform random number and X is a uniform random integer. The output quality is a function of the precision of U: increasing its bit size or representing U as a floating-point number [3] improves the quality.
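The sampling rule above can be sketched in software as follows. This is a minimal illustration, not the poster's HDL or MATLAB flow: the function names are hypothetical, U is a double-precision float instead of a fixed-width hardware value, and the table construction uses the common small/large worklist approach (often attributed to Vose), which may differ from the construction the authors used.

```python
import random


def build_alias_table(probs):
    """Return (F, A): thresholds in [0, 1] and alias indices for `probs`."""
    n = len(probs)
    total = sum(probs)
    # Scale so that the average entry equals 1.
    scaled = [p * n / total for p in probs]
    F = [1.0] * n
    A = list(range(n))
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        F[s] = scaled[s]                # keep index s with probability F[s]
        A[s] = l                        # otherwise emit the alias index l
        scaled[l] -= 1.0 - scaled[s]    # l donated the missing mass to slot s
        (small if scaled[l] < 1.0 else large).append(l)
    # Entries left over (rounding) keep F = 1.0 and alias themselves.
    return F, A


def draw(F, A, rng):
    """One sample: Y = X if U <= F(X), else A(X)."""
    x = rng.randrange(len(F))   # uniform random integer X
    u = rng.random()            # uniform random real U
    return x if u <= F[x] else A[x]


def implied_pmf(F, A):
    """Exact distribution produced by the table, useful for checking it."""
    n = len(F)
    pmf = [0.0] * n
    for i, (f, a) in enumerate(zip(F, A)):
        pmf[i] += f / n
        pmf[a] += (1.0 - f) / n
    return pmf


if __name__ == "__main__":
    E = [0.1, 0.2, 0.3, 0.4]            # illustrative target distribution
    F, A = build_alias_table(E)
    rng = random.Random(0)
    print(F, A, [draw(F, A, rng) for _ in range(10)])
```

Each sample costs one table lookup and one comparison regardless of the table size, which is what makes the method attractive for a ROM-plus-comparator hardware mapping.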

We target the Planck distribution (Eq. 1), whose PDF is a function of temperature T, and the Maxwell-Boltzmann distribution (Eq. 2), which is parameterized by the factor a.

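$$f(x) = \frac{2hc^2}{x^5} \exp\left(-\frac{hc}{xkT}\right) \tag{1}$$

$$f(x) = \sqrt{\frac{2}{\pi}} \, \frac{x^2 \exp\left(-\frac{x^2}{2a^2}\right)}{a^3} \tag{2}$$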
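Because the alias method operates on a discrete distribution, Eq. 1 and Eq. 2 have to be sampled on a grid before a table can be built. The sketch below shows one plausible way to do that, evaluating each PDF at bin midpoints and normalizing. The function names, the midpoint rule, and the bin count are assumptions for illustration; the ranges 0 to 6 (Eq. 2, a = 1) and 0 to 3×10⁻⁵ (Eq. 1, T = 700 K) are taken from the plot axes in the evaluation section.

```python
import math

H = 6.62607015e-34   # Planck constant h, J*s
C = 2.99792458e8     # speed of light c, m/s
K = 1.380649e-23     # Boltzmann constant k, J/K


def eq1_pdf(x, T):
    """Eq. 1 as printed: 2*h*c^2 / x^5 * exp(-h*c / (x*k*T)), with x > 0."""
    return 2.0 * H * C**2 / x**5 * math.exp(-H * C / (x * K * T))


def eq2_pdf(x, a):
    """Eq. 2 as printed: sqrt(2/pi) * x^2 * exp(-x^2 / (2*a^2)) / a^3."""
    return math.sqrt(2.0 / math.pi) * x**2 * math.exp(-x**2 / (2.0 * a**2)) / a**3


def discretize(pdf, lo, hi, n):
    """Evaluate `pdf` at n bin midpoints on [lo, hi] and normalize to sum 1."""
    width = (hi - lo) / n
    weights = [pdf(lo + (i + 0.5) * width) for i in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Assumed grid size; the poster's plots mention tables of size 65536.
    e_eq2 = discretize(lambda x: eq2_pdf(x, 1.0), 0.0, 6.0, 64)
    e_eq1 = discretize(lambda x: eq1_pdf(x, 700.0), 0.0, 3e-5, 64)
    print(max(e_eq2), max(e_eq1))
```

The resulting lists play the role of the target distribution E(·) from which the threshold and alias tables are derived.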

Integration with MATLAB

Figure 1: An automated flow of customization and testing. (Diagram not reproduced. Labels: Desired Distribution, Range, Resolution, Sample Count, Python Wrapper, MATLAB/Octave, Walker’s Algorithm, Alias Table Sampling, Boilerplate Text, HDL, Tcl, Vivado, Elaboration, Simulation, .csv, χ² Tests.)

PwCLT Architecture

Figure 2: PwCLT-8 architecture [3] for LiME [1] integration. (Diagram not reproduced. Signal and block labels: URNG-119, mixture_pdf_urng[118:0], c0_mixture_sign_flag[0:0], ROM with addr[6:0] and data[37:0], c0_alias_index[6:0], alias_table_urng[85:0], bernoulli_fp_urng[78:0], FP Comparator, cltfx_urng[31:0], FP Cast.)

Alias Table Partitioning

- We improve the resource utilization of alias tables by separating the target distribution into multiple subranges (Fig. 3 shows an example with four subranges).
  - In each subrange, the standard alias table implementation is used.
  - This separation allows each table to be optimized locally: alias tables whose target distribution is smoother can be configured with fewer threshold bits per entry in the F(·) table.
  - Consequently, an alias table can be selected based on its relative probability range and its output lifted accordingly (Fig. 3 labels the four subranges with 0, N/4, N/2, and 3N/4).
- We propose adaptive threshold resolution to adjust the threshold bitsize while maintaining statistical accuracy.
  - The quality of the generated samples is determined by the threshold resolution.
  - When alias table partitioning is employed, partitions with higher variance require a larger bitsize, while a smaller bitsize suffices for partitions with lower variance.
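One way to picture adaptive threshold resolution is to quantize the thresholds of a given table to b fractional bits and keep the smallest b whose implied distribution stays within a tolerance of the unquantized one. The sketch below does exactly that. The selection criterion (maximum absolute probability error), the tolerance, and the two small example tables are assumptions made for illustration; the poster does not state how the bitsize is chosen per partition.

```python
def implied_pmf(F, A):
    """Exact distribution produced by threshold table F and alias table A."""
    n = len(F)
    pmf = [0.0] * n
    for i, (f, a) in enumerate(zip(F, A)):
        pmf[i] += f / n
        pmf[a] += (1.0 - f) / n
    return pmf


def quantize(F, bits):
    """Round each threshold to a fixed-point value with `bits` fractional bits."""
    scale = 1 << bits
    return [round(f * scale) / scale for f in F]


def pmf_error(F, A, bits):
    """Largest absolute probability error introduced by quantizing F."""
    exact = implied_pmf(F, A)
    approx = implied_pmf(quantize(F, bits), A)
    return max(abs(p - q) for p, q in zip(exact, approx))


def min_threshold_bits(F, A, tol, max_bits=32):
    """Smallest bit width whose quantization error is within `tol`."""
    for bits in range(1, max_bits + 1):
        if pmf_error(F, A, bits) <= tol:
            return bits
    return max_bits


if __name__ == "__main__":
    # Hypothetical tables for two partitions of four entries each.
    table_a = ([1.0, 0.5, 1.0, 0.5], [0, 0, 2, 2])
    table_b = ([0.4, 0.8, 1.0, 0.8], [3, 3, 2, 2])
    print(min_threshold_bits(*table_a, tol=1e-3))
    print(min_threshold_bits(*table_b, tol=1e-3))
```

Running the search per partition yields a different bit width for each table, so each ROM can be sized to what its own partition needs instead of the worst case across the whole distribution.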

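Figure 3: An illustration of the alias table partitioning scheme, which selectively combines sub-distributions by comparing a uniform random variable with the CDF values of each distribution at the partition boundaries. (Diagram not reproduced. Labels: URNG; four ROM blocks, each paired with a ">" comparator; partition comparisons <c1, <c2, <c3, <c4; Encoder; an adder; subrange labels 0, N/4, N/2, 3N/4.)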
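The partitioning scheme of Fig. 3 can be sketched in software as a two-stage draw: pick a partition by comparing a uniform value against the cumulative partition masses, sample that partition's local alias table, then add the partition's base offset. All names below are hypothetical, equal-sized partitions with nonzero mass are assumed, and the local tables are built with a worklist construction that may differ from the authors' implementation.

```python
import random
from bisect import bisect_right


def build_alias_table(probs):
    """Return (F, A) for one partition; `probs` need not be normalized."""
    n = len(probs)
    total = sum(probs)  # assumed > 0 for every partition
    scaled = [p * n / total for p in probs]
    F, A = [1.0] * n, list(range(n))
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        F[s], A[s] = scaled[s], l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    return F, A


def build_partitioned(probs, parts):
    """Split `probs` into `parts` equal subranges with one alias table each."""
    size = len(probs) // parts  # assumes `parts` divides len(probs)
    tables, cdf, acc = [], [], 0.0
    for k in range(parts):
        chunk = probs[k * size:(k + 1) * size]
        acc += sum(chunk)
        cdf.append(acc)          # cumulative mass at the partition boundary
        tables.append(build_alias_table(chunk))
    return tables, cdf, size


def sample_partitioned(tables, cdf, size, rng):
    """Select a partition by CDF comparison, sample locally, then lift."""
    v = rng.random() * cdf[-1]
    k = min(bisect_right(cdf, v), len(cdf) - 1)   # first k with v < c_k
    F, A = tables[k]
    x = rng.randrange(size)
    u = rng.random()
    local = x if u <= F[x] else A[x]
    return k * size + local       # offset into the k-th subrange


if __name__ == "__main__":
    probs = [0.02, 0.08, 0.1, 0.2, 0.25, 0.15, 0.12, 0.08]  # illustrative
    tables, cdf, size = build_partitioned(probs, 4)
    rng = random.Random(0)
    print([sample_partitioned(tables, cdf, size, rng) for _ in range(10)])
```

Because alias indices only ever point inside their own partition, each local table can store a narrower index, and its thresholds can use a bit width chosen for that partition alone.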

Validation and Evaluation

Figure panels (plots not reproduced):

(a) Maxwell-Boltzmann distribution, a=1. Normalized sample count (×10⁻⁵) versus x (0 to 6). Series: MATLAB alias table, floating-point threshold, size=65536; FPGA simulated alias table, fixed-point threshold, size=65536; ideal double-precision.

(b) Planck distribution, T=700K. Normalized sample count (×10⁻⁵) versus x (0 to 3×10⁻⁵). Series: MATLAB alias table, floating-point threshold, size=65536; FPGA simulated alias table, fixed-point threshold, size=65536; ideal double-precision.

(c) Gaussian latency histogram in LiME.

(d) Memory savings from various partitioning schemes. Memory saving (%, 0 to 30) versus output bits (11 to 16). Series: single alias table, 2-partitioned, 4-partitioned, 8-partitioned.
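The comparison between generated sample counts and the ideal distribution is a goodness-of-fit question, which a χ² test can answer. The sketch below computes Pearson's χ² statistic from a histogram and compares it with an approximate critical value. The function names are hypothetical, the critical value uses the Wilson-Hilferty approximation in place of an exact χ² quantile, and the 0.95 confidence level is an assumption; the poster does not give its test parameters.

```python
from statistics import NormalDist


def chi_square_statistic(counts, probs):
    """Pearson statistic: sum of (observed - expected)^2 / expected.

    Assumes every entry of `probs` is positive.
    """
    total = sum(counts)
    return sum((c - total * p) ** 2 / (total * p) for c, p in zip(counts, probs))


def chi_square_critical(dof, confidence=0.95):
    """Approximate chi-square quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(confidence)
    t = 2.0 / (9.0 * dof)
    return dof * (1.0 - t + z * t ** 0.5) ** 3


def passes_chi_square(counts, probs, confidence=0.95):
    """True if the histogram is consistent with `probs` at this level."""
    dof = len(counts) - 1
    return chi_square_statistic(counts, probs) <= chi_square_critical(dof, confidence)


if __name__ == "__main__":
    observed = [20, 30, 25, 25]          # illustrative histogram
    expected = [0.25, 0.25, 0.25, 0.25]  # illustrative target distribution
    print(chi_square_statistic(observed, expected),
          chi_square_critical(len(observed) - 1),
          passes_chi_square(observed, expected))
```

In a flow like the one in Fig. 1, the histogram would come from the simulated sample counts exported as .csv, and the target probabilities from the discretized distribution used to build the tables.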

Conclusion

- We introduced a resource-efficient hardware RNG whose accuracy is validated by a χ² test.
- We proposed an alias table partitioning technique for optimizing resource utilization.
- Our RNG is evaluated in three use cases for memory emulations and scientific simulations.

References

[1] A. K. Jain, S. Lloyd, and M. Gokhale. Microscope on memory: MPSoC-enabled computer memory system assessments. In 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), pages 173–180, 2018.

[2] A. J. Walker. An efficient method for generating discrete random variables with general distributions. ACM Transactions on Mathematical Software (TOMS), 3(3):253–256, 1977.

[3] D. B. Thomas. FPGA Gaussian random number generators with guaranteed statistical accuracy. In 2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines, pages 149–156, 2014.

Acknowledgments

This work was supported by LLNL LDRD 19-ERD-004. LLNL-ABS-813772.