Accelerating PO-SBR for SAR Application

Chun Yun Kee, Fu-Gang Hu, Chao-Fu Wang and Tse Tong Chia
Temasek Laboratories, National University of Singapore
5A Engineering Drive 1, #09-02, Singapore 117411
cykee, fugang, cfwang, [email protected]

Abstract — Generating a high-resolution SAR image of an object typically requires its scattered field data over a large number of aspect angles and frequencies, which is very time-consuming to obtain. This paper discusses the implementation of the PO-SBR method on a GPU to efficiently compute the massive scattered field data required for SAR applications.

Index Terms—physical optics; shooting and bouncing rays method; SAR; GPU; CUDA

I. INTRODUCTION
The physical optics and shooting and bouncing rays (PO-SBR) technique [1] is effective for predicting electromagnetic scattering from perfect electric conductor (PEC) objects at high frequencies. In [1-3], the authors introduced the OptiX [4] ray-tracing library and various considerations for implementing PO-SBR on a GPU with optimal performance for a single frequency. To support the synthetic aperture radar (SAR) imaging process, this paper discusses the implementation details of PO-SBR for generating scattered field data over multiple frequencies, so that SAR images of complex objects such as aircraft can be simulated efficiently.

II. BRIEF OVERVIEW OF PO-SBR
As described in [1], the induced PO current on the object surface is computed using a closed-form solution based on triangular meshes [5]. An incident plane wave is modelled by a set of parallel optical rays launched towards the object. Because the induced surface currents reradiate as secondary electromagnetic sources, SBR is employed to account for the contributions of reflected waves. Each ray is reflected according to Snell's law and traced until it exits the scene or reaches a maximum ray depth. The PO-SBR code in [1] has been optimized for a single frequency on the GPU. Specifically, the ray tube size is determined and ray tracing is performed at each frequency.
Ray-trace information is used for the scattered field computation and then discarded immediately.

III. ACCELERATION OF PO-SBR ON GPU
Because the code in [1] is optimized for a single frequency, it is inefficient for computation over multiple frequencies: the same rays are traced repeatedly. This section summarizes the considerations discussed in [6] for improving the efficiency of the PO-SBR in [1].

A. Caching of Rays
The SBR method is adopted to account for the contributions of reflected waves, with the incident plane wave modeled by a set of parallel optical rays launched towards the object. The key observation for multi-frequency simulation is that ray-trace information can be reused: rays launched from the same incident angle need to be traced only once. We cache the rays generated at higher frequencies and reuse them in the field calculation at lower frequencies. The price of this overall performance gain is the overhead of memory transactions and of oversampling, since rays sized for a higher frequency are denser than a lower frequency needs.

B. Compaction of Rays
As soon as the rays are traced, the frequency-independent intermediate information is stored in an array. Typically, some of the initially launched rays miss the object. This gives rise to thread divergence on the GPU, which hurts performance. To resolve this issue, the array of rays is pre-processed before it is used in any field calculation: the rays that intersect the target are gathered at the front of the array. This operation is known as compaction.

C. Partition of Rays
In most practical cases, the number of rays to be traced is enormous. Because GPU memory is limited, the storage of ray information requires careful management to avoid oversubscribing memory. One approach is to partition the rays into batches that fit into the available memory.
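The compaction and partitioning steps described above can be sketched on the CPU as follows. This is a minimal illustration in plain Python, not the GPU implementation of [1] or [6]: the `RayRecord` layout, the function names `compact` and `partition`, and the byte counts are all hypothetical, and the sketch drops missed rays instead of moving the hits to the front of the same array.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RayRecord:
    """Frequency-independent result of tracing one ray (hypothetical layout)."""
    ray_id: int
    hit: bool            # True if the ray intersected the target at least once
    path_length: float   # geometric path length in metres (meaningless on a miss)


def compact(rays: List[RayRecord]) -> List[RayRecord]:
    """Keep only the rays that hit the target, preserving launch order."""
    return [r for r in rays if r.hit]


def partition(rays: List[RayRecord], bytes_per_ray: int,
              budget_bytes: int) -> List[List[RayRecord]]:
    """Split rays into the fewest batches whose storage fits the memory budget."""
    capacity = budget_bytes // bytes_per_ray   # rays that fit in one batch
    if capacity < 1:
        raise ValueError("memory budget is smaller than one ray record")
    return [rays[i:i + capacity] for i in range(0, len(rays), capacity)]


# Toy trace result: every third ray misses the target.
traced = [RayRecord(i, i % 3 != 0, 1.0 + 0.1 * i) for i in range(10)]
hits = compact(traced)
# Assumed sizes: 32 bytes per record and a 100-byte budget, so 3 rays per batch.
batches = partition(hits, bytes_per_ray=32, budget_bytes=100)
print(len(hits), [len(b) for b in batches])   # prints: 6 [3, 3]
```

After compaction every batch contains only rays that contribute to the field, so the threads that process a batch would all follow the same code path. Filling each batch to capacity keeps the number of batches, and hence of kernel launches, as small as the budget allows.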
A suitable partitioning scheme should be chosen to minimize the number of resulting sub-grids, which reduces the number of kernel invocations and the corresponding memory transactions.

IV. SAR IMAGE PROCESSING
As stated in [7], high-resolution SAR images of targets with large cross-range extents cannot be obtained from narrow-angle data. In this work, the wide-bandwidth, large-angle SAR imaging procedure of [7] is applied. The look angle range is set to $\phi = [0, 2\pi]$. The monostatic SAR image is proportional to the following integral [7]:

$$\mathrm{SAR}(x, y) = \iint \hat{P} \cdot \mathbf{F}^s \, e^{-j2(k_x x + k_y y)} \, dk_x \, dk_y \qquad (1)$$

where $\mathbf{F}^s = r e^{jkr} \mathbf{E}^s / k$ and $\hat{P}$ is the polarization of the scattered far field $\mathbf{E}^s$. Eq. (1) indicates that the SAR image is proportional to the Fourier transform of the scattered far fields. The far-field data $\mathbf{E}^s$ are obtained on a rectangular grid in the frequency-aspect domain, which maps to a polar format in the $k_x$-$k_y$ plane. To perform the DFT, first-order interpolation is applied to resample the original data onto a uniform grid in the $k_x$-$k_y$ plane.

978-1-4673-7297-8/15/$31.00 © 2015 IEEE
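As a rough illustration of Eq. (1), the sketch below evaluates the integral by direct summation over polar-format samples, weighting each sample by the polar Jacobian $k\,dk\,d\phi$. This is a simpler substitute for the first-order interpolation and DFT procedure described above, not a reproduction of it. The wavenumber band, the full-circle look-angle range, the sample counts, and the ideal point scatterer (assumed to produce $F = e^{+j2(k_x x_0 + k_y y_0)}$ under the sign convention of Eq. (1)) are assumptions chosen only to keep the example small.

```python
import cmath
import math


def polar_samples(k_min, k_max, n_k, phi_min, phi_max, n_phi):
    """Midpoint samples (k, phi, weight), with weight = k * dk * dphi."""
    dk = (k_max - k_min) / n_k
    dphi = (phi_max - phi_min) / n_phi
    out = []
    for i in range(n_k):
        k = k_min + (i + 0.5) * dk
        for j in range(n_phi):
            phi = phi_min + (j + 0.5) * dphi
            out.append((k, phi, k * dk * dphi))
    return out


def sar_pixel(samples, field, x, y):
    """Direct-sum approximation of Eq. (1) at the image point (x, y)."""
    total = 0j
    for (k, phi, w), f in zip(samples, field):
        kx, ky = k * math.cos(phi), k * math.sin(phi)
        total += f * cmath.exp(-2j * (kx * x + ky * y)) * w
    return total


# Assumed scene: one ideal point scatterer at (x0, y0), in metres.
x0, y0 = 0.3, -0.2
# Assumed band of 20 to 30 rad/m and a full-circle look-angle range.
samples = polar_samples(20.0, 30.0, 16, 0.0, 2.0 * math.pi, 180)
field = [cmath.exp(2j * (k * math.cos(p) * x0 + k * math.sin(p) * y0))
         for k, p, _ in samples]

# Image grid indexed in tenths of a metre, from -0.4 m to 0.4 m on both axes.
image = {(ix, iy): abs(sar_pixel(samples, field, ix / 10, iy / 10))
         for ix in range(-4, 5) for iy in range(-4, 5)}
peak = max(image, key=image.get)
print(peak)   # prints: (3, -2), the grid point at the scatterer location
```

The direct sum costs one pass over all samples per pixel, which is why resampling onto a uniform $k_x$-$k_y$ grid and applying a fast transform is preferred for full-size images.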