POLITECNICO DI MILANO
Master of Science in Electronics Engineering Department of Electronics, Information and Bioengineering
Optimal Implementation of a
Recursive Least Squares Algorithm:
TDC case study
Supervisor:
Prof. Angelo GERACI
Master Thesis of:
Cumhur ERDİN
Number: 835526
Academic Year 2015-2016
FOREWORD
I would like to thank my supervisor, Professor Angelo Geraci, for his guidance and understanding. I would also like to thank Nicola Lusardi, the Digital Electronics Lab family, and my friends for their contributions.
Finally, my parents deserve special thanks for their continuous support.
TABLE OF CONTENTS
FOREWORD
TABLE OF CONTENTS
ABBREVIATIONS
SUMMARY
SOMMARIO
ÖZET
1. INTRODUCTION
2. PROGRAMMABLE LOGIC DEVICES
2.1. History of Programmable Logic
2.2. FPGA Architecture
2.3. Configuration
2.3.1. Languages
2.3.1.1. Verilog
2.3.1.2. VHDL
2.3.1.3. High Level Synthesis
2.4. Today’s FPGA
3. ALGORITHM OVERVIEW
3.1. Background for Least Squares Method
3.2. Least Squares Method
3.3. Recursive Least Squares Method
3.3.1. Linear Estimation
3.3.2. Second Order Polynomial Estimation
3.3.3. Gaussian Estimation
3.4. Creation of Histogram
3.5. RLS Algorithm Implementation
3.5.1. 1st Method
3.5.2. 2nd Method
3.5.3. 3rd Method
3.6. Histogram Values
3.7. C Interface
3.8. Applications of Algorithms
4. FPGA IMPLEMENTATION
4.1. PandA Tool
4.1.1. C Code Optimization for Bambu
4.1.1.1. Finite Precision Effects
4.1.1.1.1. Scale Factor Adjustment
4.1.1.1.2. Algorithm Modification with Scale Factor
4.1.1.2. Optimization of Matrices
5. EXPERIMENTAL VALIDATION
5.1. Time to Digital Converter (TDC)
5.1.1. Measurement
5.1.2. Characteristics of the Data
5.1.3. Gaussian Estimation of the Data
6. RESULTS
7. CONCLUSION AND FUTURE STUDIES
REFERENCES
INDEX OF FIGURES
ABBREVIATIONS
1-D : One-Dimensional
2-D : Two-Dimensional
ASIC : Application-specific Integrated Circuit
CLB : Configurable Logic Block
CLK : Clock
CPLD : Complex Programmable Logic Device
DSP : Digital Signal Processing
FA : Full Adder
FF : Flip-flop
FPGA : Field Programmable Gate Array
FSM : Finite State Machine
HDL : Hardware Description Language
HLS : High-level Synthesis
HW : Hardware
IC : Integrated Circuit
LS : Least Squares
LSE : Least Squares Estimate
LUT : Look-up Table
PAL : Programmable Array Logic
PLA : Programmable Logic Array
RLS : Recursive Least Squares
RTL : Register Transfer Level
SW : Software
TDC : Time to Digital Converter
TDL : Tapped Delay Line
TI : Time Interval
TIM : Time Interval Meter
VHDL : Very High Speed Integrated Circuit Hardware Description Language
SUMMARY
Nowadays, data analysis is widely used all around the world because it is needed, and its use is increasing day by day. Typically, different algorithms are used to analyse data, and they are becoming more crucial as new data sources appear every day. Data analysis helps to discover useful information, suggest conclusions, and support decision-making. It can be carried out in various ways, according to the conditions and demands of different fields such as science, business, and social science. In general, data analysis helps the researcher reach a conclusion after the collection of data.
Modelling the data and estimating the model parameters provides a simple and useful conclusion instead of an enormous data set. For the modelling, a priori information can be used to estimate a more accurate model. On the other hand, if the model is known in advance, it is not necessary to use a priori information: the model is applied directly to the input data, and the unknown parameters can be obtained from it.
In this thesis, a general-purpose estimation interface is designed for use with any kind of data, and it includes various estimation options. The user can easily choose one or more of the options depending on the application. This choice finds the unknown parameters of the chosen model, and the interface provides the user with graphical output. It is possible to analyse the final equation and to visualise the data set and the estimation according to the user’s choice.
For the estimation, different algorithms are designed in C to choose from:
o Least Squares Method (LS)
o Recursive Least Squares Method (RLS)
    Linear Estimation
    Second Order Polynomial Estimation
    Gaussian Estimation
As a case study, a Time to Digital Converter (TDC) is chosen, and the measurement output of the TDC is treated as a collection of data. This data set is processed by the designed C code. Real-time operation plays a crucial role, so the recursive method has been chosen. First, a histogram is built to create the data set that is then processed by the RLS algorithm coded in C. According to the obtained histogram, a Gaussian model is found to be the fitting curve, and the Gaussian RLS method is used to obtain the unknown parameters of the Gaussian equation. The same data are processed in MATLAB, and the results are compared with those of the C code. This comparison verified the correctness of the results obtained in C.
Bambu, a tool provided by Politecnico di Milano and running on Linux, is used to implement the designed algorithm on an FPGA. Bambu generates an HDL description from C code. The C code has been optimized to make it convertible to Verilog by Bambu, and a functional Verilog description has been obtained.
SOMMARIO
Nowadays, data analysis is widely used all around the world because it is needed, and its use increases day by day. Typically, different algorithms are employed to analyse data, and they are gaining importance with the daily growth of new data sources. Data analysis is a useful tool for discovering useful information, suggesting conclusions, and making decisions; it can be carried out in various ways, according to the conditions and demands of different research fields, for example science, business, and the humanities. In general, data analysis helps the researcher reach a conclusion after the collection of data.
Building a model of the data and estimating the model parameters provides a simple and effective conclusion instead of an enormous amount of data. In building the model, a priori information can be used to estimate a more accurate model. On the other hand, if the model is already known, the use of a priori information is not necessary: the model is applied directly to the input data, and the unknown parameters can be obtained from it.
In this thesis, a generic estimation interface applicable to any kind of data is designed, and it includes various estimation options. The user can easily choose one or more options, depending on the field of application. This choice finds the unknown parameters of the chosen model and provides the user with a graphical interface. It is possible to analyse the final equation and to visualise the data and the estimation according to the user’s choice.
To perform the estimation, it is possible to choose among different algorithms designed in C:
o Least Squares Method
o Recursive Least Squares Method
    Linear Estimation
    Second Order Polynomial Estimation
    Gaussian Estimation
As a case study, a Time to Digital Converter (TDC) was chosen, and the TDC measurement was processed as a collection of data. This data set was processed by the designed C code. Real-time processing plays a fundamental role, so the recursive method was chosen. First, a histogram was obtained to create the data set to be processed by the recursive least squares algorithm in C. Based on the obtained histogram, a Gaussian model was identified as the fitting curve, and the Gaussian recursive least squares method was used to obtain the unknown parameters of the Gaussian equation. The same data were processed in MATLAB, and the results were compared with those of the C code. This comparison verified the correctness of the results obtained in C.
To run the designed algorithm on an FPGA, Bambu for Linux, provided by Politecnico di Milano, was used. Bambu generates an HDL description from C code. The C code was optimized to make it convertible to Verilog using Bambu, and a functional Verilog description was obtained.
ÖZET
Nowadays, data analysis is widely used all around the world because it is needed, and this use increases day by day. Typically, different algorithms are used to analyse data, and thanks to the daily growth of data sources, the use of such algorithms is becoming ever more important. Data analysis helps to reach useful information, draw conclusions, and make decisions. It can be applied under different conditions and circumstances, for example in science, commerce, and the social sciences. In general, data analysis enables the researcher to reach a conclusion by using the collected data correctly.
Instead of using a large amount of data, modelling the data and obtaining the model parameters provides useful conclusions in a simple way. Previously obtained information can be used for more accurate modelling. On the other hand, if the model is known in advance, there is no need to use such prior information: when the estimated model is applied to the obtained data, the unknown model coefficients can be determined.
In this thesis, software that determines model parameters and can be used for different data types is designed, and it includes different model types. Depending on the application, the user can easily select one of the model types and obtain the model coefficients. With the coefficients found as a result of this selection, it is possible to see the obtained model equation. Moreover, thanks to the programmed library, the obtained model can also be analysed visually through a graphical interface.
The different algorithms listed below are programmed in C for the analysis. The user can select one of the options according to need.
o Least Squares Method
o Recursive Least Squares Method
    Linear (First Order) Equation Estimation
    Second Order Equation Estimation
    Gaussian Function Estimation
To validate the written algorithm, a Time to Digital Converter (TDC) project was selected. The data obtained from the TDC were processed by the designed C code. First, a histogram was built from the TDC data. Since real-time operation of the program plays an important role, the recursive method was selected as the algorithm. The histogram data were processed with the recursive least squares algorithm written in C. Based on the histogram graph, it was determined that the data follow a Gaussian function. The Gaussian function was chosen as the model, and the unknown coefficients of the Gaussian function were obtained with the algorithm. The same data were analysed using MATLAB, and the results were compared with those of the algorithm written in C. This comparison proved the correctness of the obtained results.
The designed algorithm was implemented on an FPGA using Bambu, provided by Politecnico di Milano and used on Linux. Bambu was used to generate a hardware description language (HDL) description from the C code. The designed C code was optimized to make it convertible to Verilog using Bambu, and functional Verilog code was obtained.
1. INTRODUCTION
One of today’s popular and crucial topics is data analysis. Here, the data are analysed on their own terms, without additional assumptions [1]. The main aim is the organization and summarization of the data in ways that bring out their main features and clarify their underlying structure. Data analysis helps to discover useful information, suggest conclusions, and support decision-making. It has multiple aspects and approaches, encompassing various techniques under a variety of names in different business, science, and social science domains. Data analysis is sometimes used as a synonym for data modelling.
Mathematical modelling problems are typically classified as black box or white box models, according to how much a priori information on the system is available. Black box models do not include any a priori information, whereas white box models include all the necessary information. Therefore, every real system can be placed somewhere between the black box and white box extremes.
For the modelling, a priori information can be used to estimate a more accurate model. On the other hand, if the model is known in advance, it is not necessary to use a priori information: the model is applied directly to the input data, and from it the unknown parameters can be obtained.
After describing the modelling, the next necessary step is estimating the model parameters. The basic problem is called a parameter identification problem. Parameter identification plays a crucial role in accurately describing system behaviour through mathematical models. As the name suggests, the main principle is to identify the best estimate of the values of one or more parameters in a regression. Different techniques and estimation models can be used to solve and describe this problem. In this thesis, some of the suitable methods are discussed and the results of the implemented algorithms are demonstrated.
In the first part of the thesis, general and necessary background information about programmable logic devices is presented. The second part explains the algorithms: it provides their theoretical derivations and explains the concepts needed to understand the algorithms and the subsequent processing. Furthermore, it covers the implementation of the algorithms with some examples. In Section 3.5, different methods are discussed, and in Section 3.6, the crucial design choices for obtaining a proper histogram are explained in detail. In the last two sections of Part 3, the new C interface is introduced and explained with examples; then, the applications of the algorithms are presented. In the fourth part, the PandA software is introduced for the Field Programmable Gate Array (FPGA) implementation. In that chapter, C code optimization is discussed, so as to produce code from which the HDL description for the FPGA implementation can be generated. In the last part, experimental validation is performed with the TDC data and the results are compared with MATLAB.
2. PROGRAMMABLE LOGIC DEVICES
2.1 History of Programmable Logic
The first basic programmable logic structures emerged in the late 1960s: in 1969, Motorola presented the XC157, a mask-programmed gate array with 12 gates and 30 uncommitted input/output pins [2][3]. One year later, Texas Instruments introduced a mask-programmable IC based on the IBM read-only associative memory. This device had 17 inputs and 18 outputs, with 8 JK flip-flops for memory, and it was programmed by changing the metal layer during the production of the IC. After further developments in technology, the first Programmable Array Logic (PAL) device was developed by Monolithic Memories Inc. (MMI) in 1976. The device was supported by a GE design environment in which Boolean equations were converted to mask patterns for configuring it. A closely related device is the programmable logic array (PLA). The PAL has some advantages over the PLA, such as easier manufacturing, lower cost, and better performance; however, it is not as flexible as the PLA, because its OR plane is fixed.
As the name suggests, Complex Programmable Logic Devices (CPLDs) were introduced to implement more sophisticated chips. In this structure, a CPLD consists of multiple circuit blocks on a single chip, together with internal wiring resources to connect the circuit blocks. These blocks are similar to a PLA or a PAL.
Nowadays, Field Programmable Gate Arrays (FPGAs) are in common use. These devices contain an array of programmable logic blocks, a gate-array-like structure with a hierarchical interconnect arrangement. Compared with the earlier structures, an FPGA includes look-up tables (LUTs), each of which works like a function generator by implementing a truth table. An FPGA is built from logic cells, and a simple logic block commonly includes a 4-input LUT, a Full Adder (FA), and a D-type flip-flop.
Figure 2.1: PLA structure. [4]
2.2 FPGA Architecture
A field programmable gate array (FPGA) is made of a two-dimensional array of generic logic cells and programmable switches [5]. The logic cells can implement simple functions, and the programmable switches can be programmed to provide interconnections among the logic cells. A specific design is obtained by programming each logic cell and choosing the connections between the programmable switches. After the circuit has been designed in a hardware description language and synthesis has been completed, the desired logic cell and switch configuration can be transferred to the FPGA, and with this step the design is complete. The FPGA takes its name from this process: because configuration can be done in the field, with no need for a fabrication facility, the device is said to be field programmable.
FPGAs contain look-up table (LUT) based logic cells, each generally comprising a small configurable combinational circuit together with a D-type flip-flop. The LUT structure is frequently used for implementing a configurable combinational circuit. A LUT can be described as a small memory cell: by changing the content of this memory, any n-input combinational function can be implemented. The number of inputs n depends on the FPGA.
Figure 2.2: Logic cell of FPGA [5].
Other blocks commonly embedded in FPGAs are macro cells. These macro cells include clock management circuits, memory blocks, I/O interface circuits, and combinational multipliers.
Figure 2.3: FPGA Internal Structure [5].
2.3 Configuration
2.3.1 Languages
2.3.1.1 Verilog
Verilog is a hardware description language used to describe digital systems [6]. It was invented between 1983 and 1984 by Phil Moorby and Prabhu Goel [7], and after 1985 it evolved into a hardware modelling language. The first Verilog simulator came into use in 1985, and the language developed over time. The structure of Verilog is similar to that of the C language, which is one reason digital system designers adopted it so widely. Like C, Verilog is case sensitive, and some of its control flow keywords behave in the same way. On the other hand, it has properties that differ from traditional programming languages: Verilog does not execute blocks sequentially, so it is called a dataflow language. A Verilog design can be created as a hierarchy of modules, which communicate with each other through a set of declared input, output, and bidirectional ports. The designer has to keep in mind that the blocks themselves are executed concurrently. If the Verilog code consists of synthesizable statements, the source code can be converted into logically equivalent hardware components and the connections between them.
2.3.1.2 VHDL
The Very High Speed Integrated Circuit Hardware Description Language (VHDL) is another hardware description language used to design and test digital circuits. VHDL arose out of a United States Department of Defense effort to describe the function and structure of integrated circuits. The language came into use in the 1980s and, as it developed, became an IEEE standard. VHDL is fundamentally a parallel programming language and, like Verilog, is organized as a hierarchy of modules. When VHDL is used for systems design, it allows designers to observe the behaviour of the designed system: the design can be described (modelled) and verified (simulated) before synthesis tools translate it into real hardware (gates and wires).
2.3.1.3 High Level Synthesis
At first, until the 1960s, ICs were designed, optimized, and laid out by hand [8][9]. In the early 1970s, gate-level simulation came into use, and by 1979 cycle-based simulation became available. Later, place-and-route, schematic circuit capture, formal verification, and static analysis were introduced. After the 1980s, Hardware Description Languages (HDLs) such as Verilog and VHDL came into wide use, and first-generation high-level synthesis (HLS) tools were introduced during the 1990s.
High-level synthesis connects the hardware and software domains; moreover, it improves productivity for hardware designers, who can work at a higher level of abstraction while creating high-performance hardware [11]. On the other hand, software designers can accelerate the computationally intensive parts of their algorithms on the FPGA. High-level synthesis design provides