Pieter Vlag ESSnet DWH: business register
Outline
• Central role of the statistical units and the population frame in a statistical DWH. The frame includes the number of enterprises, total turnover derived from Value Added Tax (VAT) data and total employment derived from social security data.
• How to deal with different units in different sources?
• Feedback of revised unit, population, turnover and employment data in the DWH to the original sources (SBR, VAT, social security data)
Definition of a statistical Datawarehouse
The broad definition of a data warehouse to be used in this ESSnet is therefore:
‘A common conceptual model for managing all available data of interest, enabling the NSI to (re)use this data to create new data/new outputs, to produce the necessary information and perform reporting and analysis, regardless of the data’s source.’
The statistical – DWH (1)
As the staging area is “core business” for NSIs, the term statistical DWH is used for staging area + warehouse
The statistical – DWH (2)
Necessity of population frame
Datasource I: Admin data
Datasource I: Survey 1
Datasource I: Survey 2
Datasource I: BIG DATA
• Different sources cover different enterprises -> information about which enterprises?
• Timing of availability of the sources differs -> when is a complete description available?
Statistical-DWH with a population frame
[Figure: Population; Datasource 1: admin data 1; Datasource 2: BIG DATA; Datasource 3: survey 1; Datasource 4: survey 2]
ADVANTAGE: the coverage of the DWH is known (e.g. which enterprises are included in the DWH)
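As an illustration of what "known coverage" can mean in practice, here is a minimal sketch that compares each data source with the population frame. The enterprise IDs, source names and function names are hypothetical and not taken from the ESSnet material.

```python
# Hypothetical population frame: enterprise IDs taken from the SBR for one period.
population_frame = {"E1", "E2", "E3", "E4", "E5"}

# Hypothetical sources, each covering only part of the population.
sources = {
    "admin data": {"E1", "E2", "E3"},
    "survey 1": {"E2", "E4"},
    "BIG DATA": {"E3", "E9"},  # E9 is not in the frame
}


def coverage_report(frame, sources):
    """Return, per source, the covered share of the frame and units outside it."""
    report = {}
    for name, units in sources.items():
        report[name] = {
            "share_of_frame": len(units & frame) / len(frame),
            "outside_frame": sorted(units - frame),
        }
    return report


def uncovered_units(frame, sources):
    """Enterprises in the frame for which no source holds any information."""
    covered = set().union(*sources.values())
    return sorted(frame - covered)


report = coverage_report(population_frame, sources)
missing = uncovered_units(population_frame, sources)
```

Without the frame, neither the share covered by a source nor the list of enterprises without any information could be computed.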
Units and target population
The population should be known for the preparation phase, the integration phase and the actual data warehouse:
• data warehouse; e.g. “about which enterprises do we have information?”
• its preparation phase; e.g. when linking data sources
Population aspects:
• Statistical unit (source: SBR)
• Number of enterprises (source: SBR)
• Turnover (source: VAT, via SBR?)
• Employment (source: social security, via SBR?)
Proposal I
Only the statistical unit (= enterprise) is used in the statistical DWH for
- data linking
- processing
Justification: most obvious, ESSnet on Consistency, maintenance
Ideal world versus reality
In the ideal world
• a single unique ID exists for all enterprises
• the definition of the enterprise corresponds with the statistical unit
In practice,
• several countries don’t have a unique ID
• different units exist (legal, tax, etc.)
Therefore…..
[Figure: unit diagram with columns INPUT, IN S-DWH processing, OUTPUT.
Unit labels: ENTERPRISE (= statistical unit), ENTERPRISE GROUP, legal unit, “accounting” unit, “VAT-unit”, “other tax” units, other units.
Further unit labels: enterprise, local unit, LKAU, KAU, enterprise group.]
Unit base
• Complexity of the unit base depends on
  - scope of the statistical DWH
  - national legislation (practices) with respect to enterprise units
• The unit base is closely related to the Business Register.
• If complex, the recommendation is to place this base outside the Business Register:
  - maintenance
  - more flexible in case of new inputs and outputs
  - more transparent in case of linking errors
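A unit base can be pictured as a lookup table from source units to the enterprise. The sketch below is a minimal illustration under that assumption. The IDs, the `(unit type, unit ID)` key layout and the function name are hypothetical, and real unit bases also have to handle time validity and many-to-many relations, which are left out here.

```python
from collections import defaultdict

# Hypothetical unit base: each source unit (VAT unit, legal unit, ...) points
# to the enterprise (= statistical unit) it belongs to.
unit_base = {
    ("VAT", "V100"): "E1",
    ("VAT", "V101"): "E1",  # two VAT units belonging to one enterprise
    ("VAT", "V200"): "E2",
    ("LEGAL", "L10"): "E1",
}

# Hypothetical VAT records: (VAT unit ID, turnover).
vat_records = [("V100", 120.0), ("V101", 30.0), ("V200", 75.0), ("V999", 10.0)]


def turnover_per_enterprise(records, unit_base, unit_type="VAT"):
    """Sum source-unit turnover per enterprise; collect records that cannot be linked."""
    totals = defaultdict(float)
    unlinked = []
    for unit_id, turnover in records:
        enterprise = unit_base.get((unit_type, unit_id))
        if enterprise is None:
            unlinked.append(unit_id)  # linking error, kept visible for inspection
        else:
            totals[enterprise] += turnover
    return dict(totals), unlinked


totals, unlinked = turnover_per_enterprise(vat_records, unit_base)
```

Keeping the mapping in one explicit table is what makes linking errors transparent: every record that cannot be translated to an enterprise ends up in `unlinked` instead of disappearing silently.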
Position of Business Register in stat-DWH
[Figure labels: SBR, pop-frame, VAT, empl.; survey, units, tax, BIG DATA, other; GSBPM 5.1: link & integrate; GSBPM 5.2-5.6: “process”; GSBPM 5.7-5.8: calculate aggregates; check processing; “DATAWAREHOUSE”; output 1, output 2, output 3]
SBR and statistical-DWH (1)
• SBR = source of units + population (number of enterprises)
• VAT = source of turnover
• Social security = source of employment
Population, turnover and employment, together and integrated, are the authentic source to which all other data are linked.
It is assumed that the authentic source is correct unless proven otherwise.
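The integration and linking idea can be sketched as follows. This is only an illustration: the record layout, the function names and the 10 % tolerance for flagging a conflict are assumptions and do not come from the ESSnet material.

```python
# Hypothetical inputs for one period, all keyed by enterprise ID.
sbr_population = {"E1": {"nace": "47"}, "E2": {"nace": "62"}, "E3": {"nace": "10"}}
vat_turnover = {"E1": 150.0, "E2": 75.0}
register_employment = {"E1": 12, "E3": 4}


def build_authentic_source(population, turnover, employment):
    """Integrate population, turnover and employment into one record per enterprise."""
    return {
        ent: {**attrs, "turnover": turnover.get(ent), "employment": employment.get(ent)}
        for ent, attrs in population.items()
    }


def link_survey(authentic, survey, tolerance=0.10):
    """Link survey records to the authentic source and flag deviating turnover.

    The authentic value is kept; the flag only marks the record for review.
    The tolerance is a hypothetical threshold.
    """
    linked = []
    for ent, reported in survey.items():
        if ent not in authentic:
            continue  # outside the population frame
        reference = authentic[ent]["turnover"]
        conflict = (
            reference is not None
            and abs(reported - reference) > tolerance * reference
        )
        linked.append({"enterprise": ent, "survey_turnover": reported,
                       "authentic_turnover": reference, "conflict": conflict})
    return linked


authentic = build_authentic_source(sbr_population, vat_turnover, register_employment)
linked = link_survey(authentic, {"E1": 155.0, "E2": 120.0, "E7": 40.0})
```

The survey value never overwrites the authentic value here. A conflict is only recorded, in line with the assumption that the authentic source is correct unless proven otherwise.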
SBR and statistical-DWH (2)
Does this mean that the SBR (and the VAT and employment registers) is part of the statistical DWH?
Not necessarily: a copy of the population characteristics for period t can be derived from the SBR and used in the statistical DWH.
“SBR outside the statistical-DWH” (~ 50 % preference of NSIs)
PROs: easier maintenance, no conflicts with surveys
CONs: feedback to SBR in case of adjustments
Alternatively, the SBR is integrated in the statistical DWH.
“SBR inside the statistical-DWH” (~ 50 % preference of NSIs)
PROs: no feedback to SBR
CONs: maintenance (especially with VAT + employment)
SBR and statistical-DWH (3)
Does this mean that totals of VAT-turnover and “register” employment are calculated within the SBR?
Not necessarily: especially for STS and specialised low-aggregate estimates,
• knowledge of (other sources of) the branch,
• thorough analyses,
• estimation techniques
may be desired. In this case a separate system for estimating
• VAT-turnover
• “register” employment
is advised. The decision is up to the NSIs.
Option of definition SBR in stat-DWH (2 extremes)
[Figure labels: SBR, pop-frame, VAT, empl.; survey, units, tax, BIG DATA, other; GSBPM 5.1: link & integrate; GSBPM 5.2-5.6: “process”; GSBPM 5.7-5.8: calculate aggregates; check processing; “DATAWAREHOUSE”; output 1, output 2, output 3]
Feedback to SBR
Only if “SBR outside”
In case of conflicting information between data sources and the authentic source (and indirectly the SBR), two questions arise:
• When to incorporate corrections in the statistical DWH? When sure of an influential error.
• When to incorporate corrections in the SBR? At certain time periods (end of year etc.)
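The two timings can be sketched as a small queue: a confirmed influential correction is applied in the statistical DWH at once, while the feedback to the SBR waits for a scheduled moment. The class and field names and the `influential` flag are hypothetical, and how an error is judged influential is left outside this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Correction:
    enterprise: str
    variable: str
    new_value: float
    influential: bool  # hypothetical flag: error is confirmed and matters for output


@dataclass
class FeedbackManager:
    dwh: dict  # statistical DWH copy: {enterprise: {variable: value}}
    sbr_queue: list = field(default_factory=list)

    def submit(self, correction):
        """Apply a confirmed influential correction in the DWH and queue it for the SBR."""
        if not correction.influential:
            return False  # authentic source assumed correct unless proven otherwise
        self.dwh[correction.enterprise][correction.variable] = correction.new_value
        self.sbr_queue.append(correction)
        return True

    def flush_to_sbr(self, sbr):
        """Send queued corrections to the SBR, e.g. at the end of the year."""
        for c in self.sbr_queue:
            sbr[c.enterprise][c.variable] = c.new_value
        sent = len(self.sbr_queue)
        self.sbr_queue.clear()
        return sent


sbr = {"E1": {"turnover": 150.0}, "E2": {"turnover": 75.0}}
dwh_copy = {ent: dict(values) for ent, values in sbr.items()}
manager = FeedbackManager(dwh=dwh_copy)
manager.submit(Correction("E2", "turnover", 120.0, influential=True))
manager.submit(Correction("E1", "turnover", 151.0, influential=False))
sbr_before_flush = sbr["E2"]["turnover"]
sent = manager.flush_to_sbr(sbr)
```

Between `submit` and `flush_to_sbr` the DWH copy and the SBR deliberately differ, which is the situation that only arises in the “SBR outside” option.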
Last slide
Thank you for your attention.
Any questions or comments?