Cristina Casciano, Viviana De Giorgi, Filippo Oropallo Istat Division for Structural Business Statistics, Agriculture, Foreign Trade and Consumer Prices First meeting ESSnet on Data Integration Rome 28 – 29 January 2010 The use of administrative and accounts data for business statistics (ESSnet AdminData)
Cristina Casciano, Viviana De Giorgi, Filippo Oropallo
Istat Division for Structural Business Statistics, Agriculture, Foreign Trade and Consumer Prices
First meeting ESSnet on Data Integration
Rome 28 – 29 January 2010
The use of administrative and accounts data for business statistics (ESSnet AdminData)
Background
ISTAT is about to start a major research project aimed at supporting the transition of Italian SBS statistics from a data collection system based extensively on direct reporting (involving around 120,000 companies) to a new survey system largely based on administrative data sources
A more intensive use of administrative data for compiling SBS statistics requires a careful evaluation of the most relevant administrative data sources for each type of business population (large companies, small and medium-sized companies, micro-businesses), and a careful assessment of the impact of potential sources of bias in those data
Administrative data sources are currently used in compiling SBS statistics to produce preliminary estimates, as requested by the SBS Regulation. They are also used as a complementary source of business data alongside direct reporting to produce definitive SBS data, and for the construction of the Italian Business Register (Asia), including business demography, and of the Oros database (STS on employment, wages and salaries, and social contributions)
First meeting ESSnet on Data Integration
Rome, January 29, 2010
WP5 - Plan of activities for the period 2010-2013
Assessing the relevance of the data matching and definition inconsistency problems that arise when combining multiple administrative data sources (balance sheet data, VAT data, Fiscal Authority surveys) with survey data
Designing and testing a methodological approach (and drafting related recommendations) to deal with inconsistencies in variables and data matching problems arising from the use of fiscal data in the estimation of SBS data for the smaller size classes (small and micro-businesses)
Designing and testing a methodological approach (and drafting related recommendations) to statistically correct inconsistencies in variables and data matching problems arising from the use of balance sheet data in the estimation of SBS data for the larger size classes (medium and large businesses)
Designing and testing a methodological approach (and drafting related recommendations) for using complementary fiscal and administrative data sources to estimate specific variables and segments of the SBS target population not covered by the data sources mentioned above
Preparing the final reports and a comparison with other EU countries
Italian Case – Proposed Actions (First Year)
- Integration issues:
  - Matching issues: detecting different subsets of the population (micro-small / medium-large; unincorporated / incorporated)
  - Metadata issues: comparison between SBS definitions and fiscal data
- Review of administrative sources useful for producing Structural Business Statistics (SBS EU Regulations 58/97, 410/98, 2700/98, 2056/02, 1670/03, 295/2008)
- First step in the reconstruction of the main economic variables for small firms using fiscal sources
Integration issues (1)
Advantages:
- Reduced statistical burden
- Reduced bias in estimates (due to total missing response, TMR)
- Reduced costs
- More timely estimates
Drawbacks:
- Confidentiality problems related to access to administrative data
- Administrative data are customarily collected for different purposes
- No control over the data production process at the origin (to check missing values, outliers, etc.); cooperation with the agencies that provide the data should be considered
- The data may refer to legal units rather than statistical units
Integration issues (2)
Matching different data sources (statistical/administrative) means tackling a host of issues, e.g.:
- Identifying business units, i.e. finding an identifying variable that serves as a unique key and a natural join between the different sources. In almost all firm databases we choose the fiscal code (available from Asia)
- Dealing with matching problems, i.e. cases where a key variable is unavailable or insufficient to identify the statistical unit, where mismatches occur, or where the sources do not contain the same units
- Identifying changes in business units:
  - Changes involving a single unit (changes in kind-of-business classification, legal form or location)
  - Changes in the number of units (deaths, births, break-ups and split-offs, mergers and acquisitions)
- Addressing sampling problems, when merging survey data with exhaustive data from a subset of the population
- Reconciling definitions and values among sources, whenever a variable does not have the same definition or value across different sources
- Handling data editing and data reconstruction issues: measurement errors, missing data, outliers, etc.
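As a rough illustration of the first two issues (a key-based join and the units left unmatched), here is a minimal sketch in plain Python. The record layout, the field names `fiscal_code`, `turnover_survey` and `turnover_admin`, and the toy values are all hypothetical; the slides only state that the fiscal code is used as the key.

```python
def match_by_fiscal_code(survey, admin):
    """Join two lists of records on 'fiscal_code' and report unmatched units.

    Returns (matched records, codes found only in the survey,
    codes found only in the administrative source).
    """
    # Index the administrative records by the key variable.
    admin_index = {rec["fiscal_code"]: rec for rec in admin}

    matched, survey_only = [], []
    for rec in survey:
        code = rec["fiscal_code"]
        if code in admin_index:
            # Merge the two records into one integrated record.
            matched.append({**rec, **admin_index[code]})
        else:
            survey_only.append(code)

    survey_codes = {rec["fiscal_code"] for rec in survey}
    admin_only = sorted(c for c in admin_index if c not in survey_codes)
    return matched, survey_only, admin_only


# Hypothetical toy data: one unit in both sources, one in each source only.
survey = [
    {"fiscal_code": "A1", "turnover_survey": 100.0},
    {"fiscal_code": "B2", "turnover_survey": 40.0},
]
admin = [
    {"fiscal_code": "A1", "turnover_admin": 95.0},
    {"fiscal_code": "C3", "turnover_admin": 70.0},
]

matched, survey_only, admin_only = match_by_fiscal_code(survey, admin)
print(len(matched), survey_only, admin_only)
```

The two "only" lists are where the other issues on this slide show up in practice: units that changed identity, units outside the sampling frame, or keys that are missing or wrong.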
Review of sources
1) Fiscal Agency
• Fiscal Survey: aimed at enhancing fiscal compliance; does not cover all firms
• Tax return data: Unico (personal tax) and 770 (withholding tax on employees and temporary workers); more information for micro-firms with simplified bookkeeping, less information for other firms
• VAT data: changes in legal units and turnover data
2) Chambers of Commerce
• Balance sheet data: all corporate firms; better coherence with SBS variables
3) Social Security Institute
• Data from the enterprises' monthly declarations on employees: all firms with at least one employee in at least one month of the year; number of employees, typology, wages and salaries, social contributions
First step in the reconstruction of the main economic variables for SME by using Administrative data (1)
Purpose of administrative sources: to support the Tax Administration's control action on small and medium firms
Population coverage: sole proprietorships, partnerships and corporate firms; turnover greater than €30,000 and less than €7.5 million; roughly 4 million records
Variables: more balance-sheet-comparable variables (turnover, value of production, intermediate costs, value added, personnel costs, gross and net operating surplus); different definitions of accounting variables (e.g. for freelancers)
First step in the reconstruction of the main economic variables for SME by using Administrative data (2)
Representative sample of small and medium firms (total 93k, respondents 44k)
Corporate firms (coverage of financial statements: 31k)
Coverage of the Fiscal Authority Survey (Fas: 63k)
Coverage Corp + Fas − (Fas ∩ Corp): 76k
A + B: uncovered (93k − 76k = 17k), but a partial reconstruction is possible through tax return data
Classification of tax return data by type: CM (Minimum), RE (Freelancers), RG (Simplified), RF (Ordinary), RS (Companies)
[Figure: Coverage analysis by legal type and size class. Axes: size (small, medium) by legal type (sole proprietorships, partnerships, corporate firms). Areas shown: Fiscal Authority Survey; financial statements; tax return data (RS, RF forms); tax return data (RE and RG forms); minimum taxpayers; areas labelled A and B.]
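The rounded coverage figures quoted on this slide are linked by inclusion-exclusion. As a worked check (the overlap of about 18k is not stated on the slide; it is only what the other three rounded figures imply):

$$
|\mathrm{Corp} \cup \mathrm{Fas}| = |\mathrm{Corp}| + |\mathrm{Fas}| - |\mathrm{Fas} \cap \mathrm{Corp}|
\quad\Longrightarrow\quad
76\text{k} = 31\text{k} + 63\text{k} - |\mathrm{Fas} \cap \mathrm{Corp}|
\quad\Longrightarrow\quad
|\mathrm{Fas} \cap \mathrm{Corp}| \approx 18\text{k}
$$

$$
\text{Uncovered} = 93\text{k} - 76\text{k} = 17\text{k}
$$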
First step in the reconstruction of the main economic variables for SME by using Administrative data (3)
List of harmonized variables from various sources, defined according to the SBS Regulation and international accounting standards
Description | SME Survey | Fiscal (a) | Financial Statement | Tax return (b)
Income from sales and services | fatt_tot_pmi | fatt_tot_sdsx | fatt_tot_bil | fatt_tot_uyyzz
Changes in stock of finished and semi-fin. products | var_rpfpcl_pmi | - | var_rpfpcl_bil | -
Changes in stock | var_riman_pmi | var_riman_sdsx | var_riman_bil | var_riman_uyyzz
Changes in contract work in progress | var_lavco_pmi | var_lavco_sdsx | var_lavco_bil | var_lavco_uyyzz
Changes in internal work capitalized | inc_immli_pmi | inc_immli_sdsx | inc_immli_bil | -
Other income and earnings | ric_altri_pmi | ric_altri_sdsx | ric_altri_bil | ric_altri_uyyzz
Purchases | acq_beni_pmi | acq_beni_sdsx | acq_beni_bil | acq_beni_uyyzz
Purchases of goods and services | acq_bese_pmi | acq_bese_sdsx | acq_bese_bil | acq_bese_uyyzz
Goods for resale | CRS353 (source column not recoverable from the transcript)
Services (total) | acq_ser_pmi | acq_ser_sdsx | acq_ser_bil | acq_ser_uyyzz
Use of third party assets | acq_gdbt_pmi | acq_gdbt_sdsx | acq_gdbt_bil | acq_gdbt_uyyzz
Value adjustments | acq_amm_pmi | acq_amm_sdsx | acq_amm_bil | acq_amm_uyyzz
Changes in stocks of raw mat. and for resale | var_rmpriv_pmi | - | var_rmpriv_bil | -
Changes in stock for resale | - | - | - | var_rriv_uscrs
Fund allocations | acq_acc_pmi | acq_acc_sdsx | acq_acc_bil | acq_acc_uyyzz
Other operating charges | acq_oneri_pmi | acq_oneri_sdsx | acq_oneri_bil | acq_oneri_uyyzz
Value added | vagg_pmi | vagg_sdsx | vagg_bil | vagg_uyyzz
Wages and salaries | ret_pmi | - | ret_bil | -
Social security contributions | onerisoc_pmi | - | onerisoc_bil | -
Share of leaving indemnity | tfr_pmi | - | tfr_bil | -
Personnel costs | clav_pmi | clav_sdsx | clav_bil | clav2_uyyzz
Other personnel costs | altrclav_pmi | - | altrclav_bil | -
Gross operating surplus | margope_pmi | margope_sdsx | margope_bil | margope_uyyzz
(a) Sector study form: x = enterprise, freelance
(b) Tax return form: yy = sole proprietorship, partnership, limited company; zz = freelance, simplified, company
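The naming convention in the table (a common base name plus a source suffix) makes it straightforward to look up one harmonized variable across sources. The sketch below is only an illustration: the helper `first_available`, the record layout and the order in which sources are tried are assumptions, not something the slides specify.

```python
def first_available(record, base, suffixes):
    """Return (value, suffix) for the first source holding a non-missing value.

    'suffixes' is tried in the order given; that order is a hypothetical
    priority chosen by the caller, not one stated in the slides.
    """
    for suffix in suffixes:
        value = record.get(f"{base}_{suffix}")
        if value is not None:
            return value, suffix
    return None, None


# Hypothetical unit: turnover missing in the survey (pmi), present in the
# financial statements (bil).
unit = {"fatt_tot_pmi": None, "fatt_tot_bil": 1200.0, "vagg_bil": 300.0}

turnover, turnover_source = first_available(unit, "fatt_tot", ["pmi", "bil"])
value_added, value_added_source = first_available(unit, "vagg", ["pmi", "bil"])
print(turnover, turnover_source, value_added, value_added_source)
```

Variables that exist in only some sources (for example `ret`, available only with the `pmi` and `bil` suffixes in the table) simply return `(None, None)` when none of the requested suffixes is present.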
First step in the reconstruction of the main economic variables for SME by using a Fiscal archive (4)
Coverage of the initial sample of the SME survey by type of response and administrative data (initial theoretical sample)
Source | Non respondents | Respondents | Total
Financial Statements | 10,370 | 19,739 | 30,109
Fiscal Authority Survey (F) | 24,655 | 17,798 | 42,453
Fiscal Authority Survey (G) | 1,343 | 1,223 | 2,566
Tax Return data - PF-RG | 2,312 | 990 | 3,302
Tax Return data - PF-RE | 747 | 483 | 1,230
Tax Return data - SP-RG | 810 | 378 | 1,188
Tax Return data - SC-RS | 4,546 | 1,839 | 6,385
From survey only | - | 1,251 | 1,251
Total | 44,783 | 43,701 | 88,484
Out of coverage and list errors | - | - | 10,218
No sources | - | - | 4,337
Total sample units | - | - | 103,039
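The rows of the coverage table are mutually exclusive, so the totals can be re-derived from the individual cells. A short check in Python, using only the figures from the table (the variable names are illustrative):

```python
# (non respondents, respondents) per source, as reported in the table.
coverage = {
    "Financial Statements": (10370, 19739),
    "Fiscal Authority Survey (F)": (24655, 17798),
    "Fiscal Authority Survey (G)": (1343, 1223),
    "Tax Return data - PF-RG": (2312, 990),
    "Tax Return data - PF-RE": (747, 483),
    "Tax Return data - SP-RG": (810, 378),
    "Tax Return data - SC-RS": (4546, 1839),
    "From survey only": (0, 1251),
}
out_of_coverage = 10218  # out of coverage and list errors
no_sources = 4337        # units with neither survey nor administrative data

non_respondents = sum(n for n, _ in coverage.values())
respondents = sum(r for _, r in coverage.values())
covered = non_respondents + respondents
total_sample = covered + out_of_coverage + no_sources

print(non_respondents, respondents, covered, total_sample)
# 44783 43701 88484 103039
```

The four printed values reproduce the "Total" row and the "Total sample units" figure of the table.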
First step in the reconstruction of the main economic variables for SME by using a Fiscal archive (5)
Integration scheme
[Figure: integration scheme showing the information content (SBS variables) of the Small-Medium Enterprises Survey and of the administrative sources.]
Respondents (47%): survey variables, with partial missing responses to impute; Financial Statements (19,739), Fiscal Authority Survey (19,021), Tax Return data (3,690); 1,251 units from survey only
Non-respondents (53%): Financial Statements (10,370), Fiscal Authority Survey (25,998), Tax Return data (8,415); Total Missing Response, not covered (4,337) (5%)
Current imputation (6,371)
Out of coverage (>100 workers or not active) (10,218)
Estimation of the source substitution effect on S1 (43,701 units): $SE = \sum_{S_1} (Y' - Y)\, w$
Estimation of the difference in final estimation: $DF = \sum Y' w' - \sum Y w$
Evaluation of the non-response bias effect on S2 (88,484 units): $BE = \sum_{S_2} Y' (w' - w)$
First step in the reconstruction of the main economic variables for SME by using a Fiscal archive (6)
Calibration and bias estimation
Final estimates on the subset of respondents (S1):
$$\tilde{Y} = \sum_{k \in S_1} y_k w_k$$
Final estimates on the integrated sample (S2):
$$\tilde{Y}^{*} = \sum_{k \in S_2} y_k^{*} w_k^{*}$$
The difference in the final estimation is equal to
$$DIFF = \tilde{Y}^{*} - \tilde{Y} = \sum_{k \in S_2} y_k^{*} w_k^{*} - \sum_{k \in S_1} y_k w_k$$
If we add and subtract $\sum_{k \in S_1} y_k^{*} w_k = \sum_{k \in S_2} y_k^{*} w_k$, where $w_k$ is zero for all units of S2 not included in S1, we obtain:
$$DIFF = \sum_{k \in S_2} y_k^{*} w_k^{*} - \sum_{k \in S_2} y_k^{*} w_k + \sum_{k \in S_1} y_k^{*} w_k - \sum_{k \in S_1} y_k w_k$$
In this way we can distinguish, in the final estimated difference, two possible biases due to:
- The source substitution effect for S1 = $\sum_{k \in S_1} (y_k^{*} - y_k)\, w_k$
- The difference originating from the calibration procedure for S2 = $\sum_{k \in S_2} y_k^{*} (w_k^{*} - w_k)$
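The decomposition of the difference into a source substitution effect and a calibration effect can be checked numerically. The sketch below uses made-up values for three units (two respondents in S1, plus one unit that is only in S2); the function name `decompose_diff` and the data layout are assumptions for illustration, and the code relies on S1 being a subset of S2.

```python
def decompose_diff(s1, s2):
    """Split the difference between two weighted totals into two effects.

    s1: {unit: (y, w)}            survey values and weights, respondents only
    s2: {unit: (y_star, w_star)}  integrated values and weights, with S1 in S2
    Returns (diff, substitution_effect, calibration_effect).
    """
    total_s1 = sum(y * w for y, w in s1.values())
    total_s2 = sum(y_star * w_star for y_star, w_star in s2.values())
    diff = total_s2 - total_s1

    # Source substitution effect: change in values, at the S1 weights.
    substitution = sum((s2[k][0] - y) * w for k, (y, w) in s1.items())

    # Calibration effect: change in weights, at the integrated values.
    # The S1 weight is taken as zero for units of S2 not included in S1.
    calibration = sum(
        y_star * (w_star - (s1[k][1] if k in s1 else 0.0))
        for k, (y_star, w_star) in s2.items()
    )
    return diff, substitution, calibration


# Hypothetical toy data.
s1 = {"a": (100.0, 2.0), "b": (50.0, 3.0)}
s2 = {"a": (110.0, 1.5), "b": (50.0, 2.0), "c": (80.0, 1.5)}

diff, substitution, calibration = decompose_diff(s1, s2)
print(diff, substitution, calibration)  # 35.0 20.0 15.0
```

With these numbers the total moves from 350 to 385; 20 of the 35-point difference comes from replacing survey values with integrated values, and 15 from the change in weights.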
First step in the reconstruction of the main economic variables for SME by using a Fiscal archive (7)