Project coordinated by Prof. Dr. Gilbert Saporta, C.N.A.M. Paris
DATA ANALYSIS PROJECT

May 21, 2017


INTRODUCTION

This project aims to show the value and effectiveness of descriptive statistical methods for analysing large data tables.

The chosen example concerns the population of […] countries, which are analysed according to the following variables:

• demographic: population in […], the estimate for […], birth and death rates, total fertility rate, the number of people under […] years of age and of those aged […] or over, and life expectancy. These data are compiled by the Population Reference Bureau in Washington.
• geographic: area, continent.
• economic: Gross National Product per capita for […], in dollars, taken from World Bank publications; Gross Domestic Product per capita for […], in dollars, at the prices and purchasing power parity of […], supplied by the Centre for Prospective Studies and International Information.
• sociological: the majority religion in the country.

Given the quantitative nature of the variables, the first part carries out a Principal Component Analysis. The goal is to represent the different countries in a two-dimensional space. We are then led to examine the principal axes created in this way, as well as the quality of representation of the variables and of the individuals in this two-dimensional space.

We also try to explain how the population projection fits into this space.

In the second part, a classification yields the most coherent partition of this set of countries.


PRINCIPAL COMPONENT ANALYSIS

Principal Component Analysis is a descriptive method for quantitative data.

1.1. Description of the data

1.1.1. The variables

Two types of variables are distinguished:
• active quantitative variables, which determine the principal axes;
• illustrative variables, which by their (qualitative) nature cannot take part in building the principal axes.

The illustrative variables can nevertheless be represented, in a second step, in a correlation circle. We also retain one quantitative variable as illustrative: the population projection. In fact, we try to explain this variable in terms of the active variables.

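In our example we retained:

11 active variables
3. SUPERFIC, area (continuous)
4. POPULATI, population (continuous)
5. NATALITE, birth rate (continuous)
6. MORTALIT, death rate (continuous)
8. INFANTIL, infant mortality (continuous)
9. FECONDIT, fertility (continuous)
10. IN[…], share of people under […] years (continuous)
11. SU[…], share of people over […] years (continuous)
12. ESPERANC, life expectancy (continuous)
13. PNB, Gross National Product per capita (continuous)
14. PIB, Gross Domestic Product per capita (continuous)

3 illustrative variables
1. RELIGION ([…] categories)
2. CONTINEN ([…] categories)
7. PROJECTI, population projection (continuous)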

1.1.2. The individuals

The statistical individuals are the countries. […] countries were retained for the analysis, with a uniform weight equal to […]. France does not take part in this analysis and is treated as a supplementary individual.

1.1.3. Univariate statistics

This table shows that the variables have very different orders of magnitude.

The analysis of the demographic data with star diagrams, an extract of which can be found on page […], already brings out the following features. On the one hand, there are countries with a high life expectancy (triangle shape), such as Canada, Germany and Italy. On the other hand, there are countries whose birth rate, infant mortality and share of people under […] years are very high relative to the mean (star-shaped diagrams), such as Algeria, Saudi Arabia and Ivory Coast.

Another plot (a box-and-whisker plot) ranks the countries by population density (population divided by area). One country does not appear and does not take part in the calculation because of its exceptional character: Singapore.

Q1 = […] (first quartile)
Q2 = 0.0490 (median)
Q3 = […] (third quartile)
E = Q3 + 1.5 × (Q3 − Q1) = 0.2676 (upper fence)

This plot shows the countries whose density is very high relative to the norm: South Korea, the Netherlands, Japan and Belgium.
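The fence rule behind such a box plot can be illustrated with a minimal sketch. It assumes the usual convention (upper fence = Q3 + 1.5 times the interquartile range); the function names and the density values are hypothetical and are not the project's data.

```python
import statistics

def upper_fence(q1, q3, k=1.5):
    """Upper whisker limit of a box plot: Q3 + k * (Q3 - Q1)."""
    return q3 + k * (q3 - q1)

def high_outliers(values):
    """Return the labels whose value lies above the upper fence."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values.values(), n=4)
    fence = upper_fence(q1, q3)
    return sorted(name for name, v in values.items() if v > fence)

# Hypothetical densities (population / area), NOT the values from the study.
toy_density = {"A": 0.01, "B": 0.02, "C": 0.03, "D": 0.05,
               "E": 0.06, "F": 0.08, "G": 0.10, "H": 0.90}
print(high_outliers(toy_density))  # ['H']
```

Countries flagged this way are the ones that would be drawn beyond the whisker, as the four listed above are.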


1.2. Principal Component Analysis

The Principal Component Analysis is carried out on the table of transformed data. Because the variables have very different orders of magnitude, the metric of inverse variances is used, which amounts to performing the Principal Component Analysis on centred and scaled (standardized) data.

In addition, all observations carry an identical weight, equal to […].

1.2.1. Inertia and the search for the principal axes

The initial inertia of the point cloud is the weighted sum of the squared distances of the individuals to the centre of gravity. It can be shown that when the data are standardized, this inertia equals the number of variables, that is, 11.
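The claim that the inertia of standardized data equals the number of variables can be checked on a small example. This is a sketch under stated assumptions: uniform weights of 1/n, standard deviations computed with a divisor of n, and a made-up table with 3 variables instead of 11.

```python
import math

def standardize(table):
    """Centre and scale each column (population std, i.e. divisor n)."""
    n = len(table)
    p = len(table[0])
    means = [sum(row[j] for row in table) / n for j in range(p)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in table) / n)
            for j in range(p)]
    return [[(row[j] - means[j]) / stds[j] for j in range(p)] for row in table]

def total_inertia(z):
    """Weighted sum of squared distances to the centre of gravity.

    Assumes uniform weights 1/n; after centring, the centre of gravity
    is the origin.
    """
    n = len(z)
    return sum(sum(x * x for x in row) for row in z) / n

# Hypothetical table: 5 individuals, 3 variables on very different scales.
toy = [[1.0, 200.0, 0.001],
       [2.0, 180.0, 0.004],
       [4.0, 950.0, 0.002],
       [7.0, 400.0, 0.009],
       [9.0, 120.0, 0.003]]
z = standardize(toy)
print(round(total_inertia(z), 10))  # 3.0, the number of variables
```

Each standardized column has variance 1, so every variable adds exactly 1 to the inertia regardless of its original unit.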

PCA consists in determining the axes (called principal axes) that maximize the inertia of the projected point cloud. This maximization requires finding the eigenvalues of the matrix VM, where V is the variance-covariance matrix and M is the matrix of the metric used.
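The link between the maximization and the eigenvalues of VM can be written out as follows. This is the standard textbook derivation, given as a sketch; the symbols X (centred data table), D (diagonal matrix of weights), u (a unit axis), λ and s_j (standard deviation of variable j) are notation introduced here for illustration.

```latex
\begin{aligned}
V &= X^{\top} D X, \qquad I_u = u^{\top} M V M u \quad \text{with } u^{\top} M u = 1,\\
\max_u I_u &\;\Longrightarrow\; M V M u = \lambda M u \;\Longleftrightarrow\; V M u = \lambda u, \qquad I_u = \lambda,\\
\sum_{k=1}^{p} \lambda_k &= \operatorname{tr}(VM) = \sum_{j=1}^{p} \frac{s_j^2}{s_j^2} = p
\quad \text{when } M = \operatorname{diag}\!\left(1/s_1^2,\dots,1/s_p^2\right).
\end{aligned}
```

With p = 11 active variables, the eigenvalues must therefore sum to 11, whatever the data.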

Check on the numerical precision of the calculations:
Trace before diagonalization: 11.00
Sum of the eigenvalues: 11.00

[Figure: histogram of the first 11 eigenvalues]

The first two eigenvalues determine principal axes 1 and 2. These two axes form the first principal plane. The first principal axis accounts for […]% of the inertia and the second for 13.80%.

The first principal plane therefore explains […]% of the total inertia of the point cloud.

The histogram of the eigenvalues shows a break after the second eigenvalue. This criterion determines the number of axes to interpret.
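How eigenvalues turn into percentages of inertia can be sketched in a few lines. The example below uses power iteration with deflation, which is one simple way to get the leading eigenvalues of a symmetric matrix and is not necessarily what the software behind this study used; the 3 x 3 correlation matrix is made up.

```python
import math

def mat_vec(a, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def top_eigenpairs(sym, k, iters=500):
    """Leading k eigenpairs of a symmetric positive semi-definite matrix,
    by power iteration followed by deflation."""
    n = len(sym)
    a = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        # Arbitrary, slightly asymmetric starting vector.
        v = [1.0 / math.sqrt(n) + 0.01 * i for i in range(n)]
        for _ in range(iters):
            w = mat_vec(a, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(x * y for x, y in zip(v, mat_vec(a, v)))  # Rayleigh quotient
        pairs.append((lam, v))
        # Deflation: remove the component just found.
        for i in range(n):
            for j in range(n):
                a[i][j] -= lam * v[i] * v[j]
    return pairs

# Hypothetical correlation matrix; its eigenvalues are 1.8, 1.0 and 0.2.
toy_corr = [[1.0, 0.8, 0.0],
            [0.8, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
pairs = top_eigenpairs(toy_corr, 2)
trace = sum(toy_corr[i][i] for i in range(3))
shares = [100.0 * lam / trace for lam, _ in pairs]
print([round(s, 2) for s in shares])  # [60.0, 33.33]
```

The share explained by the first plane is simply the sum of the first two percentages.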

1.2.2. Representation of the variables

To interpret the axes, the variables must be represented in the first principal plane. To do this, the correlations between the principal components and the variables are computed. These correlations define the correlation circle.
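For a PCA on standardized data, these correlations have a closed form. The relations below are the standard ones, stated as a sketch; the symbols z_j (standardized variable j), c_k (principal component k), u_k (unit eigenvector of the correlation matrix) and λ_k are notation introduced here.

```latex
\begin{aligned}
r(z_j, c_k) &= \sqrt{\lambda_k}\; u_{jk}, \\
\sum_{k=1}^{p} r(z_j, c_k)^2 &= 1, \\
\text{quality of variable } j \text{ in plane 1-2} &= r(z_j, c_1)^2 + r(z_j, c_2)^2 \;\le\; 1 .
\end{aligned}
```

A variable plotted at the point (r(z_j, c_1), r(z_j, c_2)) therefore always falls inside the unit circle, and it lies close to the circle exactly when the plane represents it well.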

Quality of representation of the variables

This table, through the squared correlations, makes it possible to interpret the axes by examining the quality of representation of the variables.

On axis 1, the best-represented variables are the birth rate, infant mortality, fertility and the share of people under […] years. These variables are opposed to the share of people over […] years, life expectancy and GNP, which are less well represented.

Axis 1 can therefore be interpreted as opposing the young countries to the old countries.

On axis 2, the best-represented variables are population and area.

It is therefore a "size" axis, which opposes the large, populous countries to the others.

Overall, the best-represented variables in plane 1-2 are the birth rate and the share of people under […] years: on the plot, they lie closest to the correlation circle.

The illustrative variable Projection (the population projection) is very strongly correlated with axis 2 and very weakly with axis 1. It is only weakly correlated with the demographic variables.

1.2.3. Representation of the individuals

The coordinates needed to represent the individuals in the first principal plane are given in appendix […], together with the elements needed for interpretation: the contributions of the individuals to the first principal plane and the squared cosines.

The axis-by-axis contributions measure the importance of each individual in the construction of the axes. It is undesirable for an individual to have an excessive contribution, since this would be a source of instability: if that individual were dropped from the PCA, the recomputed axes could have a different meaning. On the plot, individuals with a strong contribution lie at the "borders" of the representation, because the contribution of individual i to axis j is the ratio of the squared coordinate of individual i on that axis to the inertia carried by axis j, up to a factor equal to the weight of the individual.

Taking as decision criterion a contribution at least twice the individual's weight, the individuals with a very large contribution to axis 1 are:

• Ivory Coast, Mali, Senegal, Togo, Ethiopia, Somalia and Chad, for the young countries. These countries take high values on the variables that determine the positive side of this axis (birth rate, fertility, share of people under […] years, infant mortality) and low values on the variables that determine its negative side (life expectancy, share of people over […] and GNP).
• The situation is reversed for Japan.

The countries with a large contribution are:

• China, the USSR and India. These countries have both a large population and a large area. They are nevertheless kept in the analysis: the results of a PCA redone without them remain the same.
• The United Arab Emirates are in the opposite situation: small area and small population.
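The contribution rule used above can be sketched as follows. The formula (weight times squared coordinate, divided by the inertia of the axis) is the standard one; the function names, labels and coordinates are hypothetical.

```python
def contributions(coords, weights, eigenvalue):
    """Contribution of each individual to one axis: p_i * c_i**2 / lambda."""
    return [p * c * c / eigenvalue for p, c in zip(weights, coords)]

def influential(labels, coords, weights, factor=2.0):
    """Labels whose contribution exceeds `factor` times their weight."""
    # With centred coordinates, the inertia carried by the axis is the
    # weighted sum of squared coordinates.
    eigenvalue = sum(p * c * c for p, c in zip(weights, coords))
    ctr = contributions(coords, weights, eigenvalue)
    return [lab for lab, c, p in zip(labels, ctr, weights) if c > factor * p]

labels = ["a", "b", "c", "d", "e"]
coords = [3.0, -0.5, 0.5, -1.0, -2.0]   # hypothetical coordinates on one axis
weights = [0.2] * 5                      # uniform weights (assumption)
print(influential(labels, coords, weights))  # ['a']
```

Because the contributions to an axis sum to 1, an individual whose contribution equals its weight is exactly "average"; the factor of two is the decision threshold quoted in the text.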

In a second step, the quality of representation of the individuals must be assessed. The individuals, initially located in an 11-dimensional space, are projected onto a 2-dimensional space. An individual is better represented the smaller its loss of distance, that is, the difference between the distance from individual i to the origin in the 11-dimensional space and the distance from the same individual to the origin in the first principal plane. As a consequence, an individual is better represented the smaller the angle between the individual and its projection, or in other words the larger the squared cosine of that angle (this holds especially for individuals far from the origin).
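Both quality measures described above can be computed from an individual's coordinates on all principal axes. A minimal sketch, with hypothetical coordinates and function names:

```python
import math

def squared_cosine(coords, kept_axes=2):
    """Share of the squared distance to the origin that survives the
    projection onto the first `kept_axes` principal axes."""
    total = sum(c * c for c in coords)
    kept = sum(c * c for c in coords[:kept_axes])
    return kept / total

def distance_loss(coords, kept_axes=2):
    """Distance to the origin in the full space minus the distance
    to the origin after projection."""
    total = math.sqrt(sum(c * c for c in coords))
    kept = math.sqrt(sum(c * c for c in coords[:kept_axes]))
    return total - kept

well = [3.0, 4.0, 0.0, 0.0]   # lies entirely in the plane of axes 1-2
poor = [1.0, 1.0, 1.0, 1.0]   # half of its squared length is off-plane
print(squared_cosine(well), squared_cosine(poor))  # 1.0 0.5
print(distance_loss([3.0, 4.0, 12.0]))             # 8.0
```

A squared cosine near 1 means the point is almost in the plane; a value near 0 means its position on the plot says little about where it really is.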

Appendix […] contains the losses of distance. The four worst-represented individuals are, in order: the United Arab Emirates, Singapore, Colombia and Syria.

1.2.4. Representation of the illustrative variables

Each category of an illustrative variable is represented by its centre of gravity.

We note in particular that the "Hindu" religion has the same coordinates as India: it is the only country in which Hinduism is the majority religion.

In conclusion, the plots clearly bring out three distinct groups of countries:
• the young ones: mainly African countries, less developed, of animist or Muslim religion;
• the old ones: the developed countries, including the European countries, Oceania and North America, of Christian religion;
• and a third, intermediate group: the developing countries of Asia and South America.

A classification, in the second part, gives a more precise partition of these countries.

CLASSIFICATION

Classification aims to group the individuals into homogeneous classes. There are two main types of methods: non-hierarchical classification, which produces a partition into a given number of classes, and hierarchical classification, which produces a sequence of nested partitions.

2.1. Hierarchical classification by Ward's method

The criterion for merging two individuals is based on the notion of inertia. At each step, the two individuals (or classes) merged are those whose merger loses the least between-class inertia.

The between-class inertia is the mean, weighted by class size, of the squared distances from the centre of gravity of each class to the overall centre of gravity. As in the PCA, the metric of inverse variances is used.

The level indices express the loss of between-class inertia at each merger. The sum of the level indices should equal 11, that is, the total inertia of the point cloud.

The two closest individuals are therefore […] and […], namely Portugal and New Zealand.

At the next step, only […] individuals remain, plus one class made of the two countries merged previously, which is then represented by its centre of gravity. Again the two individuals whose merger loses the least between-class inertia are merged, namely the United Kingdom and Denmark.

This iteration is then repeated until all individuals are grouped into a single class.
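The iteration described above, and the property that the level indices add up to the total inertia, can be sketched on a toy example. The code assumes uniform weights of 1/n and plain Euclidean distance (that is, data already standardized); the points and function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ward_levels(points):
    """Agglomerate with Ward's criterion; return the level index of each merge.

    Each cluster is a pair (weight, centre of gravity). Merging clusters
    A and B loses w_A * w_B / (w_A + w_B) * d^2(g_A, g_B) of between-class
    inertia.
    """
    n = len(points)
    clusters = [(1.0 / n, list(p)) for p in points]  # uniform weights (assumption)
    levels = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (wa, ga), (wb, gb) = clusters[i], clusters[j]
                loss = wa * wb / (wa + wb) * sq_dist(ga, gb)
                if best is None or loss < best[0]:
                    best = (loss, i, j)
        loss, i, j = best
        (wa, ga), (wb, gb) = clusters[i], clusters[j]
        centre = [(wa * x + wb * y) / (wa + wb) for x, y in zip(ga, gb)]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((wa + wb, centre))
        levels.append(loss)
    return levels

def total_inertia(points):
    """Mean squared distance of the points to their centre of gravity."""
    n = len(points)
    centre = [sum(col) / n for col in zip(*points)]
    return sum(sq_dist(p, centre) for p in points) / n

toy = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.5), (9.0, 9.0)]
levels = ward_levels(toy)
print(abs(sum(levels) - total_inertia(toy)) < 1e-9)  # True
```

The between-class inertia starts at the total inertia (every individual is its own class) and ends at zero (a single class), which is why the losses sum to the total.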

The "break" criterion on the histogram determines the number of classes to keep. This criterion led to keeping 3 classes.

Colours make it easy to distinguish the 3 classes created in this way (the same colours were used to represent the individuals in the first principal plane): class 1 in blue, class 2 in red and class 3 in green.

2.2. Consolidation of the classes

The moving-centres method consolidates the partition obtained with Ward's method. Starting from the centres of gravity of the 3 classes, new classes are formed: each individual is assigned to the nearest centre of gravity. The centres of gravity of these new classes are then computed, and the operation is repeated until the partition is stable.
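The consolidation loop can be sketched as below. It is a minimal version of the moving-centres (k-means style) iteration, assuming plain Euclidean distance; the points, initial centres and function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centres):
    """Index of the centre closest to `point`."""
    return min(range(len(centres)), key=lambda k: sq_dist(point, centres[k]))

def moving_centres(points, initial_centres, max_iter=100):
    """Reassign each point to its nearest centre, recompute the centres,
    and stop when the partition no longer changes."""
    centres = [list(c) for c in initial_centres]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p, centres) for p in points]
        if new_labels == labels:
            break  # stable partition
        labels = new_labels
        for k in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centre if a class becomes empty
                centres[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres

# Hypothetical data: two well-separated groups and rough starting centres,
# standing in for the centres of gravity of a previous hierarchical partition.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
labels, centres = moving_centres(points, [(0.0, 0.0), (3.0, 3.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

When the starting partition is already good, as with centres taken from a hierarchical classification, the loop typically stops after very few passes.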

The moving-centres method converges very quickly, so the first partition was satisfactory. Only one individual changed class: Argentina, which moves from class […] to the developing countries.

The final composition of the classes is therefore:

Description of the classes

Several elements help characterize the classes created: the coordinates and test values of the class centres of gravity, the "typical individuals", and the most characteristic categories and variables.

Classes 2 and 3 have large but opposite coordinates on axis 1, with high test values. Axis 1 therefore easily separates the individuals. Class 1, by contrast, is very "average".

Characterization by "typical individuals"

The "typical individuals" are those closest to the centre of gravity of the class. The following tables give, for each class, the […] most characteristic individuals and their distance to the centre of gravity.

Characterization by illustrative and active variables

The following two tables give, for each class, the most and the least representative variables.

Definitions

• CLA/MOD: the ratio of the number of individuals belonging to both the class and the category to the number of individuals belonging to the category. Example: 83.33% of the Asian countries belong to class 1 (10/12 = 0.8333).
• MOD/CLAS: the ratio of the number of individuals belonging to both the class and the category to the number of individuals in the class. Example: 45.45% of the individuals in class 1 are Asian (10/22 = 0.4545).
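The two ratios defined above can be computed from a pair of label lists. In this sketch the counts are chosen to reproduce the quoted percentages: 10 individuals in both the class and the category, 12 in the category, and a class size of 22, which is inferred from 10/0.4545 rather than stated in the text. The labels and the function name are hypothetical.

```python
def class_category_shares(classes, categories, cls, cat):
    """Return (CLA/MOD, MOD/CLAS) in percent for one class and one category."""
    both = sum(1 for c, m in zip(classes, categories) if c == cls and m == cat)
    in_cat = sum(1 for m in categories if m == cat)
    in_cls = sum(1 for c in classes if c == cls)
    return 100.0 * both / in_cat, 100.0 * both / in_cls

# Hypothetical labels: 12 "Asia" individuals, 10 of them in class 1,
# plus 12 other individuals that are also in class 1 (class size 22).
classes = [1] * 10 + [2] * 2 + [1] * 12
categories = ["Asia"] * 12 + ["other"] * 12

cla_mod, mod_clas = class_category_shares(classes, categories, 1, "Asia")
print(round(cla_mod, 2), round(mod_clas, 2))  # 83.33 45.45
```

The two percentages answer different questions: the first reads the table by category ("where do the Asian countries go?"), the second by class ("what is class 1 made of?").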

Class 1 consists mainly of Asian countries and contains no European country.

Class […] is predominantly African and animist, and very rarely Christian. Class […], by contrast, is European and Christian and contains no African or Islamic country.

Overall, the standard deviations of the variables within a given class are smaller than their standard deviations over the whole data set.

The partition thus creates much more homogeneous classes. Moreover, classes 2 and 3 have characteristic variables that are in opposition: those above the mean for one class are below it for the other, and vice versa. Class 1 is indeed an intermediate class, without well-marked characteristics.

Classification is therefore complementary to Principal Component Analysis. It presents the results with very precise characteristics of the classes, whose outlines were sketched by the PCA.

Conclusion

Descriptive statistical methods, thanks to their strong visual side (plots and classification trees) and their intuitive character, make it possible to describe complex data sets in a relatively simple way.

Beyond their descriptive role, the advantage of these methods is that they are accessible to a broad, non-specialist audience.

Appendices