BENVGSA2

MSc Smart Cities and Urban Analytics

Urban form and density:

A typo-morphological classification of

London’s urban landscapes

Duccio Aiazzi

March 8, 2017

This dissertation is submitted in partial fulfilment of the requirements

for the degree of Master of Science Smart Cities and Urban Analytics from

University College London.

I, Duccio Aiazzi, confirm that the work presented in this thesis is my own.

Where information has been derived from other sources, I confirm that this

has been indicated in the thesis.

Word count: 11100

All Ordnance Survey data and the maps derived therefrom: Crown copy-

Abstract

Urban density is widely used to study the city and it is often related

to the urban form. The use of an all-encompassing average measure

to describe the complexity and the variety of the urban landscape has

been questioned by several studies (Alexander 1993, Churchman 1999,

Martin & March 1972) and various authors have proposed the use

of multi-dimensional classifications to better address the problem of

describing and prescribing the urban form. In this work, I present the

results of a typo-morphological classification of neighbourhoods across

the boundaries of Greater London, based on the multi-dimensional

density measure developed by Berghauser Pont & Haupt (2010). The

aim of the work is to critically assess the validity of the method in

delivering a functional classification of different urban landscapes and

understanding how these are associated with population density and

income, by looking at Greater London. Another major point is to

develop an automated method that can cope with storage, analysis

and visualisation of large extents.

Contents

1 Overview
1.1 Motivations and research goal
1.2 Structure of the study
2 The context
2.1 Introduction
2.2 Urban morphology and typo-morphology
2.3 Urban science and generative design
2.4 Urban density
2.4.1 Definitions of physical density measures
2.4.2 Perceived density
2.4.3 The problem with density measures
2.5 Multi-dimensional approaches to urban form
3 Methodology
3.1 Conceptual framework
3.2 Definitions of parameters
3.3 Definition of classes
3.4 Data sources
3.5 Data storage
3.6 Data cleaning and manipulation
4 Analysis
4.1 The scale of aggregation
4.2 Case studies
4.2.1 Case study I - Angel
4.2.2 Case study II - Bank
4.2.3 Case study III - East Croydon
4.2.4 Case study IV - Emerson Park
4.2.5 Case study V - Swiss Cottage
4.2.6 Case study VI - Other observations
4.3 Classification summary
4.4 Classes of neighbourhoods and income distribution
4.5 Classes of neighbourhoods and population density
5 Conclusions
6 Appendix
6.1 London data: overview maps
6.2 Classification results: overview maps
6.3 Case studies: detailed maps
6.3.1 Case study I: Angel
6.3.2 Case study II: Bank
6.3.3 Case study III: East Croydon
6.3.4 Case study IV: Emerson Park
6.3.5 Case study V: Swiss Cottage
6.4 Code
6.4.1 Data import
6.4.2 Data cleaning
6.4.3 Classification
6.4.4 Income and population density

List of Figures

1 What are the specificities of the local culture?
2 The White Horse Tavern, located in New York City’s borough of Manhattan at Hudson Street and 11th Street, in the 60s.
3 Le Corbusier’s vision for the city of the future (Corbusier 1964).
4 Scale model of Le Corbusier’s Plan Voisin (Corbusier 1964).
5 The hierarchy of components in the urban fabric.
6 The hierarchy of components in the urban fabric.
7 Perceived density. Contributing factors Alexander (1993).
8 Three blocks with the same density of 75 dwelling units per hectare (Per 2008)
9 Distribution of building count aggregated by block.
10 Space matrix diagram.
11 Correlation plot between FSI and GSI.
12 GSI v number of floors and class definition.
13 GSI distribution of blocks.
14 Count of the classes by GSI thresholds.
15 Street network and street blocks.
16 Classification method Bl: London overview.
17 Classification method Bl: close-up on the city centre.
18 Case study locations.
19 Bird’s eye view of Angel.
20 Bird’s eye view of Bank.
21 Bird’s eye view of East Croydon.
22 Bird’s eye view of Emerson Park.
23 Bird’s eye view of Swiss Cottage.
24 Classification method Bl: close-up on the Wembley area.
25 Classification method Bl: close-up on the Enfield area.
26 Classification method Bl: close-up on the area west of Angel.
27 Average FSI by classes of urban typology.
28 Summary of the income by classes of urban typology.
29 Blocks counted by class of urban typology.
30 FSI vs income by classes of urban typology.
31 FSI v median income by borough.
32 Population density by classes of urban typology.
33 Built environment density (FSI) v population density.
34 London data: building heights
35 London data: population density by LSOA
36 London data: median income by LSOA
37 Classification method Bl: London overview
38 Classification method Bl400: London overview
39 Classification method Pl: London overview
40 Classification method Pl150: London overview
41 Case study locations
42 Angel: Satellite view and building heights
43 Angel: Method Bl by block and Bl400 by block in range 400 m.
44 Angel: Method Pl by plot and Pl150 by plot in range 150 m.
45 Bank: Satellite view and building heights
46 Bank: Method Bl by block and Bl400 by block in range 400 m.
47 Bank: Method Pl by plot and Pl150 by plot in range 150 m.
48 East Croydon: Satellite view and building heights
49 East Croydon: Method Bl by block and Bl400 by block in range 400 m.
50 East Croydon: Method Pl by plot and Pl150 by plot in range 150 m.
51 Emerson Park: Satellite view and building heights
52 Emerson Park: Method Bl by block and Bl400 by block in range 400 m.
53 Emerson Park: Method Pl by plot and Pl150 by plot in range 150 m.
54 Swiss Cottage: Satellite view and building heights
55 Swiss Cottage: Method Bl by block and Bl400 by block in range 400 m.
56 Swiss Cottage: Method Pl by plot and Pl150 by plot in range 150 m.

List of Tables

1 Class definition summary.

Listings

1 Data import OS TopographicLayer.py
2 Data import InspireCadastralPlots.py
3 Data import OS ITN.py
4 Data import BuildingHeights.py
5 Data import LondonAdministrativeBoundaries.py
6 Data createTopology.sql
7 Data JoinBuildingHeights.sql
8 Data createBuildingShapesTable.sql
9 Data joinBlocks.sql
10 Data joinBlockRange400.sql
11 Data classification blocks.py
12 Data joinIncomeTable.sql
13 Data classification income.py

1 Overview

1.1 Motivations and research goal

The present study is set in the context of urban planning and design, with reference to the field of typo-morphology. Because cities are complex systems, it is debatable whether a city is more designable than a society or an ecosystem. Marshall (2009) talks about cities as both “designed” and “organic”. This means that, at the same time, cities evolve uncontrolled and are shaped by a conscious effort; they emerge from local dynamics but also from deliberate acts of creation. In both cases, the question of the optimal urban form is more about finding a balance between different types of neighbourhoods than about finding one single unifying solution. It is also a question of understanding the specificity of local cultures and existing situations. In order to do so, whether for prescription or description, it is useful to classify the urban typo-morphology, all the more so with a quantitative approach that tries to bring objectivity to a field often dominated by qualitative approaches and diverging schools of thought (Moudon 1994).

Image taken from the webpage http://www.masterplanningthefuture.org/?p=2700

Figure 1: What are the specificities of the local culture?

Simplicity should remain paramount even when there is a need to overcome the excessive simplifications that, in the practice of urban studies, derive from the omnipresent use of a catch-all index such as urban density, which is used for its immediacy but often imprecisely and without a shared definition (Churchman 1999).

In the following pages, I present the results of a typo-morphological classi-

fication of neighbourhoods across the boundaries of Greater London, based

on the multi-dimensional density measure developed by Berghauser Pont &

Haupt (2010). The aim of the work is to critically assess the validity of the

method in delivering a functional classification of different urban landscapes

and understanding how these are associated with population density and

income. Another major point is to develop an automated method that can

cope with storage, analysis and visualisation of large amounts of data in order

to be able to apply the method over large geographical extents.

1.2 Structure of the study

Section 2 provides an introduction to the context by describing the main references that form the basis of this work. It starts by schematically outlining the debate around the urban form, then describes the basic concepts of urban typo-morphology, which is the discipline that deals with the description and classification of the urban form. It also summarises the recent surge of interest in generative processes and computational methods, and finally defines the concept of urban density, some of the debate surrounding it and some proposed alternatives.

The first half of Section 3 describes the rationale behind this study and

the methodology of investigation. The second part deals with the technical-

ities: the data sources, the tools used for the analysis, the workflow and the

manipulation of the data.

In Section 4 I apply the classification to Greater London. The section

first discusses the different scales of aggregation, their merits and problems.

Then, it describes five case studies in detail and interprets the results, summarises the results by class and finally analyses the relationship between the classification and two demographic measures: population density and median income.

Section 5 summarises the results and describes the limitations.

The Appendix contains the maps that were not included in the main text and

the Python and SQL code used to perform the storage, cleaning and analysis

of the data.

2 The context

2.1 Introduction

There is a growing consensus on defining the city by the processes happening

within its location rather than by the physical environment. Batty (2013)

suggests that “instead of thinking of cities as sets of spaces, places, locations,

we need to think them as sets of actions, interactions and transactions” and

that “location is, in effect, a synthesis of interactions”.

In the city, multiple interacting systems transfer flows of information and goods back and forth, at different time scales, leading to constant changes in the environment. Depending on the system, changes happen more or less rapidly: information and people follow fast dynamics, land uses change less frequently and infrastructure follows even slower dynamics. A key element in understanding the city is the interaction between these slow and fast dynamics, that is, the mutual relationship between the processes and the built environment. In order to respond to the optimisation question posed by the process of managing, planning and designing parts of cities, it is important to understand how the built environment shapes and is shaped by the flow of actions: in other words, the interaction between form and function.

This is, of course, an old question which according to Schumacher (2011)

defines the very essence of architecture and urban design. It is a question

that has been addressed in the past by looking at the city as a deterministic

machine, one that can be tuned and shaped around men’s actions, follow-

ing a reductionist approach well synthesised in the statement “form follows

function” of the modernist movement in architecture and urban design. Le

Corbusier and the other major protagonists of the modernist school realised

the importance of flows and actions in a city and designed grand plans stud-

ied to respond to deterministic problems. They failed, however, at dealing

with the complex nature of the urban systems by ignoring the other side

of the relationship: how the form affects the function. The rigorous zoning

of residential and commercial areas, their separation and the uniformity of

the interventions led to a reduction in the complexity of the urban environment,

which in turn resulted in dormitory neighbourhoods disconnected

from the daily productive activities. By trying to tame the inherent chaos

of the cities, they did not realise that the chaos was exactly what gives cities

“The stretch of Hudson Street where I live is each day the scene of an intricate sidewalk ballet” Jacobs (1961).

Figure 2: The White Horse Tavern, located in New York City’s borough of Manhattan at Hudson Street and 11th Street, in the 60s.

In complete opposition to the sanitising plans of the modernist era, Jacobs (1961) describes a city where street life is the catalyst of the activities that make a city successful and diversity is what fuels the process. Against the mainstream idea of urban “hygiene” of her time, which wanted a city defined by rigorous zones of uniform land use (Sir Howard 1898, Corbusier 1964), she argues that diversity and self-organising complexity are the defining features of the urban environment and its ultimate goal. She carefully draws a pragmatic picture of street life, where small decisions affected by apparently meaningless constraints multiply their effect by aggregation and interaction to give form to what she calls the urban ballet. Her work gradually gained momentum and her influence can be seen in several movements that called for a city modelled on the human scale and on small pedestrian movements, such as the Compact City (Dantzig & Saaty 1973, Williams et al. 1996), Smart Growth (Maryland Department of Planning 1997) and the New Urbanism (Leccese & McCormick 2000).

Figure 3: Le Corbusier’s vision for the city of the future (Corbusier 1964).

These two radically opposing views of what the city is also differ in terms of the ingredients that contribute to its success. The form of the built environment, and the relationship between its elements and the context, is one key aspect of these differences. The modernist city is based on fast transportation, where the street is the means to reach the destination rather than a place in its own right. In its more extreme version, the modernist city is spread out to give room to green parks scattered with semi-autonomous residential units provided with all the basic features and services. It is also a city strictly designed through a top-down process with clear pre-determined zoning. The image in which Le Corbusier illustrates the plan “ville radieuse” for Paris (figure 4) seems to suggest the gesture of a man literally single-handedly donating a new future to the city. In Jacobs’ view, instead, the path to a successful city goes through a healthy street life made of a constant movement of people incentivised by mixed and multiple activities happening along the street and inside the buildings.

Figure 4: Scale model of Le Corbusier’s Plan Voisin (Corbusier 1964).

For street life to happen, cities must achieve a certain density threshold that can sustain the diversity of activities and the interaction between people. The city has to be walkable, be composed of short blocks, and have a hierarchy of services (from local to primary functions for the whole city) and buildings of different ages and conditions. This view, though, is not presented as a unique, unifying model but as a set of guidelines whose main purpose is to achieve multiplicity of choice, both for the users and in the built environment.

2.2 Urban morphology and typo-morphology

“Typo-morphological studies reveal the physical and spatial structure of cities.

They are typological and morphological because they describe urban form

(morphology) based on a detailed classification of buildings and open spaces

by type (typology)” (Moudon 1994).

Moudon (1994) identifies three founding schools of the discipline centred in

Italy, England and France, each of them addressing different issues with different

methodologies. The main protagonists of the three schools are, respectively,

S. Muratori (Muratori 1960), M.P. Conzen (Conzen 1978) and the

school of Versailles with the architect Jean Castex, the urbanist Philippe

Panerai and the sociologist Charles Depaule.

From Patricios (2002), the hierarchical structure of the city. (a) Enclave, (b) block, (c) superblock, (d) neighbourhood.

Figure 5: The hierarchy of components in the urban fabric.

The first element of typo-morphology is the open space of the city, the

non-built one (the road, the park, the square, etc.). The second element is

the building with its geometric properties. These two elements combine in

different types of urban landscapes. The link between the two is the parcel

or plot, which is the basic element of the urban fabric. The urban fabric is

the set of properties that define these elements and their relationship with

each other. The urban fabric links the building scale and the city scale,

through a hierarchical aggregation of the basic units into enclaves, blocks,

super-blocks and neighbourhoods. These elements and their layout describe

the morphology (form) of the city.

On the other hand, the typology is defined by the buildings. A schematic view of the building typologies is given by L. Martin and L. March in their seminal work “Urban space and structure” (Martin & March 1972). They define three broad types of built form (figure 6) that differ in the shape of the building and its relationship with the plot. The pavilion represents an isolated building with a small footprint that achieves high density through a higher number of storeys. The court is the type that, given the constraint of access to natural ventilation and daylight, achieves the highest coverage of the plot. The street is a typology in between. These three categories have been the basis of several studies in urban typo-morphology and in other disciplines (see, for example, Alexander (1993) on density and Ratti et al. (2005) on the environmental performance of different typologies).

The three dispositions are described respectively as: the pavilion or tower, the street or slab, the generating cruciform in a continuous pattern of courts.

Figure 6: The hierarchy of components in the urban fabric.

2.3 Urban science and generative design

The other important factor that sets the context of this research is the recently revived focus on a quantitative approach to urban planning and architectural design. In general, the last decade saw a sharp increase in cheaply available computational power, thanks to improved chips but also to the Internet and cloud services. There has also been a flood of available data coming from mobile devices, which turned attention to the possibilities offered by data analysis techniques. On the side of urban planning, for example, Moudon & Lee (2009) call for an “Urbanism with numbers” and produce a case study where socio-behavioural data can be linked with built environment parameters to make a case for the use of numbers in urban planning. Another example is space syntax, a set of theories and methods for the analysis of spatial configurations that has been around since the 1970s (Hillier 2007) and recently found new vigour with the increase in computational capacity. In architectural theory, the last decade saw the flourishing of parametric design (Schumacher 2009), developed initially by the architect Zaha Hadid and academics at the Architectural Association in London. Parametric or generative design is a design process that relies on algorithms to generate geometries from the definition of a family of initial parameters and the formal relations they keep with each other. It is a design process that deals with the parameters that generate the form rather than with the form itself. It was originally developed in the aerospace and automotive industries, where the focus is very much on performance. In professional practice, architectural offices, together with all the other agents involved in the planning, design and management of buildings and infrastructures, use a rather new tool for delivering and sharing information: Building Information Modelling (BIM) systems are databases that store and share digital representations of buildings (Azhar 2011), including the geometry as well as materials, technical specifications, etc., all in one single model.

If urban morphology is to be of support to urban planning and urban design,

there is a need to link different types of city form to their socio-economic

performance and to wellbeing statistics: this means that there is a need to

quantify urban forms in a way that can be scaled up in order to study large

systems.

2.4 Urban density

The debate about the urban form has intensified in recent decades, partly in response to uncontrolled urban expansion and the concurrent realisation of the environmental impact of the urban lifestyle. Since the phenomenon of urban sprawl was recognised and defined (Real Estate Research Corporation 1974, Burchell et al. 1998, Jackson 1985, Ewing 1997), several studies have tried to understand the relation between urban form on one side and socio-economic and environmental parameters on the other. One constant of this debate is the use of density as an all-encompassing summary of the urban form.

Density concerns the urban form because it is a first, coarse approximation

commonly used for its simplicity as a prescription in the planning process,

mostly as a useful measure to describe the load on services and infrastruc-

tures but also because of its supposed relationship with the urban land-

scape. Churchman (1999), in her review of the use of urban density, found

that many disciplines make use of the concept. These include “planning,

urban design, architecture, environment-behavioral studies, transportation,

economics, sociology, psychology, anthropology, and ecology”.

Alexander (1993) identifies four main fields where urban density is exten-

sively used:

Psychological and Behavioural Studies: Mainly concerned with the perception of crowding, privacy, territoriality and so on (see Krupat (1985) for a literature review).

Land and Urban Economics: How density affects the efficiency of the city in terms of transportation, land consumption, business, etc. For a brief description, see Section ??

Planning: Normative regulations involving density to control urban development.

Density and Urban Form: In urban morphology several studies address the question of what is the relation between density and urban form.

In land and urban economics, for example, transportation is one of the most discussed topics. On economies of scale for transportation systems, Priest (1977), Frank (1989) and Burchell & Listokin (1995) link higher density to less expensive infrastructure. CENTRO (2012) explore the impact of density on the use of public transportation, suggesting that compact forms reduce the need for travelling. Cervero & Guerra (2011) analyse the density thresholds required to make urban transport economically viable. On the use of private transportation, Newman and Kenworthy have an extensive body of work linking higher density with lower gasoline consumption (Newman & Kenworthy 1989a,b, Newman 1992, ECOTEC 1993), but their conclusions remain controversial (Gordon & Richardson 1997, Stretton 1996). Melia et al. (2011) suggest that, although compact cities incentivise the use of public transport, they are also prone to higher traffic congestion.

In terms of the environment, there is consensus that high densities reduce the need to build on agricultural and wild land (Burton & Matson 1996, Alterman 1997), although land protection has been criticised by Gordon & Richardson (1997) on the basis of the market economy, because restrictions on development force land into lower-valued uses. They also argue that land is not currently scarce in the U.S. or in the world and that food production has been rising steadily in the past decades; therefore, at the moment, there is no need for restricting land use.

Overall, the current prevalent view is that a certain level of population density is required for various services to be cost-effective, and this is reflected in the work of several institutions (Commission of the European Communities 1990, United Nations 1992, National Research Council 1999, American Planning Association 1999, EEA 1999) and in various planning guidance documents across the world (Greater London Authority 2006, UK Government 1994, OECD 2012).

2.4.1 Definitions of physical density measures

Density is a point measure of a quantity normalised by the area or volume it occupies. Within the urban context, density is used to describe the relationship between the surface of an areal unit A, such as a city block, a neighbourhood or a whole city, and a quantity q such as building floor area, population or residential units. Mathematically, it is simply described as:

\[ \text{density} = \frac{q}{A} \]

When comparing different measures of density, the numerators can be converted into one another, sometimes explicitly, sometimes by making assumptions (e.g. building floor area can be converted to residential units by assuming an average dwelling size). Depending on the assumptions, the conversion can be more or less accurate.
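As a minimal sketch of the definition and of a numerator conversion, the following Python snippet computes a density as q / A and converts floor area into an estimated number of dwellings. The function names, the 1-hectare block, the 12,000 m2 of floor area and the 80 m2 average dwelling size are illustrative assumptions, not values taken from this study.

```python
def density(quantity, area):
    """Return a density as quantity q divided by area A."""
    if area <= 0:
        raise ValueError("area must be positive")
    return quantity / area


def floor_area_to_dwellings(floor_area_m2, avg_dwelling_m2):
    """Estimate dwelling units from floor area, given an assumed average dwelling size."""
    if avg_dwelling_m2 <= 0:
        raise ValueError("average dwelling size must be positive")
    return floor_area_m2 / avg_dwelling_m2


# Hypothetical block: 1 hectare (10,000 m2) of land with 12,000 m2 of floor area.
block_area_m2 = 10_000.0
floor_area_m2 = 12_000.0

# Floor area per unit of land area (dimensionless).
floor_area_density = density(floor_area_m2, block_area_m2)

# Assumed average dwelling size of 80 m2; the estimate is only as good as this figure.
dwellings = floor_area_to_dwellings(floor_area_m2, 80.0)

# Dwelling density per hectare: the denominator is converted from m2 to ha.
dwellings_per_ha = density(dwellings, block_area_m2 / 10_000.0)

print(floor_area_density, dwellings, dwellings_per_ha)
```

Changing the assumed dwelling size changes the estimated dwelling density proportionally, which is why a conversion between numerators is only as accurate as its assumptions.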

The denominator may vary in two respects: the unit of measurement and the definition of boundaries. The first case is just a matter of arithmetical conversion. The second case, also known as the modifiable areal unit problem, abbreviated as MAUP (section 2.4.3), can cause a great deal of ambiguity and lead to incomparability between the results of different studies. It is itself a subfield of study related to various disciplines (Dark & Bram 2007).
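The effect of the boundary definition can be illustrated with a small, purely hypothetical example. The sketch below aggregates the same population grid into three different zonings and reports the density of each zone; the grid values, the cell size of one hectare and the function name are assumptions made for illustration only.

```python
# Hypothetical population counts on a 4 x 4 grid of equal cells (1 ha each).
population = [
    [100, 100, 0, 0],
    [100, 100, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]


def zone_densities(grid, zone_rows, zone_cols, cell_area=1.0):
    """Aggregate the grid into zones of zone_rows x zone_cols cells
    and return the density (people per unit area) of each zone."""
    densities = []
    for r0 in range(0, len(grid), zone_rows):
        for c0 in range(0, len(grid[0]), zone_cols):
            total = sum(
                grid[r][c]
                for r in range(r0, r0 + zone_rows)
                for c in range(c0, c0 + zone_cols)
            )
            densities.append(total / (zone_rows * zone_cols * cell_area))
    return densities


quadrants = zone_densities(population, 2, 2)   # four 2 x 2 zones
strips = zone_densities(population, 4, 1)      # four vertical strips
whole = zone_densities(population, 4, 4)       # one zone covering everything

print(quadrants, strips, whole)
```

The underlying data never change, yet the maximum zone density is 100, 50 or 25 people per hectare depending on how the boundaries are drawn.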

Density measures can be categorised by their numerator (Berghauser Pont

& Haupt 2010):

Population and dwelling density: The first is expressed as the number of people living in an area, the second as the number of dwelling units. As social transformations are usually faster than transformations to the built environment, population density is subject to higher variability over time. Both are used to plan for services and infrastructures.

Land Use Intensity: Measures the ratio between the total floor space (on all floors) in an area and the total surface of the area. It is known as Floor Space Index (FSI) in Europe and Floor Area Ratio (FAR) in the U.S.A. As the definition shows, land use intensity does not make use of population and it includes all sorts of land use, not only residential. For this reason it gives a better estimate of the urban form and is more suitable for describing mixed-use areas than the previous measures.

Coverage: Or Ground Space Index (GSI), is the ratio between the area of the footprint of the buildings and the area of the site.

Height: Building height, expressed in total number of floors or in length units. Although not a proper measure of density, building height affects the way the environment is perceived and is also related to cultural factors (in the UK, more than in other countries, the stereotype of the home as a two-storey terraced house is strong).

Spaciousness: Or Open Space Ratio (OSR), is the ratio between open space area and total floor surface.
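The measures above can be computed from three areas per site. The following Python sketch is an illustration under stated assumptions: the function name and the two sample blocks are hypothetical, open space is taken as the site area minus the building footprint, and the average number of floors is derived as floor area divided by footprint (that is, FSI / GSI), which is a simplification rather than a measured building height.

```python
def density_indicators(site_area, footprint_area, floor_area):
    """Return FSI, GSI, OSR and the average number of floors for one site.

    All areas must be expressed in the same unit (e.g. square metres).
    """
    if site_area <= 0 or footprint_area <= 0 or floor_area <= 0:
        raise ValueError("all areas must be positive")
    if footprint_area > site_area:
        raise ValueError("footprint cannot exceed the site area")
    return {
        "FSI": floor_area / site_area,                       # land use intensity
        "GSI": footprint_area / site_area,                   # coverage
        "OSR": (site_area - footprint_area) / floor_area,    # spaciousness
        "floors": floor_area / footprint_area,               # average floors (FSI / GSI)
    }


# Two hypothetical 1-hectare blocks with the same total floor area.
tall_block = density_indicators(10_000, 1_000, 15_000)   # small footprint, many floors
low_block = density_indicators(10_000, 5_000, 15_000)    # large footprint, few floors

print(tall_block)
print(low_block)
```

Both blocks have the same FSI, but they differ in coverage, spaciousness and height, which is the kind of difference a single density figure cannot express.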

A further categorisation can be made depending on the denominator (Alexan-

der 1993):

Net Dwelling Density (NDD) The numerator may be any of the quantities of
population or residential units suggested above. The denominator is "the
total land area devoted to residential facilities". This includes any private
amenity areas, parking and access driveways, but excludes any commercial
activities, parking and local businesses not directly below the dwelling
structure, public parks, institutions, schools and public streets.

Gross Residential Density (GRD) The numerator is the same as above; the
denominator is "the gross residential site area". This includes "the net
residential area + half the area of the perimeter roads + one quarter of the
area of the intersections".

Neighbourhood Density (ND) The number of people, dwelling units, etc. per
unit of area of the total neighbourhood land. It includes all the major
services related to the neighbourhood but excludes services and institutions
that serve the whole city or a wider area.

City Density (CD) This is the ratio between a chosen quantity as above
and the area of the whole city. One of the main sources of ambiguity that can
affect comparability across cities and the interpretation of the index is
the definition of the boundaries. These can be taken to be the administrative
boundaries, but these are often arbitrary and can include rural and
semi-rural areas.
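To illustrate how much the choice of denominator alone can change the value, the sketch below computes a net and a gross residential density for the same site, following the definitions quoted above from Alexander (1993). The number of dwellings and all the areas are invented for the example and the function names are hypothetical.

```python
def gross_residential_area(net_area_ha, perimeter_roads_ha, intersections_ha):
    """Gross residential site area as quoted from Alexander (1993):
    net area + half the perimeter roads + one quarter of the intersections."""
    return net_area_ha + 0.5 * perimeter_roads_ha + 0.25 * intersections_ha


def density(quantity, area_ha):
    """Generic density: a numerator (people, dwellings, ...) per hectare."""
    return quantity / area_ha


# Hypothetical site: the figures are illustrative only, not from the study.
dwellings = 120
net_area = 2.0          # ha of land devoted to residential facilities
perimeter_roads = 0.8   # ha
intersections = 0.4     # ha

gross_area = gross_residential_area(net_area, perimeter_roads, intersections)
ndd = density(dwellings, net_area)
grd = density(dwellings, gross_area)
print(f"net: {ndd:.1f} dw/ha, gross: {grd:.1f} dw/ha")
```

With these invented figures the same 120 dwellings give two noticeably different densities, which is the comparability issue raised in the text.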

2.4.2 Perceived density

Alexander (1993) extends the concept of density to qualitative attributes in
order to describe perceived density, that is, how the built environment is
perceived by its users. The rationale behind this extended definition of
density is that, ultimately, the goal of disciplines such as urban planning,
architecture and urban morphology is to shape, analyse and predict the built
environment so that it is at the same time functional to all urban activities
(e.g. commerce, services, transport, leisure) and perceived as pleasant and
enjoyable by its users. It is therefore necessary to quantify not only
physical measures but also qualitative attributes.

Figure 7: Perceived density. Contributing factors Alexander (1993).

Perceived density is a concept that tries to capture the physical elements
of the built environment as well as the way these affect citizens. Perceived
density results from the interaction of three factors (figure 7, p.25):
physical density, individual cognitive factors and socio-cultural factors.
Individual factors are hard to measure on a large scale and can hardly be
incorporated in a quantitative descriptor that aims at summarising the
characteristics of a city. Socio-cultural factors can incorporate the norms
and standards of a specific location and depend on the “homogeneity or
heterogeneity of the users of the environment, presence or absence of
socio-culturally regulated norms of interaction, levels of social interaction
and the character of activities in relevant setting”.

Physical density itself contains measured density and “qualitative density”.
The latter includes “those aspects of physical density that cannot be
measured and is generated by other relevant physical factors such as design
diversity, scale, etc.” (Alexander 1993). Perceived density, therefore, can
constitute the broader framework in which a classification of the built
environment based on physical density can be linked to all the other
parameters that affect the perception and the use of the built environment.

2.4.3 The problem with density measures

In view of this widespread use of density, the lack of a common definition
is certainly surprising. Churchman (1999), in her overview of the subject,
found no shared definition across studies, even within the same discipline.
It is also surprising that some of these studies, which have been the basis
of years of debate, paid little attention to the definition of one of their
key parameters, to the point that seminal works such as "The cost of sprawl"
(Real Estate Research Corporation 1974, Burchell et al. 1998) have been
criticised for their loose use of density by Windsor (1979), who suggested
that the results are actually the opposite.

The two main problems affecting density measures are the previously
mentioned MUAP and the fact that density is an average measure. The first
problem poses serious questions about the validity of various results
because, depending on the definition of the boundaries, results can differ
wildly. Boundaries also vary over time, which makes it difficult to compare
results across long time spans. A recent attempt to address the
arbitrariness of the boundaries and their variation in time was made by
Arcaute et al. (2013), who define a universal law for city boundaries using
the properties of the road network. This approach has proven valid in
dynamically defining the boundaries of whole cities. A similar way of
addressing the MUAP for the areal unit in density measures has been tested in
Pont & Marcus (2014) using a type of location measure. Location measures
define the boundaries by fixing a location and then calculating the
accessibility from it to a parameter of choice (this can be the number of
services, the number of jobs, the degree of a network, etc.). The process
can be reversed by setting the accessibility threshold and calculating the
radius necessary to achieve that accessibility. The result is a dynamic
areal unit for each part of the city. The second problem is inextricably
connected with the need for simplification: an all-encompassing index that
can describe a complex landscape at a glance is very appealing, but the
price is the loss of detail in favour of averages. Other problems are
related to the temporal dimension: cities show huge variations in
population, and therefore in density, across different times of the day, due
to people commuting to work. Because density is a static index, it
completely misses this type of dynamics, which is crucial to the
understanding of the city.
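The two directions of a location measure described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: it uses straight-line distance rather than network distance, the service locations are invented, and the function names are hypothetical.

```python
import math


def accessible_count(origin, points, radius):
    """Number of points within `radius` (straight-line) of `origin`."""
    return sum(1 for p in points if math.dist(origin, p) <= radius)


def radius_for_threshold(origin, points, threshold):
    """Smallest radius around `origin` that reaches `threshold` points.

    Returns None when fewer than `threshold` points exist.
    """
    if threshold <= 0:
        return 0.0
    distances = sorted(math.dist(origin, p) for p in points)
    if len(distances) < threshold:
        return None
    return distances[threshold - 1]


# Hypothetical service locations in metres (illustrative only).
services = [(0, 100), (300, 0), (0, -400), (500, 500), (-50, 0)]
origin = (0, 0)

count_400 = accessible_count(origin, services, 400)    # fixed radius -> count
radius_3 = radius_for_threshold(origin, services, 3)   # fixed count -> radius
print(count_400, radius_3)
```

Fixing the radius yields an accessibility value for each location, while fixing the threshold yields a radius, and therefore an areal unit, that differs from place to place.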

Figure 8: Three blocks with the same density of 75 dwellings units per hectare (Per

In urban morphology, one main issue is that a mere measure of physical

density is not enough to encompass all the different variations and possibil-

ities. The same single value of a density measure can be obtained with very

different building configurations. Figure 8 (p.28) illustrates the concept by

presenting three residential schemes that achieve the same dwelling density:

one with a neighbourhood of terraced houses, one with apartment blocks

around a courtyard and another with a tower within a park and a parking

2.5 Multi-dimensional approaches to urban form

In their book "Space, density and urban form", Berghauser Pont & Haupt
(2010) compare the different methods of measuring density defined in section
2.4.1 and analyse how they perform in describing the urban form. They find
that population and dwelling density are poor performers and that, although
FSI is a better indicator, none of the measures above maintains a univocal
relation with a type of form: the same value is always associated with
several forms. They propose the use of a multi-variable density concept
composed of measures of physical density of the built environment in order
to be able to classify different urban typologies. The variables included in
the index are Floor Space Index (FSI) and Ground Space Index (GSI),
complemented by Open Space Ratio (OSR) and number of storeys (L), as defined
in section 3.2. Martin & March (1972) are amongst the first to try to
quantify the geometrical and relational characteristics of different spatial
layouts for the purpose of classification. Another example is the systematic
gathering and analysis of a set of parameters of several neighbourhoods in
Switzerland by the CETAT (1986). Amongst the parameters, they collect FSI,
GSI, volumes, ratios of sidewalks, pedestrian areas, parking, green areas,
etc. More recently Gil et al. (2012) presented a systematic and
multi-dimensional method of description and classification based on
typo-morphology that makes use of data mining techniques. The types of
parameters used are similar to those of the Swiss study, to which they add
street network centrality measures (degree, closeness and betweenness) and
block orientation. These parameters are then classified using k-means
analysis to obtain six classes of blocks and four of streets.

3 Methodology

3.1 Conceptual framework

The rationale behind the present analysis is to explore an effective way
of using parameters that can be derived from the geometry of the city, and
are hence available as GIS databases, in order to understand the urban
typo-morphology. Taking as a starting point the work done by Berghauser Pont
& Haupt (2005) on their space-matrix, I aim to define the geometrical
properties of the buildings, and their relationship with the adjacent
environment, that are functional to a typological classification of
neighbourhoods. The key questions that this study tries to address are: what
are the different types of neighbourhoods, what are their defining
geometrical features, and what building density, that is FSI, do they
deliver? At the end of this study, I also analyse the relation of
neighbourhood typologies to income and population density. As described in
detail in section 3.2, I use the Ground Space Index (GSI) and the building
height expressed in number of storeys (L) to assign a class to each areal
unit (defined below) across the territory of Greater London. The classes are
defined manually by setting thresholds on GSI and on the number of storeys:
they are derived from the combination of three intervals on GSI and four
intervals on L, for a total of twelve classes.

Figure 9: Distribution of building count aggregated by block.

Four different classifications are proposed, each differing in the
definition of the areal unit. The first method, referred to as Bl, uses the
city block, defined as the smallest area enclosed by a loop of roads. The
second method, Pl, uses the cadastral parcel, or plot. A third approach,
implemented in methods Bl400 and Pl150, consists in using a range r to
aggregate the parameters. As shown in figure 9 for method Bl, areal units
often contain more than one building, therefore the parameters are
calculated by aggregating the buildings contained in each unit. The
advantage of this aggregation is that the character of an area is given by
the sum of its components rather than by any single element. On the other
hand, as always when generalising, some important details might get lost.
The use of plots in method Pl achieves a more granular result because plots
generally contain just one or very few buildings. The range-based approach
helps overcome a problem that affects both the previous methods. The
experience of the environment perceived by a user moving across the city is
defined by the buildings along the street, and should therefore be an
average of the characteristics of its two sides. The first two methods look
at the two sides of a road separately, whereas using a range to aggregate
the parameters makes it possible to take into account area averages that
include both sides. To summarise, the four methods of aggregation will be
referred to as follows:

Bl aggregation by street block.

Pl aggregation by cadastral plot.

Bl400 aggregation by street block in a range of 400 m. The length of 400 m
has been chosen as the standard walkable catchment for day-to-day
facilities used in the literature.

Pl150 aggregation by cadastral plot in a range of 150 m. In this case the
length has been chosen by trial and error to find the right balance
between aggregation and detail.
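The range-based aggregation of methods Bl400 and Pl150 can be sketched as follows. This is a simplified illustration and not the SQL used in the study: it assumes that each building is reduced to its centroid and footprint area, that the range is a circle centred on a point of the areal unit, and that a building counts when its centroid falls inside the circle. The data are invented.

```python
import math


def gsi_in_range(centre, buildings, r):
    """GSI of the circle of radius r around `centre`.

    A building counts as inside when its centroid lies within the circle
    (an assumption of this sketch); the areal unit is the full circle.
    """
    covered = sum(
        b["footprint"] for b in buildings
        if math.dist(centre, b["centroid"]) <= r
    )
    return covered / (math.pi * r ** 2)


# Hypothetical buildings: centroid in metres, footprint in square metres.
buildings = [
    {"centroid": (10, 0), "footprint": 200},
    {"centroid": (100, 100), "footprint": 500},
    {"centroid": (200, 0), "footprint": 1000},
]

centre = (0, 0)
gsi_150 = gsi_in_range(centre, buildings, 150)  # two buildings in range
gsi_400 = gsi_in_range(centre, buildings, 400)  # all three in range
print(f"GSI r=150: {gsi_150:.4f}, GSI r=400: {gsi_400:.4f}")
```

The larger the range, the more buildings contribute and the smoother the resulting value, which is the trade-off between aggregation and detail mentioned for Pl150.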

In Berghauser Pont & Haupt (2005) the approach was to select a few
different types of neighbourhoods, already clearly defined in terms of
typology, and then analyse where they fit within the Spacematrix and how
they cluster. My approach is similar: I first discuss the choice of
parameters for the classification, then define the thresholds for each class
and run the classification, and then verify what typology is identified by
each class. In general, in the following description, I use method Bl and
refer to the results of the others when these are relevant.

3.2 Definitions of parameters

As described in section 2.4.3, a mono-dimensional index such as density is
not enough to distinguish different urban typologies from each other, as
the same density value can be associated with very different spatial
layouts. Berghauser Pont & Haupt (2005) propose the use of a
multi-dimensional index that takes into account other geometric properties,
such as the ratio of built area of a plot, its ratio of total floor space,
etc. Below I list the basic components of these properties as they come from
the data source (described in section 3.4, p.41), the definitions of the
principal measures used for the classification and the assumptions made in
the present study:

u Areal unit. It is defined, depending on the method, as the street
block (method Bl), the cadastral parcel or plot (method Pl) or the
circle of radius r (methods Bl400 and Pl150).

L Number of storeys of a building. In the present study, as the
available OS database gives the building height H in metres, I assume
the average floor-to-floor height in London to be 3 m and therefore:

$$L = \frac{H}{3}$$

My assumption, of course, is questionable, but I was not able to find
any study of average floor heights in London. In my experience, 3 m is a
good approximation of the standard floor-to-floor height in the
construction industry, but in practice this can vary drastically: from
the 2.4 m standard of new residential buildings and old working-class
terraced houses, to the 3-3.2 m of wealthier period houses, up to the 4
or 5 m of the ground floors of large commercial premises or office
buildings.

Lw Because most of the areal units, whether blocks, plots or ranges,
include several buildings, the value L of the areal unit is the
weighted average of the values of its buildings, using the total floor
area AT as weight. Therefore, the weighted average of the number of
storeys of an areal unit u with n buildings can be written as:

$$L_{w,u} = \frac{\sum_{i=1}^{n} L_i A_{T,i}}{\sum_{i=1}^{n} A_{T,i}}$$

Au Surface of the areal unit. In the case of blocks and plots, it is the
area of the geometric shape. In the case of the range it is the area
of the circle, $A_u = \pi r^2$, with r = 400 m for method Bl400
and r = 150 m for method Pl150.

Af Area of the footprint of a building. In the OS data, the footprint
of the building is the building shape in the Topographic Layer.

AT Total floor area of a building. It is calculated by multiplying the
footprint area by the number of floors:

$$A_T = A_f L$$

GSI Ground Space Index. It represents the proportion of built area of
the areal unit u and is calculated by adding up the footprint areas
of the n buildings within the areal unit and dividing the sum by
the area of the unit:

$$GSI_u = \frac{\sum_{i=1}^{n} A_{f,i}}{A_u}$$

FSI Floor Space Index. It represents the ratio of total floor area
to the area of the areal unit. It is calculated as the sum of the
footprints times the number of floors of each of the n buildings
within the areal unit, divided by the area of the unit:

$$FSI_u = \frac{\sum_{i=1}^{n} A_{f,i} L_i}{A_u}$$

OSR Open Space Ratio or spaciousness. It is a measure of the amount
of non-built space at ground level per square metre of gross floor
area:

$$OSR_u = \frac{1 - GSI_u}{FSI_u}$$
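The definitions above can be combined into a short worked example. The sketch below is not the code of the study: the function names and the two buildings are invented, the number of storeys is left unrounded, and OSR is computed as non-built ground space divided by total floor area, following the verbal definition above, which is an assumption of this sketch.

```python
FLOOR_HEIGHT_M = 3.0  # assumed average floor-to-floor height


def storeys(height_m):
    """Number of storeys L estimated from the building height H."""
    return height_m / FLOOR_HEIGHT_M


def unit_indices(buildings, unit_area):
    """Indices of one areal unit.

    `buildings` is a list of (footprint_m2, height_m) pairs and
    `unit_area` is the area of the unit in square metres.
    """
    footprints = [af for af, _ in buildings]
    levels = [storeys(h) for _, h in buildings]
    floor_areas = [af * lv for af, lv in zip(footprints, levels)]  # A_T
    total_floor = sum(floor_areas)
    if total_floor == 0:
        raise ValueError("the unit contains no floor area")
    gsi = sum(footprints) / unit_area
    fsi = total_floor / unit_area
    # Storeys averaged with the total floor area of each building as weight.
    lw = sum(lv * at for lv, at in zip(levels, floor_areas)) / total_floor
    osr = (1 - gsi) / fsi
    return {"GSI": gsi, "FSI": fsi, "Lw": lw, "OSR": osr}


# Hypothetical block of 1 ha with a 2-storey and a 6-storey building.
block = unit_indices([(1000, 6.0), (500, 18.0)], unit_area=10_000)
print(block)
```

Because the weight is the total floor area, the taller building dominates the average: the two buildings have 2 and 6 storeys, but the weighted value is closer to 6 than a simple mean would be.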

The space-matrix (figure 10) is the visual aid to the classification used in
Berghauser Pont & Haupt (2005) and Berghauser Pont & Haupt (2010). The x
and y axes represent GSI and FSI respectively. OSR and L are gradients that
complete the visual information but are superfluous for the purpose of the
classification, and therefore for the placement of an area within the graph.
In fact, FSI is also redundant and can be replaced with L. As figure 11
(p.36) shows, GSI and FSI are strongly correlated.

Figure 10: Space matrix diagram.

This follows from the definitions: in the case of a single building on an
areal unit, for example, FSI is equal to GSI multiplied by L. The fact that
in this context FSI and L are interchangeable makes the choice between them
dependent on the type of dataset available and on the context in which the
information is used, i.e. on the audience of the results. In some cases, FSI
might be a known quantity as part of the records kept by the planning
authority. In other contexts, such as that of this study, it is easier to
access a building height database. Because the building height is part of
the raw data, for the purpose of this study I use Lw, the weighted average
number of storeys of an areal unit, instead of FSI.
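The relation between FSI, GSI and L for a single building can be checked numerically. The three layouts below are invented figures on a hypothetical one-hectare unit, loosely echoing the low-rise, mid-rise and tower schemes discussed in section 2.4.3; they are not taken from the dataset.

```python
SITE_AREA = 10_000  # m2, hypothetical 1 ha areal unit

# (label, footprint in m2, number of storeys): invented figures
layouts = [
    ("low rise", 3000, 2),
    ("mid rise", 1000, 6),
    ("tower", 500, 12),
]

results = []
for label, footprint, levels in layouts:
    gsi = footprint / SITE_AREA
    fsi = footprint * levels / SITE_AREA
    results.append((label, gsi, levels, fsi))
    print(f"{label:8s} GSI={gsi:.2f} L={levels:2d} FSI={fsi:.2f}")
```

All three layouts deliver the same FSI while differing in both GSI and L, which is why a pair of values, rather than FSI alone, is needed to tell them apart.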

Figure 11: Correlation plot between FSI and GSI.

3.3 Definition of classes

The left plot in figure 12 (p.37) refers to the result of the aggregation Bl
and shows the GSI values by block against the weighted average number of
storeys. The right plot overlays the same values with the class thresholds,
which are used to classify the data from each of the four aggregation
methods.

Figure 12: GSI v number of floors and class definition.

The grid divides the space into twelve cells that represent twelve classes
of urban typologies. The bottom-left cell represents 1-2 storey buildings
scattered in open land, a sort of rural context. The top-left one has the
same dispersion but very tall buildings, probably indicating residential
towers surrounded by green areas. The top-right cell, instead, probably
contains office towers in a highly exploited area such as the City. Just
below, in the mid-high rise - high coverage cell, we can probably expect a
Paris-style neighbourhood, with dense blocks and relatively tall buildings.

Figure 13: GSI distribution of blocks.

The definition of the thresholds is crucial to the classification work but
also debatable. Urban typologies have a strong subjective component, can
vary amongst different subjects and cultures, and their definition is
subject to a certain degree of arbitrariness. For the definition of the
building height thresholds (figure 12, right graph), I rely on a mix of
facts and observation. The first class is up to 2 storeys high, as this is a
clear feature of most of the residential units in the UK (Muthesius 1982).
The second class, up to 6 storeys, corresponds to the upper limit of
medium-size office buildings and residential estates. Buildings up to 12
storeys high correspond to residential towers, high-density offices or
apartment blocks. Above 12, we are in the realm of towers. The building
height thresholds fall half way between the integer value marks because they
apply to weighted averages, so in order to identify the typology up to two
storeys high, for example, the threshold is set to 2.5.

Figure 14: Count of the classes by GSI thresholds.

For the GSI thresholds, I tested two methods (figure 13, p.38): the one
in blue is based on three equal-length intervals within [0:1], the one in
red is based on three equal quantiles (cut at 0.33 and 0.66). The resulting
classification of the two options can be observed in figure 14: the division
in equal-length intervals returns a very high number of low-rise - low
coverage units with little variation in the low-rise category, whereas the
division in quantiles produces a much richer and more diverse classification
and is therefore the one selected for the rest of the study. Table 1
summarises the brackets of the twelve classes.
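The difference between the two ways of setting the GSI thresholds can be reproduced on a toy sample. The sketch below uses the standard library function statistics.quantiles and a small, invented, right-skewed set of GSI values; it does not use the London data, and the helper names are hypothetical.

```python
import statistics


def equal_interval_thresholds(low=0.0, high=1.0, bins=3):
    """Cut points that split [low, high] into `bins` equal-length intervals."""
    step = (high - low) / bins
    return [low + step * i for i in range(1, bins)]


def quantile_thresholds(values, bins=3):
    """Cut points that split the observed values into `bins` equal-count groups."""
    return statistics.quantiles(values, n=bins, method="inclusive")


def counts_per_bin(values, thresholds):
    """Number of values in each interval delimited by the thresholds."""
    counts = [0] * (len(thresholds) + 1)
    for v in values:
        counts[sum(v >= t for t in thresholds)] += 1
    return counts


# Hypothetical, right-skewed GSI sample (not the London data).
sample = [0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 0.30, 0.40, 0.60]

by_length = counts_per_bin(sample, equal_interval_thresholds())
by_quantile = counts_per_bin(sample, quantile_thresholds(sample))
print("equal length:", by_length)
print("quantiles:   ", by_quantile)
```

On a skewed distribution the equal-length intervals put most units in the lowest bracket, whereas quantile thresholds spread them evenly across the three brackets.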

Table 1: Class definition summary.

               0 < GSI < 0.18                0.18 < GSI < 0.28             0.28 < GSI < 1
1-2 storeys    low rise - low coverage       low rise - mid coverage       low rise - high coverage
3-6 storeys    mid-low rise - low coverage   mid-low rise - mid coverage   mid-low rise - high coverage
7-12 storeys   mid-high rise - low coverage  mid-high rise - mid coverage  mid-high rise - high coverage
above 12       high rise - low coverage      high rise - mid coverage      high rise - high coverage
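The class assignment summarised in Table 1 can be sketched as a lookup on the two sets of thresholds. The GSI breaks are those of Table 1; of the storey breaks, only 2.5 is stated explicitly in the text, while 6.5 and 12.5 are inferred here by analogy and should be read as assumptions, as should the treatment of values falling exactly on a threshold. The function name is hypothetical.

```python
import bisect

GSI_BREAKS = [0.18, 0.28]          # from Table 1
STOREY_BREAKS = [2.5, 6.5, 12.5]   # 2.5 from the text; 6.5 and 12.5 assumed

COVERAGE = ["low coverage", "mid coverage", "high coverage"]
RISE = ["low rise", "mid-low rise", "mid-high rise", "high rise"]


def classify(gsi, lw):
    """Return the class label of an areal unit.

    `gsi` is the Ground Space Index of the unit and `lw` its weighted
    average number of storeys.
    """
    rise = RISE[bisect.bisect_right(STOREY_BREAKS, lw)]
    coverage = COVERAGE[bisect.bisect_right(GSI_BREAKS, gsi)]
    return f"{rise} - {coverage}"


# Hypothetical areal units: (GSI, weighted average number of storeys).
for gsi, lw in [(0.10, 1.8), (0.35, 9.0), (0.20, 15.0)]:
    print(gsi, lw, "->", classify(gsi, lw))
```

Using a bisection on sorted breaks keeps the twelve classes in one place, so changing a threshold does not require touching the classification logic.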

3.4 Data sources

Most of the data comes from the Ordnance Survey (OS), accessed through the
website of Edina, digimap.edina.ac.uk. OS is the national mapping agency
for Great Britain and Edina is a UK-based data provider for academic staff
and students. From this website, students and academics can access OS
MasterMap, OS's most detailed product. The map has different layers,
corresponding to different types of information. The Integrated Transport
Network (ITN) Layer contains the map of the entire road network of England,
Wales and Scotland and has been used to calculate the polygons of the street
blocks (figure 15, p.42). The Topographic Layer provides the detailed shapes
of the footprints of the buildings and is integrated with the Building
Heights Layer.

The right diagram represents a street network, the left one its negative, the street blocks. They can also be described as the void (the street) and the volume (the built environment), although this description tends to be less meaningful outside the very dense city centres.

Figure 15: Street network and street blocks.

The database INSPIRE Index Polygons provides the shape of the cadastral
plots and the files of each borough are available on the website
https://www.gov.uk/government/collections/download-inspire-index-polygons.
From the same source, I also use the Network Rail data, which contains the
location of each railway station in the UK and the centre line of the rail
tracks. For the definition of the boundaries of Greater London, the
statistical unit areas and the related data, I use the combined data of the
Office for National Statistics and the OS, as provided by the London
Datastore at data.london.gov.uk.

3.5 Data storage

The data was stored in PostgreSQL and manipulated with PostGIS. PostgreSQL
is an open-source relational database that supports SQL, and PostGIS is a
spatial extension that allows PostgreSQL to store geometry and perform
spatial operations. Although the same storing and manipulation could have
been done using just ESRI shapefiles, the advantage of the database
environment is that the same data can be accessed from different
environments via standardised SQL queries and used in web visualisation or
other applications. It is also possible, should local resources be lacking,
to move storage and processing to cloud services, making the whole process
flexible and easily scalable. Another important tool is GDAL, a translation
library for vector and raster geospatial data formats, and in particular its
command-line utility ogr2ogr, used to import gml files. To import ESRI
shapefiles, I use shp2pgsql from the command line, which is the standard
PostGIS tool for importing this type of file.

The data is imported from raw gml and csv files or from ESRI shapefiles,
using Python scripts to control a mix of terminal commands and SQL queries.
The main workflow consists in using Psycopg2 and SQLAlchemy, two Python
libraries for connecting to PostgreSQL, to create a table in the database,
and then the module subprocess, which allows Python to run terminal
commands, to run ogr2ogr to copy the gml into the table. The import process
is executed in the following steps, each corresponding to a Python script
(see appendix 6.4.1, p.99 for the code):

1. The OS Topographic Layer is available in cells of 10x10 km, each
containing several smaller cells. Data_import_TopographicLayer.py
loops through the folders, fetches the names of the files and
builds a command statement to be run in the terminal. SQL statements
then add primary keys and geometric indexes to the tables. The created
table contains the geometry of the footprint of the buildings and a
building ID that can be connected to the BuildingHeight database.
Data_import_InspireCadastralPlots.py and Data_import_OS_ITN.py work on
the same principles to import the cadastral parcels and the road
network respectively.

2. Data_import_BuildingHeights.py fetches the folder path for each csv,
creates a database table and runs an SQL statement to import the
files into the table.

3. Data_import_LondonAdministrativeBoundaries.py uses the Python
module subprocess to run shp2pgsql in the command line to import
the ESRI shapefiles of the administrative boundaries.
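The import steps above can be illustrated with a minimal sketch of how a script might build and run the ogr2ogr command. This is not the code in the appendix: the function names, database name, user and table name are placeholders, and only commonly used ogr2ogr options (-f, -nln, -append) are shown.

```python
import subprocess
from pathlib import Path


def ogr2ogr_command(gml_path, table, dbname, user, host="localhost"):
    """Build the ogr2ogr call that appends one gml file to a PostGIS table."""
    connection = f"PG:dbname={dbname} user={user} host={host}"
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",   # output driver
        connection,            # destination datasource
        str(gml_path),         # source file
        "-nln", table,         # name of the destination table
        "-append",             # add to the table instead of recreating it
    ]


def import_folder(folder, table, dbname, user):
    """Loop through a folder tree and import every gml file found."""
    for gml_path in sorted(Path(folder).rglob("*.gml")):
        command = ogr2ogr_command(gml_path, table, dbname, user)
        subprocess.run(command, check=True)  # raises if ogr2ogr fails


# Building a command does not touch the database; names are placeholders.
example = ogr2ogr_command("tile_0001.gml", "topographic_area", "london", "student")
print(" ".join(example))
```

Passing the command as a list rather than a single string avoids shell quoting problems with the connection string, and check=True makes a failed import stop the loop instead of passing silently.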

3.6 Data cleaning and manipulation

A last step in creating the necessary geometry is to calculate the polygons
of the blocks from the road network geometry. For this purpose, I use the
topology extension of PostGIS. In the representation of the morphology of
the city described above, made of streets and blocks, we can use lines to
represent the streets. But once the road network is laid out, creating the
polygons is a redundant exercise, because the information is already there.
Topological representation takes into account the fact that geometric
features rarely exist independently of each other and keeps track not only
of the geometry of the elements, but also of their relationships. For any
given geometry, the topology extension creates four tables to store nodes,
edges, faces and their relationships. Below I describe the steps of the
cleaning and manipulation process (see appendix 6.4.2, p.113 for the code):

4. Data_createTopology.sql creates the topology tables from the ITN
street network. The import process is broken down by borough because
it is computationally demanding and this made it easier to control
possible errors. For each borough, the network is intersected with the
administrative boundaries and then the topology is created. When
a new borough is added, the PostGIS topology automatically merges its
geometry with the geometry already imported by removing duplicate edges
and nodes. Once the geometry has been converted into topological
tables, the faces are extracted by looping through the IDs of the face
table with the function ST_GetFaceGeometry.

5. Data_JoinBuildingHeights.sql merges the building footprint shapes
imported from OS MasterMap with the corresponding building heights
table.

6. Data_createBuildingShapesTable.sql creates the table shapes, which is
the main table that stores the buildings' geometry and data.

7. Data_joinBlocks.sql, Data_joinPlots.sql, Data_joinBlockRange400.sql
and Data_joinPlotRange400.sql perform a spatial join between the
areal units and the centroids of the buildings to count the buildings
falling within the area and calculate aggregate measures and weighted
averages. In particular, they calculate the count of buildings, the
total footprint, the total floor surface, the GSI and its standard
deviation, the FSI and its standard deviation, and the average number
of floors of the buildings weighted by their total floor surface with
its standard deviation. This process creates four tables, block_index,
plot_index, block400_index and plot150_index, with the areal units and
corresponding attributes of methods Bl, Pl, Bl400 and Pl150.
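The logic of the spatial join in step 7 can be reproduced in plain Python for illustration. In the study this is done with SQL in PostGIS; the sketch below is not that code. It assumes simple polygons given as lists of vertices, uses a ray-casting test to assign building centroids to areal units, and works on invented blocks and buildings.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test; `polygon` is a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_area(polygon):
    """Area of a simple polygon (shoelace formula)."""
    n = len(polygon)
    twice = sum(
        polygon[i][0] * polygon[(i + 1) % n][1]
        - polygon[(i + 1) % n][0] * polygon[i][1]
        for i in range(n)
    )
    return abs(twice) / 2


def aggregate(units, buildings):
    """Join buildings to units by centroid and compute aggregate measures."""
    index = {}
    for unit_id, polygon in units.items():
        inside = [b for b in buildings if point_in_polygon(b["centroid"], polygon)]
        area = polygon_area(polygon)
        footprint = sum(b["footprint"] for b in inside)
        floor_area = sum(b["footprint"] * b["storeys"] for b in inside)
        index[unit_id] = {
            "count": len(inside),
            "footprint": footprint,
            "floor_area": floor_area,
            "gsi": footprint / area,
            "fsi": floor_area / area,
        }
    return index


# Hypothetical data: two square blocks of 1 ha and three buildings.
units = {
    "A": [(0, 0), (100, 0), (100, 100), (0, 100)],
    "B": [(120, 0), (220, 0), (220, 100), (120, 100)],
}
buildings = [
    {"centroid": (20, 20), "footprint": 400, "storeys": 2},
    {"centroid": (60, 70), "footprint": 600, "storeys": 3},
    {"centroid": (150, 50), "footprint": 1000, "storeys": 5},
]
block_index = aggregate(units, buildings)
print(block_index)
```

Joining on centroids means that each building is counted in exactly one unit, even when its footprint straddles a boundary, at the cost of attributing its whole area to that unit.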

The four tables contain the attributes for each areal unit. The following
Python files use GSI and the weighted number of storeys from these tables
to classify the elements based on the classes defined in section 3.2 (p.32):

9. Data_classification_blocks.py and Data_classification_plots.py loop
through the areal units to compare GSI and number of storeys and
assign a class of urban landscape to each unit.

10. Data_joinIncomeTable.sql and Data_classification_income.py perform
the join between the LSOA boundaries and the classification tables in
order to be able to relate different urban landscapes to income and
population data.

4 Analysis

4.1 The scale of aggregation

Figure 16: Classification method Bl: London overview.

An overview of the results is shown in section 6.2 (p.79) and a detailed vi-

sualisation of the different case studies can be found in section 6.3 (p. 83).

The four overview maps show the classification over the whole Greater Lon-

don administrative boundaries: method Bl in figure 37 (p.79), also shown

in figure 16, method Bl400 in figure 38 (p.80), method Pl in figure 39 (p.81)

and method Pl150 in figure 40 (p.82). In the detailed maps, for each case

study a satellite view of the area is provided, together with maps showing

the building height and the classification by different types of aggregation.

Figure 17: Classification method Bl: close-up on the city centre.

The different types of aggregation clearly convey different scales of
information. Pl gives a very detailed level of information that can be used
to describe single buildings but needs aggregation to a larger level to be
able to describe an area. At the global scale, although the red colours are
visible and it is therefore possible to spot areas with taller buildings, it
is not possible to distinguish between different categories. Bl400 averages
far too much for a detailed view, but at the global scale it returns a very
immediate overview of the distribution of a few clusters of different
typologies within a vast majority of low-rise buildings. There is little
detail, though: for example, the whole borough of Islington is composed of
just two classes. Bl and Pl150 return enough detail for the close reading of
the case studies while remaining readable at a larger scale.

The overview in figure 16 (p.47) shows quite clearly the hierarchy of the
morphology of the city: a well-defined core of high-density environment
between the City of London and Westminster, surrounded by a ring of
medium-high density, and a progressive reduction of density towards the
fringes of the city, punctuated by clusters of mid-high density. The core
alternates mid-high rise buildings with areas dominated by towers, such as
south of Liverpool Street Station, Old Street, around the Museum of London
and the north end of Tottenham Court Road (figure 17, p.48). Outside the
core, the other high-density areas are Canary Wharf, with a very high
concentration of towers, Stratford, recently developed to high densities,
part of the South Bank and Nine Elms. Outside this second ring, we can still
find some clusters of medium density such as Richmond, Wembley, East Croydon
and Bromley, but these are rather small compared to the vast surrounding
areas of low rise - low coverage, which make up most of the peripheral
boroughs.

4.2 Case studies

Figure 18: Case study locations.

The areas taken as case studies (figure 18) have been selected to include a
range of possible scenarios: Angel is a central residential area with an
active high street, Bank is a prime central location for offices, Emerson
Park is a suburb-style residential neighbourhood on the outskirts, East
Croydon is a dense core around a transport node and Swiss Cottage is a
residential scheme designed on the modernist principle of dispersed medium
and high buildings. For detailed satellite views and maps, see appendix 6.3
(p.83).

4.2.1 Case study I - Angel

Credits Microsoft Bing Maps 2016.

Figure 19: Bird’s eye view of Angel.

Angel station, in the borough of Islington, sits in an area at the junction

between three important roads: Upper Street coming from Highbury &

Islington, Pentonville Road coming from Kings Cross and City Road coming

from Old Street. Upper Street is a commercial street, where Pentonville

Road has commerce on just the north side and City Road has residential

and offices only. It is a centrally located, commercial strip surrounded by

residential areas. As we can see in figure 19, the two sides of Upper Street,

next to the junction, are compact and densely built, while the south side

of the junction and beyond those compact blocks, is primarily composed of

small residential buildings. The aggregation by blockBl returns a reasonable

picture of the typology of the environment as it is able to separate the

compact areas (mid-low rise - high coverage) along three main commercial

axis from the more residential areas (mid-low rise - low coverage). In the

bottom-left corner, it discerns between two high-rise towers, one occupying

most of the plot and the other set in a little park. The finer classification

of method Pl reveals three mid-high rise - high coverage plots at the junction,

typical of intensively exploited sites where commercial activities and offices are

concentrated. The classification does not distinguish between old terrace

houses and relatively recent mid-rise council blocks. This, of course, is due

to the fact that they share the same height and the same coverage, although

the character of the neighbourhood they define and the experience at the

pedestrian level they produce are quite different. The former face the

road and have individual access to each residential unit, with gardens at the back.

The latter are often composed of long bars built with wide setbacks.

4.2.2 Case study II - Bank

Figure 20: Bird’s eye view of Bank.

Bank is part of the Square Mile, the old part of the city, in the borough of

the City of London. It is a prime business district, with few residents and

a primary function of high-end offices. Because the land is very valuable,

it is one of the most densely built areas, with buildings mostly 7-12 storeys

high, but with the smallest population density (figure 35, p.77).

The classification identifies the area as mid-high rise - medium coverage and

mid-high rise - high coverage, with the peak of density south of Liverpool

Street Station, where there is a high concentration of towers (high-rise -

medium coverage and high-rise - high coverage). This particular area is a very

lively neighbourhood, rich in high-end offices and the activities that serve

them. It is also completely deserted during the weekends, when offices

are closed. This is an important feature that defines the character of the

area and it is not, of course, detected by the classification, which is purely

based on geometric features. This proves to be an important limitation of

this type of classification.

4.2.3 Case study III - East Croydon

Figure 21: Bird’s eye view of East Croydon.

East Croydon is part of the borough of Croydon, in the south of the city.

It is a mid-high density cluster around a transportation node, within an

area that is dominated by low-rise residential typologies. If we look at the

Bl aggregation, two areas of mid-high density sit on the two sides of the

railway, with mid-low rise - high coverage blocks around, which sharply

change to the low-rise residential areas of terraced houses. If we look at the

results around the tower, east of the railway, for method Pl150 the mid-high

rise class is extended to small plots of two storey buildings, confirming the

intuition that rather than giving an exact result, the aggregation by range

can be interpreted as the perception of the character of the area from the

street.

4.2.4 Case study IV - Emerson Park

Figure 22: Bird’s eye view of Emerson Park.

Emerson Park is an area in the borough of Havering, at the eastern limit

of Greater London. It is a uniform residential neighbourhood composed

solely of detached or semi-detached houses with large private gardens and

green public areas. The whole area falls consistently in the lowest density

category, low rise - low coverage, although at the plot level Pl we can see

minor variations up to mid-low rise.

4.2.5 Case study V - Swiss Cottage

Figure 23: Bird’s eye view of Swiss Cottage.

The area considered is next to Swiss Cottage station, along Adelaide Road.

It is predominantly residential, with large green surfaces and without a

commercial high street. The building typology is, for the vast majority,

composed of single apartment buildings between 3 and 6 storeys high with setbacks.

Most of the plots fall into the mid-low rise - medium coverage class, with the

relevant exception of the central block where there are 5 tower blocks above

12 storeys high. If we compare this typology with the residential blocks west

of Angel, which are composed primarily of terraced houses and private

gardens, overall the spatial configuration of Adelaide Road delivers less floor

space per unit of land than the traditional terraced house neighbourhoods.

4.2.6 Case study VI - Other observations

Although the present classification returns a realistic picture of the built

environment, some limitations of the methodology emerge from detailed

observation. First of all, how relevant or functional the classes are depends

on the scope of the classification, on the audience the results are aimed at and

on the cultural context. This would affect the choice of the thresholds and

also the number of categories. For example, in order to avoid a multiplication

of classes, in this study GSI was divided into three bands, with the result

that completely rural areas are not differentiated from suburban residential

neighbourhoods.
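The dependence of the result on the chosen thresholds can be illustrated with a minimal sketch of a manual, band-based classifier. The class names are those used in this study; the break values and the function name are hypothetical placeholders chosen for illustration, not the thresholds used for the London classification.

```python
from bisect import bisect_right

# Hypothetical class boundaries, chosen only for illustration.
# A value equal to a break falls in the higher class.
HEIGHT_BREAKS = [3, 6, 12]   # average number of storeys
HEIGHT_LABELS = ["low rise", "mid-low rise", "mid-high rise", "high rise"]
GSI_BREAKS = [0.2, 0.4]      # coverage ratio (footprint area / land area)
GSI_LABELS = ["low coverage", "medium coverage", "high coverage"]


def classify(storeys: float, gsi: float) -> str:
    """Return a 'height - coverage' label for one block or plot."""
    height = HEIGHT_LABELS[bisect_right(HEIGHT_BREAKS, storeys)]
    coverage = GSI_LABELS[bisect_right(GSI_BREAKS, gsi)]
    return f"{height} - {coverage}"


print(classify(1, 0.01))   # an almost rural block
print(classify(2, 0.15))   # a suburban residential block
print(classify(8, 0.55))   # a compact central block
```

With only three coverage bands, the first two calls return the same label, which mirrors the limitation described above: a fourth, lower band would separate rural from suburban areas at the cost of more classes.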

The same problem of having different spatial layouts achieving the same

dwelling density holds with a two-dimensional index, where the combinations

are greater but still limited. In figure 24 (p.57), the area around Wembley

Stadium is classified as mid-low rise - high coverage, suggesting a dense

urban environment, similar to Angel, whereas it is an area of big warehouses

and car parks. The same problem is visible in Enfield, shown in figure 25

(p.57). If we look at figure 26 (p.58), the area between Angel and King’s

Cross poses another challenge: the zone marked with A and the one with B

share the same class but they are clearly different types of urban landscapes:

A is composed of long, recently built blocks, while B is mainly composed of

Victorian terraced houses. These problems could be addressed by adding

more geometrical features to the classification, such as the compactness of

the building, footprint shapes or setbacks.

Figure 24: Classification method Bl: close-up on the Wembley area.

Figure 25: Classification method Bl: close-up on the Enfield area.

Figure 26: Classification method Bl: close-up on the area west of Angel.

4.3 Classification summary

Figure 27: Average FSI by classes of urban typology.

FSI describes the level of exploitation of the land in terms of quantity of floor

space per unit of land. Figure 27 (p.59) shows the average FSI by class.

Besides the class high rise - high coverage, which corresponds to high towers

in a very dense configuration, the highest density of built environment is

obtained in the mid-high rise - high coverage class, which corresponds to

a Paris-like configuration, where blocks are fully built with buildings six to

twelve storeys high. The level of coverage, GSI, is crucial in delivering

high density neighbourhoods.
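As a worked illustration of why coverage matters, the sketch below computes FSI, GSI and the average number of storeys (FSI divided by GSI) for two blocks on the same land area. The two blocks and all their figures are made up for the example; only the definitions of the indices follow the density measures of Berghauser Pont & Haupt (2010).

```python
def density_indices(land_area, footprint_area, floor_area):
    """Return (FSI, GSI, average storeys) for one block.

    FSI = gross floor area / land area
    GSI = building footprint area / land area
    average storeys = FSI / GSI
    """
    fsi = floor_area / land_area
    gsi = footprint_area / land_area
    return fsi, gsi, fsi / gsi


# Two hypothetical blocks of 10,000 m2 each.
# Compact block: 60% of the land covered by 7-storey buildings.
compact = density_indices(10_000, footprint_area=6_000, floor_area=6_000 * 7)
# Towers in a park: 15% of the land covered by 20-storey towers.
towers = density_indices(10_000, footprint_area=1_500, floor_area=1_500 * 20)

print("compact mid-rise: FSI=%.1f GSI=%.2f storeys=%.0f" % compact)
print("towers in a park: FSI=%.1f GSI=%.2f storeys=%.0f" % towers)
```

In this made-up case the compact seven-storey block delivers more floor space per unit of land than the twenty-storey towers, because the loss in height is more than compensated by the gain in coverage.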

Figure 28: Summary of the income by classes of urban typology.

As figure 27 shows, the high-coverage categories are always

denser than the low-coverage class of the next higher category, indicating

that denser areas are obtained with mid-high rise buildings in a compact

configuration, rather than high rise towers in a sparse layout. This also

confirms the observation about terraced houses and modern buildings in

section 4.2.6 (p.56).

If we look at the count of classes across Greater London (figure 29, p. 61),

there is a striking prevalence of low buildings. As can also be seen in

figure 16 (p.47), the prevalence of green areas is quite clear, but mostly

concentrated in the outer ring. Amongst these areas, there is a very high

number of low coverage areas, which correspond to an almost or fully rural

environment. Although this is a common trait of most cities, in this

case it is likely a consequence of the Green Belt policy (Thomas 1963), which

greatly limits expansion at the fringe of the city.

Figure 29: Blocks counted by class of urban typology.

4.4 Classes of neighbourhoods and income distribution

Figure 28 (p.60) shows the median income by class. There is a progression

from lower to higher income across the classes and, when aggregated by coverage

(the different tones of colour in the graph), it shows that higher income

classes have a slight preference for low-coverage areas. These have an average

income of £41400, whereas medium-coverage classes have about £38600 and

high-coverage classes about £39000. The differences are small but they are

all significant according to a t-test at the 95% confidence level. When aggregated by

height, there is no significant difference between the four classes. This is

probably explained by the fact that central areas are generally more in demand

for their centrality and can be afforded only by higher income classes,

and density is just the consequence of more people wanting to live there.

Figure 30: FSI vs income by classes of urban typology.
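The comparison of mean incomes between coverage classes can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which variant of the t-test was used, so Welch's version (unequal variances) is assumed here, and the income samples are invented.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical median incomes (GBP) of areas grouped by coverage class.
low_coverage = [41000, 43500, 39800, 42200, 40900, 41700]
high_coverage = [38500, 39900, 37800, 39400, 38700, 39600]

t = welch_t(low_coverage, high_coverage)
# With large samples, |t| > 1.96 corresponds roughly to the 95% level;
# small samples need the critical value of the t distribution instead.
print(round(t, 2))
```

A large absolute value of the statistic indicates that the difference between the two means is unlikely to be due to chance alone, even when the difference itself is small in absolute terms.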

Figure 30 (p.62) looks at the relationship between the density of built en-

vironment and the income, by comparing FSI and Median Income by Lower

Layer Super Output Areas (LSOA). The graph shows a small but significant

positive correlation but the linear model explains just 4% of the variation.

It is interesting, though, to look at local dynamics, to understand whether,

within a relatively homogenous area, such as a borough, there is a clear

trend that is not visible at the global scale. Figure 31 (p.64) looks at the

same relationship by borough. For clarity of visualisation, it shows only sta-

tistically significant results of linear models that can explain at least 5% of

the variation. Here there is a clear difference between the central boroughs,

where the correlation is positive and relatively stronger, and the outer ones,

where the correlation is inverted and also weaker. It appears that within

central and more affluent areas, there is a trend for higher income individuals

to move towards denser areas, whereas in peripheral areas the opposite

is true. In general, though, the concentration of the x values around a small

range of FSI and the presence of outliers make the interpretation of the

relationship problematic.
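The per-borough comparison described above can be sketched as one least-squares fit per borough, keeping only the models that explain at least 5% of the variation. The records, borough names and function name below are hypothetical, and the significance test applied in the study is omitted for brevity.

```python
def linear_fit(x, y):
    """Ordinary least squares fit y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared


# Hypothetical (borough, FSI, median income in GBP) records, one per LSOA.
records = [
    ("Central", 1.2, 42000), ("Central", 1.8, 47000),
    ("Central", 2.5, 52000), ("Central", 3.1, 51000),
    ("Outer", 0.2, 45000), ("Outer", 0.4, 41000),
    ("Outer", 0.6, 40000), ("Outer", 0.9, 36000),
]

# Group the FSI and income values by borough.
by_borough = {}
for borough, fsi, income in records:
    xs, ys = by_borough.setdefault(borough, ([], []))
    xs.append(fsi)
    ys.append(income)

fits = {name: linear_fit(xs, ys) for name, (xs, ys) in by_borough.items()}
for name, (intercept, slope, r_squared) in fits.items():
    # Keep only models that explain at least 5% of the variation.
    if r_squared >= 0.05:
        print(f"{name}: slope={slope:.0f} GBP per FSI unit, R2={r_squared:.2f}")
```

The sign of the slope distinguishes the two behaviours discussed in the text: positive where income grows with built density, negative where it falls.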

Figure 31: FSI v median income by borough.

4.5 Classes of neighbourhoods and population density

Figure 32: Population density by classes of urban typology.

Although the built environment density increases by class of coverage (figure

27, p.59), the population density hardly follows this trend (figure 32, p.65).

The graph shows a very mild positive trend and FSI explains just 15%

of the variation in population density (figure 33, p.66), which is a rather

counter-intuitive result. This is evident when we compare the population density

in figure 35 (p.77) with the classification in figure 37 (p.79): the central

boroughs of the City of London and Westminster, with the highest concentration

of high-density classes, also have the lowest population density. One possible

explanation is that high density blocks in London have a mostly business and

commercial character, whereas the favourite neighbourhood type for living in

the UK is by far a low density one.

Figure 33: Built environment density (FSI) v population density.

5 Conclusions

In these pages, I have briefly summarised two radically different views of the

city, which stem from different cultural backgrounds and result in different

approaches towards the city. Although I gave a very limited picture of the

current debate and practice around urban planning and design, it was

functional to introduce the question of the urban form and its classification.

Most of the studies that I have found in literature, including the one I took

as a reference, have been applied on relatively small portions of the cities

without much automation. The work presented here, which stemmed

from these considerations, had the aim of building a workflow characterised

by effectiveness in managing readily available data, scalability in extend-

ing the analysis to large geographical extents and simplicity in the use of

intuitive quantities and indexes. The results of the classification have an

immediate descriptive power but they need further refinement to be able to

catch the vast array of nuances that the urban landscape offers. In improving

the descriptive power, though, it is useful to keep in mind the need for

a balance between general and detailed description and the necessity of

disseminating results to different disciplines.

At one point in the development of this study, I faced the question

of whether to use a manual classification, setting the thresholds by relying

mostly on my cultural experience, or to use clustering techniques such

as k-means or DBSCAN to extract classes out of geometrical features. After

some tests, I decided to go for the manual approach, because I was not

convinced of the relevance of the classes obtained by running the

clustering algorithms on only two parameters. I think an interesting approach

would have been to run such algorithms on a much wider

set of geometrical parameters and then try to interpret the results based

on urban typo-morphology, a method tested on a small neighbourhood, for

example, in Gil et al. (2012). Unless already available as datasets, the features

used in Gil et al. (2012), such as pavement width, street width, building

orientation and pedestrian area, each require a manual definition or specific GIS

techniques to calculate them from a normal city survey. The first way is not

applicable to large extents such as Greater London and the second would

require resources outside the scope of this work. The upside of the approach

described in these pages lies in the use of widely available datasets such as

the city survey or the street network and in the simplicity and immediacy of

the results. The result is a tiny step towards the development of a process

easily understandable and replicable by a non-specialised audience.
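For readers unfamiliar with the alternative mentioned above, the following is a minimal sketch of k-means run on only two parameters (coverage and height). It does not reproduce the tests carried out for this study: the blocks, the rescaling of the storeys and the initial centroids are all hypothetical choices made for the example.

```python
def kmeans(points, centroids, iterations=20):
    """Plain k-means (Lloyd's algorithm) on tuples of equal length."""
    clusters = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its members; keep it if it has none.
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical blocks described by (GSI, average storeys).
blocks = [(0.10, 2), (0.15, 2), (0.12, 3), (0.55, 4),
          (0.60, 5), (0.50, 4), (0.45, 14), (0.40, 18)]
# Rescale storeys to a 0-1 range so that height does not dominate the distance.
scaled = [(gsi, storeys / 20) for gsi, storeys in blocks]

centroids, clusters = kmeans(scaled, centroids=[scaled[0], scaled[3], scaled[6]])
for centroid, members in zip(centroids, clusters):
    print(f"centroid GSI={centroid[0]:.2f}, storeys={centroid[1] * 20:.1f}, "
          f"{len(members)} blocks")
```

The resulting groups depend on the number of clusters, on the rescaling and on the starting centroids, none of which carries an urban meaning by itself, which is one reason why the classes obtained this way need to be interpreted afterwards.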

References

Alexander, E. R. (1993), ‘Density measures: A review and analysis’, Journal

of Architectural and Planning Research pp. 181–202.

Alterman, R. (1997), ‘The challenge of farmland preservation: lessons from

a six-nation comparison’, Journal of the American Planning Association

63(2), 220–243.

American Planning Association (1999), Planning communities for the 21st

century, The Association.

Arcaute, E., Hatna, E., Ferguson, P., Youn, H., Johansson, A. & Batty, M.

(2013), ‘Constructing cities, deconstructing scaling laws’, ArXiv e-prints.

Azhar, S. (2011), ‘Building information modeling (bim): Trends, benefits,

risks, and challenges for the aec industry’, Leadership and Management

in Engineering 11(3), 241–252.

Batty, M. (2013), The new science of cities, MIT press.

Berghauser Pont, M. & Haupt, P. (2005), ‘The Spacemate: Density and the

Typomorphology of the Urban Fabric’, Urbanism Laboratory for Cities

and Regions Progress of Research Issues in Urbanism 2007 4(4), 55–68.

Berghauser Pont, M. & Haupt, P. (2010), Space, density and urban form,

NAi Publishers Rotterdam.

Burchell, R. W. & Listokin, D. (1995), Land, infrastructure, housing costs

and fiscal impacts associated with growth: The literature on the impacts

of sprawl versus managed growth, Technical report, Lincoln Institute of

Land Policy.

Burchell, R. W., Shad, N. A., Listokin, D., Phillips, H., Downs, A., Seskin,

S., Davis, J. S., Moore, T., Helton, D. & Gall, M. (1998), The costs of

sprawl-revisited, Transit Cooperative Research Program.

Burton, T. & Matson, L. (1996), ‘Urban footprints: making best use of

urban land and resources - a rural perspective’, The compact city: A sustainable

urban form pp. 298–301.

CENTRO (2012), Annual Statistical Report, Technical report, West Mid-

lands Integrated Transport Authority.

Cervero, R. & Guerra, E. (2011), Urban densities and transit: A multi-

dimensional perspective, Institute of Transportation Studies, University

of California, Berkeley.

CETAT (1986), Indicateurs morphologiques pour l’amenagement: analyse de

50 perimetres batis situes sur le Canton de Geneve. Presentation generale.

Vol. 1, Departement des traveaux publics de Geneve.

URL: https://books.google.co.uk/books?id=s4fWZwEACAAJ

Churchman, A. (1999), ‘Disentangling the concept of density’, Journal of

Planning Literature 13(4), 389–411.

Commission of the European Communities (1990), ‘Green paper on the ur-

ban environment’.

Conzen, M. P. (1978), ‘Analytical approaches to the urban landscape’, Di-

mensions of human geography. Chicago: University of Chicago pp. 128–65.

Corbusier, L. (1964), La ville radieuse, Vincent, Freal & Cie.

Dantzig, G. B. & Saaty, T. L. (1973), Compact city: a plan for liveable urban

environment, Freeman.

Dark, S. J. & Bram, D. (2007), ‘The modifiable areal unit problem (maup)

in physical geography’, Progress in Physical Geography 31(5), 471–479.

ECOTEC (1993), ‘Reducing transport emissions through planning’.

EEA (1999), Environment in the European Union at the turn of the century,

Technical report, European Environment Agency (EEA).

Ewing, R. (1997), ‘Is Los Angeles-style sprawl desirable?’, Journal of the

American planning association 63(1), 107–126.

Frank, J. E. (1989), The costs of alternative development patterns: A review

of the literature, Urban Land Inst.

Gil, J., Beirao, J. N., Montenegro, N. & Duarte, J. P. (2012), ‘On the

discovery of urban typologies: data mining the many dimensions of urban

form’, Urban morphology 16(1), 27.

Gordon, P. & Richardson, H. W. (1997), ‘Are Compact Cities a Desir-

able Planning Goal?’, Journal of the American Planning Association

63(1), 95–106.

Greater London Authority (2006), ‘London plan’.

Hillier, B. (2007), ‘Space is the machine: a configurational theory of archi-

tecture’.

Jackson, K. T. (1985), Crabgrass frontier: The suburbanization of the United

States, Oxford University Press.

Jacobs, J. (1961), The death and life of great American cities, Vintage.

Krupat, E. (1985), People in cities: The urban environment and its effects,

number 6, Cambridge University Press.

Leccese, M. & McCormick, K. (2000), Charter of the new urbanism,

McGraw-Hill Professional.

Marshall, S. (2009), ‘Cities, design and evolution’.

Martin, L. & March, L. (1972), Urban space and structures, number 1, Cam-

bridge University Press.

Maryland Department of Planning (1997).

Melia, S., Parkhurst, G. & Barton, H. (2011), ‘The paradox of intensifica-

tion’, Transport Policy 18(1), 46–52.

Moudon, A. V. (1994), Getting to know the building landscape: typomor-

phology, in K. Franck & L. Schneekloth, eds, ‘Ordering space: types in

architecture and design’, Van Nostrand Reinhold, New York, pp. 289–311.

Moudon, A. V. & Lee, C. (2009), ‘Urbanism by numbers’, Making the

Metropolitan Landscape: Standing Firm on Middle Ground p. 57.

Muratori, S. (1960), Studi per un’operante storia urbana di Venezia, Istituto

poligrafico dello Stato, Roma.

Muthesius, S. (1982), The English terraced house, Vol. 140, Yale University

Press New Haven.

National Research Council (1999), Our common journey: a transition to-

ward sustainability, National Academies Press.

Newman, P. (1992), ‘The compact city: an australian perspective’, Built

Environment (1978-) pp. 285–300.

Newman, P. G. & Kenworthy, J. R. (1989a), Cities and automobile depen-

dence: An international sourcebook, Gower Publishing.

Newman, P. W. & Kenworthy, J. R. (1989b), ‘Gasoline consumption and

cities: a comparison of us cities with a global survey’, Journal of the

american planning association 55(1), 24–37.

OECD (2012), Compact City Policies: A Comparative Assessment, OECD

Publishing.

Patricios, N. (2002), ‘Urban design principles of the original neighborhood

concepts’, Urban morphology 6(1).

Per, A. F. (2008), D Book: Density, Data, Diagrams, Dwellings, a+t

ediciones.

Pont, M. B. & Marcus, L. (2014), ‘Innovations in measuring density: From

area and location density to accessible and perceived density’, Nordic

Journal of Architectural Research 26(2).

Priest, D. (1977), Large-scale development: benefits, constraints, and state

and local policy incentives, Urban Land Institute.

Ratti, C., Baker, N. & Steemers, K. (2005), ‘Energy consumption and urban

texture’, Energy and Buildings 37(7), 762–776.

Real Estate Research Corporation (1974), The costs of sprawl: Environmen-

tal and economic costs of alternative residential development patterns at

the urban fringe, Technical report, Council on Environmental Quality.

Schumacher, P. (2009), ‘Parametricism: A new global style for architecture

and urban design’, Architectural Design 79(4), 14–23.

Schumacher, P. (2011), The Autopoiesis of Architecture: a new framework

for Architecture, Vol. 1, John Wiley & Sons.

Sir Howard, E. (1898), To-morrow: A Peaceful Path to Real Reform,

Routledge.

Stretton, H. (1996), Density, Efficacy and Equality in Australian Cities, in

M. Jenks, E. Burton & K. Williams, eds, ‘The Compact City: a Sus-

tainable Urban Form?’, E&FN SPON, London and New York, chapter

Compact City Theory.

Thomas, D. (1963), ‘London’s green belt: the evolution of an idea’, The

Geographical Journal 129(1), 14–24.

UK Government (1994), ‘Sustainable development: The UK strategy’.

United Nations (1992), Agenda 21, Technical report, United Nations.

Williams, K., Burton, E. & Jenks, M. (1996), ‘Achieving the compact city

through intensification: an acceptable option’, The compact city: A sus-

tainable urban form pp. 83–96.

Windsor, D. (1979), ‘A critique of the costs of sprawl’, Journal of the Amer-

ican Planning Association 45(3), 279–292.

6 Appendix

6.1 London data: overview maps

Figure 34: London data: building heights

Figure 35: London data: population density by LSOA

Figure 36: London data: median income by LSOA

6.2 Classification results: overview maps

Figure 37: Classification method Bl: London overview

Figure 38: Classification method Bl400: London overview

Figure 39: Classification method Pl: London overview

Figure 40: Classification method Pl150: London overview

6.3 Case studies: detailed maps

Figure 41: Case study locations

6.3.1 Case study I: Angel

Figure 42: Angel: Satellite view and building heights

Figure 43: Angel: Method Bl by block and Bl400 by block in range 400 m.

Figure 44: Angel: Method Pl by plot and Pl150 by plot in range 150 m.

6.3.2 Case study II: Bank

Figure 45: Bank: Satellite view and building heights

Figure 46: Bank: Method Bl by block and Bl400 by block in range 400 m.

Figure 47: Bank: Method Pl by plot and Pl150 by plot in range 150 m.

6.3.3 Case study III: East Croydon

Figure 48: East Croydon: Satellite view and building heights

Figure 49: East Croydon: Method Bl by block and Bl400 by block in range 400 m.

Figure 50: East Croydon: Method Pl by plot and Pl150 by plot in range 150 m.

6.3.4 Case study IV: Emerson Park

Figure 51: Emerson Park: Satellite view and building heights

Figure 52: Emerson Park: Method Bl by block and Bl400 by block in range 400 m.

Figure 53: Emerson Park: Method Pl by plot and Pl150 by plot in range 150 m.

6.3.5 Case study V: Swiss Cottage

Figure 54: Swiss Cottage: Satellite view and building heights

Figure 55: Swiss Cottage: Method Bl by block and Bl400 by block in range 400 m.

Figure 56: Swiss Cottage: Method Pl by plot and Pl150 by plot in range 150 m.

6.4 Code

6.4.1 Data import

# By Duccio Aiazzi as part of the MSc Smart Cities and Urban Analytics at CASA - UCL
# This script is used to import data into PostgreSQL / PostGIS tables
from subprocess import run
import os
import datetime
import psycopg2

#### OS Mastermap - Topographic layer ####
## Building footprint shapes
# In order to import gml files of the OS Mastermap topographic layer, the script
# navigates through the data folder, retrieves the names of the files with a
# specific extension and runs ogr2ogr in the command shell
path = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/OS/'
ogr_statement = 'ogr2ogr -f "PostgreSQL" PG:"host=localhost dbname=msc user=postgres password=postgres schemas=london_buildings" SPATIAL_INDEX = FALSE '
filenames = []
fileroots = []
# Retrieve file names and paths
# See http://stackoverflow.com/questions/3964681/find-all-files-in-directory-with-extension-txt-in-python
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith(".gml.gz"):
            fileroots.append(root)
            filenames.append(file)

terminal_command = 'cd ' + fileroots[0] + '; ' + ogr_statement + filenames[0]
# Start importing process
dt = datetime.datetime.now()
print("Now running on " + fileroots[0] + "/" + filenames[0])
print('Starting Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
return_code = run(terminal_command, shell=True)  # import the first file, which creates the table
dt = datetime.datetime.now()
print('End Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
# All the others are appended to the newly created table
return_msg = []
for i in range(1, len(filenames)):
    filename = filenames[i]
    fileroot = fileroots[i]
    terminal_command = 'cd ' + fileroot + '; ' + ogr_statement + filename
    dt = datetime.datetime.now()
    print("Now running on " + fileroot + "/" + filename)
    print('Starting Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
    return_msg.append(run(terminal_command + ' -append', shell=True))
    dt = datetime.datetime.now()
    print('End Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
print("Job done")
# Create geometric index
conn = psycopg2.connect(database="msc", user="postgres", password="postgres", host="localhost", port="5432")
print("Open connection: successful")
cur = conn.cursor()
cur.execute('''
    CREATE INDEX topographicarea_wkb_geometry_geom_idx
      ON london_buildings.topographicarea
      USING gist
      (wkb_geometry);
    CREATE INDEX topographicline_wkb_geometry_geom_idx
      ON london_buildings.topographicline
      USING gist
      (wkb_geometry);
    CREATE INDEX cartographictext_wkb_geometry_geom_idx
      ON london_buildings.cartographictext
      USING gist
      (wkb_geometry);
''')
conn.commit()
conn.close()

Listing 1: Data import OS TopographicLayer.py
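The file discovery and command construction of Listing 1 can also be expressed without building shell strings, which avoids quoting problems with unusual paths. The sketch below is an alternative arrangement, not the script used for this study: the function names and the dry_run switch are hypothetical, and the layer-creation option of the original statement is left out.

```python
import os
import subprocess


def find_files(path, suffix):
    """Return (directory, file name) pairs for files under path ending with suffix."""
    found = []
    for root, _dirs, files in os.walk(path):
        for name in sorted(files):
            if name.endswith(suffix):
                found.append((root, name))
    return found


def build_ogr2ogr_command(pg_connection, file_name, append):
    """Build the ogr2ogr argument list for one file (no shell quoting needed)."""
    command = ["ogr2ogr", "-f", "PostgreSQL"]
    if append:
        command.append("-append")
    command += [pg_connection, file_name]
    return command


def import_all(path, suffix, pg_connection, dry_run=True):
    """Import every matching file: the first creates the tables, the rest append."""
    commands = []
    for i, (root, name) in enumerate(find_files(path, suffix)):
        command = build_ogr2ogr_command(pg_connection, name, append=i > 0)
        commands.append((root, command))
        if not dry_run:
            # Run inside the file's folder; check=True raises if ogr2ogr fails.
            subprocess.run(command, cwd=root, check=True)
    return commands


# With dry_run=True the commands are only listed, nothing is executed.
for folder, command in import_all(".", ".gml.gz", "PG:dbname=msc"):
    print(folder, " ".join(command))
```

Passing the arguments as a list and the working folder through cwd replaces the 'cd ...; ogr2ogr ...' string of the original, and check=True makes a failed import stop the loop instead of passing silently.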

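# By Duccio Aiazzi as part of the MSc Smart Cities and Urban Analytics at CASA - UCL
# This script is used to import data into PostgreSQL / PostGIS tables
from subprocess import run
import os
import datetime
import psycopg2

#### INSPIRE ####
## Cadastral parcels
path = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/INSPIRE/'
ogr_statement = 'ogr2ogr -f "PostgreSQL" PG:"host=localhost dbname=msc user=postgres password=postgres schemas=london_plots" SPATIAL_INDEX = FALSE '
filenames = []
fileroots = []
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith(".gml"):
            fileroots.append(root)
            filenames.append(file)
# The import creates the table
terminal_command = 'cd ' + fileroots[0] + '; ' + ogr_statement + filenames[0]
dt = datetime.datetime.now()
print("Now running on " + fileroots[0] + "/" + filenames[0])
print('Starting Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
return_code = run(terminal_command, shell=True)  # import the first file to create the table
dt = datetime.datetime.now()
print('End Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
return_msg = []
for i in range(1, len(filenames)):
    filename = filenames[i]
    fileroot = fileroots[i]
    terminal_command = 'cd ' + fileroot + '; ' + ogr_statement + filename
    dt = datetime.datetime.now()
    print("Now running on " + fileroot + "/" + filename)
    print('Starting Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
    return_msg.append(run(terminal_command + ' -append', shell=True))
    dt = datetime.datetime.now()
    print('End Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
# Clean geometry and create geometry index
conn = psycopg2.connect(database="msc", user="postgres", password="postgres", host="localhost", port="5432")
print("Open connection: successful")
cur = conn.cursor()
cur.execute('''
    CREATE TABLE london_plots.plots
    AS
    SELECT * FROM london_plots.predefined;
    UPDATE london_plots.plots
      SET wkb_geometry = cleanGeometry(wkb_geometry);
    ALTER TABLE london_plots.plots ADD PRIMARY KEY (ogc_fid);
    CREATE INDEX plots_geom_idx
      ON london_plots.plots
      USING gist
      (wkb_geometry);
''')
conn.commit()
conn.close()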

Listing 2: Data import InspireCadastralPlots.py

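# By Duccio Aiazzi as part of the MSc Smart Cities and Urban Analytics at CASA - UCL
# This script is used to import data into PostgreSQL / PostGIS tables
from subprocess import run
import os
import datetime
import psycopg2

#### OS Mastermap - ITN ####
## ITN file of the transportation network of England
path = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/OS_ITN-Full_England/'
ogr_statement = 'ogr2ogr -f "PostgreSQL" PG:"host=localhost dbname=msc user=postgres password=postgres schemas=england_itn" SPATIAL_INDEX = FALSE '
filenames = []
fileroots = []
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith(".gml"):
            fileroots.append(root)
            filenames.append(file)
# The import creates the table (http://www.gdal.org/drv_pg.html)
terminal_command = 'cd ' + fileroots[0] + '; ' + ogr_statement + filenames[0]
dt = datetime.datetime.now()
print("Now running on " + fileroots[0] + "/" + filenames[0])
print('Starting Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
return_code = run(terminal_command, shell=True)  # import the first file, which creates the table
dt = datetime.datetime.now()
print('End Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
return_msg_itn = []
for i in range(1, len(filenames)):
    filename = filenames[i]
    fileroot = fileroots[i]
    terminal_command = 'cd ' + fileroot + '; ' + ogr_statement + filename
    dt = datetime.datetime.now()
    print("Now running on " + fileroot + "/" + filename)
    print('Starting Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
    return_msg_itn.append(run(terminal_command + " -skipfailures -append", shell=True))
    dt = datetime.datetime.now()
    print('End Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
# Create geometric index
conn = psycopg2.connect(database="msc", user="postgres", password="postgres", host="localhost", port="5432")
print("Open connection: successful")
cur = conn.cursor()
cur.execute('''
    CREATE INDEX ferrynode_wkb_geometry_geom_idx
      ON england_itn.ferrynode
      USING gist
      (wkb_geometry);
    CREATE INDEX informationpoint_wkb_geometry_geom_idx
      ON england_itn.informationpoint
      USING gist
      (wkb_geometry);
    CREATE INDEX roadlink_wkb_geometry_geom_idx
      ON england_itn.roadlink
      USING gist
      (wkb_geometry);
    CREATE INDEX roadlinkinformation_wkb_geometry_geom_idx
      ON england_itn.roadlinkinformation
      USING gist
      (wkb_geometry);
    CREATE INDEX roadnode_wkb_geometry_geom_idx
      ON england_itn.roadnode
      USING gist
      (wkb_geometry);
    CREATE INDEX roadrouteinformation_wkb_geometry_geom_idx
      ON england_itn.roadrouteinformation
      USING gist
      (wkb_geometry);
''')
conn.commit()
conn.close()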

Listing 3: Data import OS ITN.py

# By Duccio Aiazzi as part of the MSc Smart Cities and Urban Analytics at CASA - UCL
# This script is used to import data into PostgreSQL / PostGIS tables
import os
import datetime
import psycopg2
import pandas as pd

#### OS Mastermap ####
## Building Heights
# Create table
conn = psycopg2.connect(database="msc", user="postgres", password="postgres", host="localhost", port="5432")
print("Open connection: successful")
cur = conn.cursor()
cur.execute('''
    DROP TABLE london_buildings.building_heights CASCADE;
    CREATE TABLE london_buildings.building_heights (
      os_topo_toid_digimap VARCHAR(48),
      os_topo_toid VARCHAR(48) NOT NULL,
      os_topo_version VARCHAR(48),
      bha_processdate VARCHAR(24),
      tileref VARCHAR(24),
      abshmin NUMERIC(5, 2),
      absh2 NUMERIC(5, 2),
      abshmax NUMERIC(5, 2),
      relh2 NUMERIC(5, 2),
      relmax NUMERIC(5, 2),
      bha_conf VARCHAR(24),
      PRIMARY KEY (os_topo_toid)
    );
''')
conn.commit()
conn.close()
# Clean the data from duplicates
path = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/OS/'
filepaths = []
filenames = []
building_heights = pd.DataFrame()
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith(".csv"):
            filepaths.append(root + '/' + file)
for file in filepaths:
    bh_temp = pd.read_csv(file)
    # DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent
    building_heights = pd.concat([building_heights, bh_temp])
duplicated = building_heights.duplicated()  # There are a total of 7 rows completely duplicated
building_heights.drop_duplicates(keep='first', inplace=True)
duplicated_ids = building_heights.duplicated('os_topo_toid')
duplicated_ids_digimap = building_heights.duplicated('os_topo_toid_digimap')  # No other duplicates
building_heights.to_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/OS/building_heights.csv', index=False, index_label=False)
# Import to PostgreSQL
conn = psycopg2.connect(database="msc", user="postgres", password="postgres", host="localhost", port="5432")
print("Open connection: successful")
cur = conn.cursor()
file = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/OS/building_heights.csv'
dt = datetime.datetime.now()
print("START: " + file)
print('Starting Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
sql_statement = 'COPY london_buildings.building_heights FROM \'' + file + '\' CSV HEADER;'
cur.execute(sql_statement)
conn.commit()
dt = datetime.datetime.now()
print('End Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
conn.close()

Listing 4: Data import BuildingHeights.py

tables
import datetime
import psycopg2
from sqlalchemy import create_engine
import pandas as pd
#### London's administrative boundaries ####
## London's boroughs
# Create SCHEMA

DROP SCHEMA london CASCADE;
CREATE SCHEMA london
    AUTHORIZATION postgres;
''')

conn.close()
# Import
path = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/04_SpatialDataCapture/00_Coursework/LondonGentrification/Data/ESRI/Boroughs/'
filename = 'england_lad_2011Polygon.shp'
shapefile = path + filename
schema = 'london.'
table = 'boroughs'
options = '-I -s 27700'
server = ' | psql -d msc -U postgres -W'
terminal_command = 'shp2pgsql %s %s %s%s %s' % (options, shapefile, schema, table, server)

print('Importing "%s"' % (filename))
minute).zfill(2))
zfill(2))
## London's wards
# Import
path = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/04_SpatialDataCapture/00_Coursework/LondonGentrification/Data/ESRI/'
filename = 'London_Ward_CityMerged.shp'

table = 'wards'
options = '-I -s 27700'
minute).zfill(2))
zfill(2))
## Greater London

DROP TABLE london.greater_london CASCADE;
CREATE TABLE london.greater_london
AS
SELECT ST_union(geom) geom FROM london.boroughs;
ALTER TABLE london.greater_london
    ADD COLUMN id BIGSERIAL PRIMARY KEY;
CREATE INDEX greater_london_geom_idx
    ON london.greater_london
    USING gist
    (geom);
''')

conn.close()

## River Thames - Remove river from blocks
terminal_command = "shp2pgsql -I -s 27700 /Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/River_Thames/Simplified/river_thames.shp support.river_thames -expolode_collections | psql -d msc -U postgres -W"
return_code = run(terminal_command, shell=True)

CREATE TABLE london.river_thames (
    id serial,
    geom geometry
);
INSERT INTO london.river_thames(geom)
    (SELECT (st_dump(geom)).geom from support.river_thames);
ALTER TABLE london.river_thames
    ADD PRIMARY KEY (id);
CREATE INDEX river_thames_geom_idx
    ON london.river_thames
    USING gist
    (geom);
-- Find blocks that fall within the river's shape
CREATE TABLE support.river_blocks AS
(
SELECT p.block_id, p.wkb_geometry
FROM london_blocks.blocks AS p
INNER JOIN london.river_thames AS n
ON ST_within(p.wkb_geometry, ST_buffer(n.geom, 20))
);
ALTER TABLE support.river_blocks
    ADD PRIMARY KEY (block_id);
CREATE INDEX river_blocks_geom_idx
    ON support.river_blocks
    USING gist

INSERT INTO support.river_blocks
    (SELECT block_id, wkb_geometry FROM london_blocks.blocks WHERE block_id IN (60874, 47766, 48147, 48111));
DELETE FROM london_blocks.blocks
    WHERE block_id IN (SELECT block_id FROM london_blocks.blocks WHERE block_id IN (SELECT b.block_id FROM support.river_blocks b));
''')

conn.close()

Listing 5: Data import LondonAdministrativeBoundaries.py

6.4.2 Data cleaning

-- DROP SCHEMA london_itn_topology CASCADE;
-- DELETE FROM topology.topology WHERE name = 'london_itn_topology';
SELECT CreateTopology('london_itn_topology', 27700, 0.1);
CREATE SCHEMA temp_itn
    AUTHORIZATION postgres;
-- Barking and Dagenham 22
BEGIN;
SET LOCAL work_mem = '96MB';
create table temp_itn.roadlink22 as (
    select * from london_itn.roadlink
    where ST_intersects(wkb_geometry, (select geom from london.boroughs where name = 'Barking and Dagenham'))
);
alter table temp_itn.roadlink22
    add primary key (ogc_fid);
create index roadlink22_spatial_idx
    on temp_itn.roadlink22
    using gist

COMMIT;

BEGIN;

SELECT
    ogc_fid,
    TopoGeo_AddLineString(
        'london_itn_topology', wkb_geometry
    ) As edge_id
FROM (
    SELECT ogc_fid, wkb_geometry FROM temp_itn.roadlink22
) As f;
COMMIT;
-- The process has to be repeated for each of the 33 boroughs
-- Railway
CREATE TABLE london_itn.railway AS (
SELECT
);
BEGIN;

create table temp_itn.railway as (
    select * from london_itn.roadlink
    where ST_intersects(wkb_geometry, (select geom from london.boroughs where name = 'Westminister'))
);
alter table temp_itn.roadlink30
    add primary key (ogc_fid);
create index roadlink30_spatial_idx
    on temp_itn.roadlink30
    using gist

COMMIT;

BEGIN;

SELECT
    ogc_fid,

        'london_itn_topology', wkb_geometry
    ) As edge_id
FROM (
    SELECT ogc_fid, wkb_geometry FROM temp_itn.roadlink30
) As f;
COMMIT;

-- RAILWAY
BEGIN;

SELECT
    gid,

        'london_itn_topology', st_linemerge(geom)
    ) As edge_id
FROM (
    SELECT gid, geom FROM london_itn.overground_rail
) As f;
COMMIT;

-- RIVER THAMES
BEGIN;

SELECT
    id,

        'london_itn_topology', ST_ExteriorRing(geom)
    ) As edge_id
FROM (
    SELECT id, geom FROM london.river_thames
) As f;
COMMIT;

---- Create block polygons ----
-- DROP SCHEMA london_blocks CASCADE;
-- DROP TABLE london_blocks.blocks CASCADE;
-- CREATE SCHEMA london_blocks
--     AUTHORIZATION postgres;
CREATE TABLE london_blocks.blocks(
    block_id int,
    wkb_geometry geometry,
    area_block real,
    compact_block real,
    perimeter_block real,
    borough_code character varying,
    borough_name character varying
);

DO
$do$
DECLARE i int;
BEGIN
FOR i IN SELECT face_id FROM london_itn_topology.face WHERE face_id != 0 LOOP
    INSERT INTO london_blocks.blocks (SELECT i, ST_GetFaceGeometry('london_itn_topology', i));
END LOOP;
END
$do$;
ALTER TABLE london_blocks.blocks

CREATE INDEX london_blocks
    ON london_blocks.blocks
    USING gist

DELETE FROM london_blocks.blocks
WHERE ST_within(
    (SELECT ST_centroid(wkb_geometry) FROM london_blocks.blocks)
    (SELECT geom FROM london.river_thames)
);
UPDATE london_blocks.blocks SET area_block = ST_area(wkb_geometry);
UPDATE london_blocks.blocks SET compact_block = area_block/(ST_Area(ST_MinimumBoundingCircle(wkb_geometry)));
UPDATE london_blocks.blocks SET perimeter_block = ST_perimeter(wkb_geometry);

Listing 6: Data createTopology.sql
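The compactness index stored in `compact_block` is the ratio between the area of a block and the area of its minimum bounding circle. The sketch below reproduces that ratio for the special case of a rectangle, where the smallest enclosing circle has the diagonal as its diameter. The function name and the dimensions are illustrative assumptions. PostGIS returns the bounding circle as a polygon approximation, so database values can differ slightly from this closed form.

```python
import math


def rectangle_compactness(width, height):
    """Area of a rectangle divided by the area of its minimum bounding circle.

    Illustrative helper for rectangles only; arbitrary block shapes need a
    general minimum-bounding-circle routine such as the PostGIS one.
    """
    area = width * height
    # For a rectangle the smallest enclosing circle has the diagonal as diameter.
    radius = math.hypot(width, height) / 2.0
    return area / (math.pi * radius ** 2)


# A square block scores 2/pi; an elongated strip of equal area scores far lower.
square = rectangle_compactness(100.0, 100.0)
strip = rectangle_compactness(400.0, 25.0)
print(round(square, 3), round(strip, 3))
```

The index ranges from values close to 0 for very elongated shapes up to 1 for a circle, which is why it separates compact perimeter blocks from thin residual strips along roads and railways.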

-- Duplicate topographic table as backup measure
create table london_buildings.building_shapes
as
select * from london_buildings.topographicarea;
-- add new columns
alter table london_buildings.building_shapes -- done
    add column relh real default 0,
    add column area real default 0,
    add column compactness real default 0,
    add column n_floors integer default 0;

-- Join with the building_heights table
UPDATE london_buildings.building_shapes s SET relh = h.relh2 FROM london_buildings.building_heights h WHERE s.fid = h.os_topo_toid; -- done
-- The updates below read each row's own columns, so no self-join is needed
-- Add area
UPDATE london_buildings.building_shapes SET area = ST_area(wkb_geometry);
-- Add n_floors
UPDATE london_buildings.building_shapes SET n_floors = relh/3;
-- Add compactness
UPDATE london_buildings.building_shapes SET compactness = area/(ST_Area(ST_MinimumBoundingCircle(wkb_geometry)));
-- Add n_floors with average 3.5
alter table london_buildings.building_shapes
    add column n_floors350 integer default 0;
UPDATE london_buildings.building_shapes SET n_floors = relh/3.5;

Listing 7: Data JoinBuildingHeights.sql
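The number of floors is derived from the relative building height by dividing by an assumed storey height (3 m, or 3.5 m in the alternative) and storing the result in an integer column. A minimal sketch of that conversion follows. The function name is hypothetical, and it assumes the integer column rounds to the nearest whole number, which is how PostgreSQL normally converts a real value to an integer.

```python
def estimate_floors(relative_height_m, floor_height_m=3.0):
    """Estimate the number of storeys from a building's relative height.

    Assumption: the result is rounded to the nearest integer, mimicking the
    assignment of a real value to an integer column.
    """
    return int(round(relative_height_m / floor_height_m))


# Made-up heights in metres.
for height in (1.0, 6.2, 9.4, 31.0):
    print(height, estimate_floors(height), estimate_floors(height, 3.5))
```

Very low structures round to zero floors under this rule, so a later filter on a positive floor count leaves them out of the density indices.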

-- Subset the building_shapes table
CREATE TABLE london_buildings.shapes AS
SELECT
    building_shapes.ogc_fid,
    building_shapes.wkb_geometry,
    building_shapes.fid,
    building_shapes.physicallevel,
    building_shapes.relh,
    building_shapes.area,
    building_shapes.compactness,
    building_shapes.n_floors
FROM london_buildings.building_shapes
WHERE ((building_shapes.n_floors > 0) AND (building_shapes.area > (6)::double precision));
-- Create indexes
ALTER TABLE london_buildings.shapes
    ADD PRIMARY KEY (ogc_fid);
CREATE INDEX shapes_wkb_geometry_spatial_idx
    ON london_buildings.shapes
    USING gist

-- Add shapes' centroids
ALTER TABLE london_buildings.shapes
    ADD COLUMN geom_centroids geometry;
CREATE INDEX shapes_centroids_spatial_idx
    ON london_buildings.shapes
    USING gist
    (geom_centroids);
UPDATE london_buildings.shapes SET geom_centroids = ST_centroid(wkb_geometry); -- done
ALTER TABLE london_plots.plots
    ADD COLUMN geom_plot_centroids geometry;
UPDATE london_plots.plots SET geom_plot_centroids = ST_centroid(wkb_geometry);
CREATE INDEX plot_centroids_spatial_idx
    ON london_plots.plots
    USING gist
    (geom_plot_centroids);

Listing 8: Data createBuildingShapesTable.sql

-- Spatial JOIN blocks-buildings
DROP TABLE london_index.block_index CASCADE;
CREATE TABLE london_index.block_index AS
(
SELECT a.block_id,
    a.wkb_geometry, a.area_block, count(b.area) AS building_count,
    SUM(b.area) AS total_footprint,
    SUM(b.area * n_floors) AS total_floor_surface,
    SUM(b.area)/a.area_block AS gsi,
    stddev_samp(b.area)/avg(b.area) AS gsi_sd,
    SUM(b.area * n_floors)/a.area_block AS fsi,
    stddev_samp(b.area*b.n_floors)/avg(b.area*b.n_floors) AS fsi_sd,
    SUM(b.area*b.n_floors)/SUM(b.area) AS w_avg_n_floors
FROM london_blocks.blocks AS a
INNER JOIN london_buildings.shapes AS b
ON ST_Within(b.geom_centroids, a.wkb_geometry)
GROUP BY a.block_id, a.wkb_geometry, a.area_block
);
ALTER TABLE london_index.block_index

CREATE INDEX block_index_spatial_idx
    ON london_index.block_index
    USING gist (wkb_geometry);
-- Spatial JOIN plots-buildings
DROP TABLE london_index.plot_index CASCADE;
CREATE TABLE london_index.plot_index AS
(
SELECT a.ogc_fid AS plot_id,
    a.wkb_geometry, a.area AS area_plot, count(b.area) AS building_count,
    SUM(b.area) AS total_footprint,
    SUM(b.area * n_floors) AS total_floor_surface,
    SUM(b.area)/a.area AS gsi,
    stddev_samp(b.area)/avg(b.area) AS gsi_sd,
    SUM(b.area * n_floors)/a.area AS fsi,
    stddev_samp(b.area*b.n_floors)/avg(b.area*b.n_floors) AS fsi_sd,
    SUM(b.area*b.n_floors)/SUM(b.area) AS w_avg_n_floors
FROM london_plots.plots AS a
INNER JOIN london_buildings.shapes AS b
ON ST_Within(b.geom_centroids, a.wkb_geometry)
GROUP BY a.ogc_fid, a.wkb_geometry, a.area
);
ALTER TABLE london_index.plot_index
    ADD PRIMARY KEY (plot_id);
CREATE INDEX plot_index_spatial_idx
    ON london_index.plot_index

Listing 9: Data joinBlocks.sql
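The aggregation in the listing above reduces to three ratios per block: the ground space index (total footprint over block area), the floor space index (total floor surface over block area) and the footprint-weighted average number of floors. The following sketch works through them on made-up figures. The function name, the input format and the numbers are assumptions for illustration only.

```python
import statistics


def block_density(block_area, buildings):
    """Density indices for one block.

    `buildings` is a list of (footprint_area, n_floors) pairs for the
    buildings whose centroid falls inside the block (hypothetical input).
    """
    footprints = [area for area, _ in buildings]
    floor_surfaces = [area * floors for area, floors in buildings]
    total_footprint = sum(footprints)
    total_floor_surface = sum(floor_surfaces)
    # Coefficient of variation of the footprints; undefined for one building.
    footprint_cv = (
        statistics.stdev(footprints) / statistics.mean(footprints)
        if len(footprints) > 1 else None
    )
    return {
        "gsi": total_footprint / block_area,
        "gsi_sd": footprint_cv,
        "fsi": total_floor_surface / block_area,
        "w_avg_n_floors": total_floor_surface / total_footprint,
    }


# A 10,000 m2 block with a 500 m2 two-storey and a 1,500 m2 four-storey building.
example = block_density(10000.0, [(500.0, 2), (1500.0, 4)])
print(example)
```

Here the footprints cover 2,000 m2 and the floor surface is 7,000 m2, so the block has a GSI of 0.2, an FSI of 0.7 and an average of 3.5 floors. The three values are linked by FSI = GSI x average floors.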

-------------------------------------- Block range 400 --------------------------------------
-- DROP TABLE support.temp_block_centroids CASCADE;
CREATE TABLE support.temp_block_centroids AS (
SELECT block_id, st_centroid(geom_block) as wkb_geometry, area_block,
    building_count, total_footprint, total_floor_surface, gsi, fsi, w_avg_n_floors
FROM london_index.block_cluster_labels)

ALTER TABLE support.temp_block_centroids

CREATE INDEX temp_centroids_blocks_spatial_idx
    ON support.temp_block_centroids
    USING gist

-- DROP TABLE support.temp_block_buffers400 CASCADE;
CREATE TABLE support.temp_block_buffers400 AS (
SELECT block_id, st_buffer(st_centroid(geom_block), 400, 'quad_segs=2') as wkb_geometry_buffer, geom_block FROM london_index.block_cluster_labels
);
ALTER TABLE support.temp_block_buffers400

CREATE INDEX temp_block_buffers400_spatial_idx
    ON support.temp_block_buffers400
    USING gist
    (wkb_geometry_buffer);
CREATE INDEX temp_block_buffer_blocks_spatial_idx
    ON support.temp_block_buffers400
    USING gist
    (geom_block);

-- DROP TABLE london_block_range.block_range400 CASCADE;
-- CREATE SCHEMA london_block_range
--     AUTHORIZATION postgres;

CREATE TABLE london_block_range.block_range400 AS
(
SELECT a.block_id,
    a.geom_block as wkb_geometry,
    SUM(building_count) AS building_count,
    SUM(total_footprint) AS total_footprint,
    SUM(total_floor_surface) AS total_floor_surface,
    SUM(b.gsi*b.total_floor_surface)/SUM(b.total_floor_surface) AS gsi,
    stddev_samp(b.gsi*b.total_floor_surface)/AVG(b.gsi*b.total_floor_surface) AS gsi_sd,
    SUM(b.fsi*b.total_floor_surface)/SUM(b.total_floor_surface) AS fsi,
    stddev_samp(b.fsi*b.total_floor_surface)/AVG(b.fsi*b.total_floor_surface) AS fsi_sd,
    SUM(b.w_avg_n_floors*b.total_floor_surface)/SUM(b.total_floor_surface) AS w_avg_n_floors,
    stddev_samp(b.w_avg_n_floors*b.total_floor_surface)/AVG(b.w_avg_n_floors*b.total_floor_surface) AS w_avg_n_floors_sd
FROM support.temp_block_buffers400 AS a
INNER JOIN support.temp_block_centroids AS b
ON ST_Within(b.wkb_geometry, a.wkb_geometry_buffer)
GROUP BY a.block_id
);
ALTER TABLE london_block_range.block_range400

CREATE INDEX plot_range400_block_spatial_idx
    ON london_block_range.block_range400

Listing 10: Data joinBlockRange400.sql
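The 400 m range query replaces each block's indices with an average over all blocks whose centroid falls inside a buffer around its own centroid, weighting every neighbour by its total floor surface. The sketch below shows the same weighting on three made-up blocks. The function name and the dictionary layout are assumptions, and a true circle is used where the SQL buffer with `quad_segs=2` is an octagonal approximation, so blocks close to the 400 m limit may be selected differently.

```python
import math


def neighbourhood_average(target_xy, blocks, field, radius=400.0):
    """Floor-surface-weighted mean of `field` over blocks near `target_xy`.

    Each block is a dict with centroid coordinates `x`, `y` in metres,
    a `total_floor_surface` weight and the index to average (hypothetical layout).
    """
    tx, ty = target_xy
    near = [
        b for b in blocks
        if math.hypot(b["x"] - tx, b["y"] - ty) <= radius
    ]
    total_weight = sum(b["total_floor_surface"] for b in near)
    weighted_sum = sum(b[field] * b["total_floor_surface"] for b in near)
    return weighted_sum / total_weight


# Made-up centroids: the third block lies beyond the 400 m range.
blocks = [
    {"x": 0.0, "y": 0.0, "gsi": 0.2, "total_floor_surface": 1000.0},
    {"x": 300.0, "y": 0.0, "gsi": 0.4, "total_floor_surface": 3000.0},
    {"x": 500.0, "y": 0.0, "gsi": 0.9, "total_floor_surface": 5000.0},
]
gsi_400 = neighbourhood_average((0.0, 0.0), blocks, "gsi")
print(gsi_400)
```

Because the weights are floor surfaces, large or tall blocks dominate the neighbourhood value and very small blocks have little effect on it.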

6.4.3 Classification

# This script is used to fetch the data from the database and run the analysis

import pandas.io.sql as psql
pd.set_option('display.width', 640)

#### Create common variables ####
root_dir = '/Users/duccioa/CLOUD/01_Cloud/01_Work/02_DataScience/UCL_SmartCities/08_Dissertation'
engine = create_engine('postgresql://postgres:postgres@localhost:5432/msc')
gsi_legend = {'low coverage': (0, 0.188), 'medium coverage': (0.1881, 0.277), 'high coverage': (0.2771, 1)}  # quantiles
building_height_legend = {'low rise': (0, 2.5), 'mid-low rise': (2.5, 6.5), 'mid-high rise': (6.5, 12.5), 'high rise': (12.5, 200)}

#### Load Data ####
# Create csv from sql query (reading query takes too long, easier to write/read csv)
psql.execute("copy (Select * From london_index.block_index) To '" + root_dir +
             "/03_Data/DbDump/block_index.csv' HEADER CSV;",
             engine)
df = pd.read_csv(root_dir + '/03_Data/DbDump/block_index.csv')
########################################### BLOCK CLASSIFICATION ###########################################
####### Single blocks #######
index = df.loc[(df.gsi <= 1) & (df.fsi > 0)]

building_height_labels = []
for i in index.w_avg_n_floors:
    for key, values in building_height_legend.items():
        if i >= min(values) and i < max(values):
            building_height_labels.append(key)
print(len(building_height_labels))
gsi_labels = []
for i in index.gsi:
    for key, values in gsi_legend.items():
        if i >= min(values) and i < max(values) + 0.0001:
            gsi_labels.append(key)
print(len(gsi_labels))
classification = []
for i in range(0, len(gsi_labels)):
    st = building_height_labels[i] + ' - ' + gsi_labels[i]
    classification.append(st)
print(len(classification))

classification = pd.DataFrame({'block_id': index.block_id, 'label': classification}, index=index.index)
index = pd.merge(index, classification)
index.replace('NaN', 0, inplace=True)
index = index[~index.block_id.isin([60762, 30571, 60769, 30497])]
summary = index.groupby('label').describe()
summary.to_csv(root_dir + '/03_Data/DbDump/block_summary.csv')
index.to_csv(root_dir + '/03_Data/DbDump/block_classification.csv', index=False, index_label=False)

psql.execute("DROP TABLE london_index.block_cluster_labels CASCADE;"
             "CREATE TABLE london_index.block_cluster_labels ("
             "block_id text,"
             "geom_block geometry,"
             "area_block float,"
             "building_count int,"
             "total_footprint float,"
             "total_floor_surface float,"
             "gsi float,"
             "gsi_sd float,"
             "fsi float,"
             "fsi_sd float,"
             "w_avg_n_floors float,"
             "label varchar(255));", engine)
psql.execute("copy london_index.block_cluster_labels from '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/block_classification.csv' CSV HEADER;", engine)
print('done')
psql.execute("ALTER TABLE london_index.block_cluster_labels ADD PRIMARY KEY (block_id);", engine)
print('done')
psql.execute("CREATE INDEX block_cluster_geom_idx ON london_index.block_cluster_labels USING gist (geom_block);", engine)
print('done')

####### Range 400 #######
#### Load Data ####

psql.execute("copy (Select * From london_block_range.block_range400) To '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/block_range400.csv' HEADER CSV;",
             engine)
df400 = pd.read_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/block_range400.csv')
index400 = df400.loc[(df400.gsi <= 1) & (df400.fsi > 0)]
## Manual classification
building_height_labels400 = []
for i in index400.w_avg_n_floors:

            building_height_labels400.append(key)
print(len(building_height_labels400))
gsi_labels400 = []
for i in index400.gsi:

            gsi_labels400.append(key)
print(len(gsi_labels400))
classification400 = []
for i in range(0, len(gsi_labels400)):
    st = building_height_labels400[i] + ' - ' + gsi_labels400[i]
    classification400.append(st)
print(len(classification400))

classification400 = pd.DataFrame({'block_id': index400.block_id, 'label': classification400}, index=index400.index)
index400 = pd.merge(index400, classification400)
index400.replace('NaN', 0, inplace=True)
summary = index400.groupby('label').describe()
summary.to_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/block400_summary.csv')
index400.to_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/block_classification400.csv', index=False, index_label=False)
psql.execute("DROP TABLE london_index.block_cluster_labels400;"
             "CREATE TABLE london_index.block_cluster_labels400 ("
             "block_id text,"
             "geom_block geometry,"
             "building_count int,"
             "total_footprint float,"
             "total_floor_surface float,"
             "gsi float,"
             "gsi_sd float,"
             "fsi float,"
             "fsi_sd float,"

             "w_avg_n_floors_sd float,"
             "label varchar(255));", engine)
psql.execute("copy london_index.block_cluster_labels400 from '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/block_classification400.csv' CSV HEADER;", engine)
print('done')
psql.execute("ALTER TABLE london_index.block_cluster_labels400 ADD PRIMARY KEY (block_id);", engine)
print('done')
psql.execute("CREATE INDEX block_cluster400_geom_idx ON london_index.block_cluster_labels400 USING gist (geom_block);", engine)
print('done')

Listing 11: Data classification blocks.py
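The manual classification assigns each block a height class and a coverage class from fixed thresholds and joins the two into one label. A compact way to express the same thresholds is a sorted list of break points searched with `bisect`. This is a sketch under stated assumptions, not the author's code: the names `HEIGHT_BREAKS`, `GSI_BREAKS` and `classify` are hypothetical, the coverage breaks include the 0.0001 tolerance used in the listing, and values above the top of the legend receive the highest class here instead of being left unlabelled.

```python
from bisect import bisect_right

# Upper limits of every class except the last one (half-open intervals).
HEIGHT_BREAKS = [2.5, 6.5, 12.5]
HEIGHT_LABELS = ["low rise", "mid-low rise", "mid-high rise", "high rise"]
GSI_BREAKS = [0.1881, 0.2771]
GSI_LABELS = ["low coverage", "medium coverage", "high coverage"]


def classify(w_avg_n_floors, gsi):
    """Return a combined label such as 'mid-low rise - medium coverage'."""
    # bisect_right counts the breaks that are <= the value, which is the
    # index of the class whose half-open interval contains it.
    height = HEIGHT_LABELS[bisect_right(HEIGHT_BREAKS, w_avg_n_floors)]
    coverage = GSI_LABELS[bisect_right(GSI_BREAKS, gsi)]
    return height + " - " + coverage


# Made-up blocks: (weighted average floors, GSI).
labels = [classify(floors, gsi) for floors, gsi in [(3.0, 0.2), (2.5, 0.188), (13.0, 0.5)]]
print(labels)
```

A single lookup per value always yields exactly one label per block, so the height and coverage lists cannot fall out of step with the rows of the table, which can happen in the nested loops when a value matches no interval.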

6.4.4 Income and population density

-- DROP TABLE london_index.msoa11_index CASCADE;
CREATE TABLE london_index.msoa11_index AS
(
SELECT a.gid, a.msoa11cd,
    a.geom as wkb_geometry,
    a.median_income_2012_13, a.people_per_sq_km, a.mid_2014_population,
    SUM(b.gsi*b.total_floorspace)/SUM(b.total_floorspace) AS gsi,
    SUM(b.fsi*b.total_floorspace)/SUM(b.total_floorspace) AS fsi,
    SUM(b.w_avg_n_floors*b.total_floorspace)/SUM(b.total_floorspace) AS w_avg_n_floors
FROM london.msoa11_pop AS a
INNER JOIN support.temp_plot_centroids AS b
ON ST_Within(b.wkb_geometry, a.geom)
WHERE b.gsi < 1
GROUP BY a.gid, a.msoa11cd, a.geom, a.median_income_2012_13, a.people_per_sq_km, a.mid_2014_population
);
ALTER TABLE london_index.msoa11_index
    ADD PRIMARY KEY (gid);
CREATE INDEX msoa11_index_spatial_idx
    ON london_index.msoa11_index

-- DROP TABLE london_index.lsoa11_index CASCADE;
CREATE TABLE london_index.lsoa11_index AS
(
SELECT a.gid, a.lsoa11cd,
    a.geom as wkb_geometry,

    SUM(b.gsi*b.total_floorspace)/SUM(b.total_floorspace) AS gsi,
    SUM(b.fsi*b.total_floorspace)/SUM(b.total_floorspace) AS fsi,
    SUM(b.w_avg_n_floors*b.total_floorspace)/SUM(b.total_floorspace) AS w_avg_n_floors
FROM london.lsoa11_pop AS a

GROUP BY a.gid, a.lsoa11cd, a.geom, a.median_income_2012_13, a.people_per_sq_km, a.mid_2014_population
);
ALTER TABLE london_index.lsoa11_index

CREATE INDEX lsoa11_index_spatial_idx
    ON london_index.lsoa11_index

Listing 12: Data joinIncomeTable.sql

# This script is used to fetch the data from the database and run the analysis
# First run Data_import_LondonAdministrativeBoundaries.py
import pandas.io.sql as psql

import datetime
import psycopg2

pd.set_option('display.width', 640)

#### Create common variables ####
:5432/msc')
gsi_legend = {'low coverage': (0, 0.33), 'medium coverage': (0.330001, 0.66), 'high coverage': (0.660001, 1)}
200)}

## London LSOA11
/03_Data/London/Administrative_Census_Boundaries/statistical-gis-boundaries-london/ESRI/'
filename = 'LSOA_2011_London_gen_MHW.shp'

table = 'lsoa11'
options = '-I -s 27700'
minute).zfill(2))
## Population data
# Import to Postgresql
df = pd.read_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/Administrative_Census_Boundaries/land-area-population-density-lsoa11.csv')
df.columns = [c.lower() for c in df.columns]
df.columns = [c.replace(' ', '_') for c in df.columns]
df.columns = [c.replace('-', '_') for c in df.columns]
:5432/msc')
df.to_sql("lsoa11_population", engine, schema='support')
df = pd.read_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/Office_National_Statistics/lsoa-household-income-estimates.csv')
df.columns = [c.lower() for c in df.columns]
df.columns = [c.replace(' ', '_') for c in df.columns]
df.columns = [c.replace('-', '_') for c in df.columns]
:5432/msc')
df.to_sql("lsoa11_income", engine, schema='support')

DROP TABLE IF EXISTS london.lsoa11_pop CASCADE;
CREATE TABLE london.lsoa11_pop AS (
SELECT t1.gid, t1.lsoa11cd, t2.lsoa_name, t1.geom, t2.median_income_2012_13, t2.people_per_sq_km, t2.mid_2014_population
FROM london.lsoa11 t1
JOIN (SELECT t1.index, t2.lsoa11_code, t2.lsoa_name, t2.median_income_2012_13,
    t1.people_per_sq_km, t1.mid_2014_population FROM support.lsoa11_population t1
    JOIN support.lsoa11_income t2
    ON t1.lsoa11_code=t2.lsoa11_code) t2
ON t1.lsoa11cd=t2.lsoa11_code
);
ALTER TABLE london.lsoa11_pop

CREATE INDEX lsoa11_pop_geom_idx
    ON london.lsoa11_pop
    USING gist (geom);
DROP TABLE support.lsoa11_population CASCADE;
DROP TABLE support.lsoa11_income CASCADE;
DROP TABLE london.lsoa11 CASCADE;
''')

conn.close()
# Spatial join

DROP TABLE IF EXISTS london_index.lsoa11_index CASCADE;
CREATE TABLE london_index.lsoa11_index AS
(
SELECT a.gid, a.lsoa11cd,
    a.geom as wkb_geometry, a.lsoa_name,

    SUM(b.gsi*b.total_floor_surface)/SUM(b.total_floor_surface) AS gsi,
    SUM(b.fsi*b.total_floor_surface)/SUM(b.total_floor_surface) AS fsi,
    SUM(b.w_avg_n_floors*b.total_floor_surface)/SUM(b.total_floor_surface) AS w_avg_n_floors
FROM london.lsoa11_pop AS a

GROUP BY a.gid, a.lsoa11cd, a.geom, a.median_income_2012_13, a.people_per_sq_km, a.mid_2014_population, a.lsoa_name
);
ALTER TABLE london_index.lsoa11_index

CREATE INDEX lsoa11_index_spatial_idx
    ON london_index.lsoa11_index

''')

conn.close()
print('done')

psql.execute("copy (Select * From london_index.lsoa11_index) To '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/lsoa11_index.csv' HEADER CSV;",
             engine);
index = pd.read_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/lsoa11_index.csv')

## Manual classification
gsi_legend = {'low coverage': (0, 0.188), 'medium coverage': (0.188001, 0.277), 'high coverage': (0.277001, 1)}  # quantiles
200)}

building_height_labels = []
for i in index.w_avg_n_floors:

            building_height_labels.append(key)
print(len(building_height_labels))
gsi_labels = []
for i in index.gsi:

            gsi_labels.append(key)
print(len(gsi_labels))
classification = []
for i in range(0, len(gsi_labels)):
    st = building_height_labels[i] + ' - ' + gsi_labels[i]
    classification.append(st)
print(len(classification))

classification = pd.DataFrame({'gid': index.gid, 'label': classification}, index=index.index)
index = pd.merge(index, classification)
summary = index.groupby('label').describe()
summary.to_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/lasoa_summary.csv')
index.to_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/lsoa_classification.csv', index=False, index_label=False)
psql.execute("DROP TABLE IF EXISTS london_index.lsoa_labels CASCADE;"
             "CREATE TABLE london_index.lsoa_labels ("
             "gid int,"
             "lsoa11cd text,"
             "wkb_geometry geometry,"
             "lsoa_name text,"
             "median_income_2012_13 int,"
             "people_per_sq_km float,"
             "mid_2014_population int,"
             "gsi float,"
             "fsi float,"

             "label varchar(255));"
             "DROP TABLE IF EXISTS london_index.lsoa11_index CASCADE;", engine)
psql.execute("copy london_index.lsoa_labels from '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/lsoa_classification.csv' CSV HEADER;", engine)
print('done')
psql.execute("ALTER TABLE london_index.lsoa_labels ADD PRIMARY KEY (gid);", engine)
print('done')
psql.execute("CREATE INDEX lsoa_labels_spatial_idx ON london_index.lsoa_labels USING gist (wkb_geometry);", engine)
print('done')

Listing 13: Data classification income.py
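Before the population and income tables are written to the database, their column headers are lowercased and spaces and hyphens are replaced, so that the names can be used unquoted in SQL. The sketch below shows that normalisation on its own. The function name and the sample headers are illustrative assumptions, and the replacement character is taken to be an underscore, which is consistent with column names such as `people_per_sq_km` and `mid_2014_population` in the queries.

```python
def normalise_column(name):
    """Lowercase a header and replace spaces and hyphens with underscores."""
    return name.lower().replace(" ", "_").replace("-", "_")


# Hypothetical raw headers, as they might appear in a downloaded CSV.
raw_headers = ["LSOA11 Code", "People per Sq Km", "Mid-2014 population"]
clean_headers = [normalise_column(header) for header in raw_headers]
print(clean_headers)
```

Other punctuation, such as a slash in a year range, is left untouched by these two replacements and would need its own rule.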
