BENVGSA2
MSc Smart Cities and Urban Analytics
UCL
Urban form and density:
A typo-morphological classification of
London’s urban landscapes
Duccio Aiazzi
March 8, 2017
This dissertation is submitted in partial fulfilment of the requirements
for the degree of Master of Science Smart Cities and Urban Analytics from
University College London.
I, Duccio Aiazzi, confirm that the work presented in this thesis is my own.
Where information has been derived from other sources, I confirm that this
has been indicated in the thesis.
Word count: 11100
All Ordnance Survey data and the maps derived therefrom: Crown copy-
right 2016. All rights reserved.
Abstract
Urban density is widely used to study the city and is often related
to urban form. The use of an all-encompassing average measure
to describe the complexity and the variety of the urban landscape has
been questioned by several studies (Alexander 1993, Churchman 1999,
Martin & March 1972) and various authors have proposed the use
of multi-dimensional classifications to better address the problem of
describing and prescribing the urban form. In this work, I present the
results of a typo-morphological classification of neighbourhoods across
the boundaries of Greater London, based on the multi-dimensional
density measure developed by Berghauser Pont & Haupt (2010). The
aim of the work is to critically assess the validity of the method in
delivering a functional classification of different urban landscapes and
understanding how these are associated with population density and
income, by looking at Greater London. Another major point is to
develop an automated method that can cope with storage, analysis
and visualisation of large extents.
Contents
1 Overview
1.1 Motivations and research goal
1.2 Structure of the study
2 The context
2.1 Introduction
2.2 Urban morphology and typo-morphology
2.3 Urban science and generative design
2.4 Urban density
2.4.1 Definitions of physical density measures
2.4.2 Perceived density
2.4.3 The problem with density measures
2.5 Multi-dimensional approaches to urban form
3 Methodology
3.1 Conceptual framework
3.2 Definitions of parameters
3.3 Definition of classes
3.4 Data sources
3.5 Data storage
3.6 Data cleaning and manipulation
4 Analysis
4.1 The scale of aggregation
4.2 Case studies
4.2.1 Case study I - Angel
4.2.2 Case study II - Bank
4.2.3 Case study III - East Croydon
4.2.4 Case study IV - Emerson Park
4.2.5 Case study V - Swiss Cottage
4.2.6 Case study VI - Other observations
4.3 Classification summary
4.4 Classes of neighbourhoods and income distribution
4.5 Classes of neighbourhoods and population density
5 Conclusions
6 Appendix
6.1 London data: overview maps
6.2 Classification results: overview maps
6.3 Case studies: detailed maps
6.3.1 Case study I: Angel
6.3.2 Case study II: Bank
6.3.3 Case study III: East Croydon
6.3.4 Case study IV: Emerson Park
6.3.5 Case study V: Swiss Cottage
6.4 Code
6.4.1 Data import
6.4.2 Data cleaning
6.4.3 Classification
6.4.4 Income and population density
List of Figures
1 What are the specificities of the local culture?
2 The White Horse Tavern, located in New York City’s borough of Manhattan at Hudson Street and 11th Street, in the 60s.
3 Le Corbusier’s vision for the city of the future (Corbusier 1964).
4 Scale model of Le Corbusier’s Plan Voisin (Corbusier 1964).
5 The hierarchy of components in the urban fabric.
6 The hierarchy of components in the urban fabric.
7 Perceived density. Contributing factors (Alexander 1993).
8 Three blocks with the same density of 75 dwelling units per hectare (Per 2008).
9 Distribution of building count aggregated by block.
10 Space matrix diagram.
11 Correlation plot between FSI and GSI.
12 GSI v number of floors and class definition.
13 GSI distribution of blocks.
14 Count of the classes by GSI thresholds.
15 Street network and street blocks.
16 Classification method Bl: London overview.
17 Classification method Bl: close-up on the city centre.
18 Case study locations.
19 Bird’s eye view of Angel.
20 Bird’s eye view of Bank.
21 Bird’s eye view of East Croydon.
22 Bird’s eye view of Emerson Park.
23 Bird’s eye view of Swiss Cottage.
24 Classification method Bl: close-up on the Wembley area.
25 Classification method Bl: close-up on the Enfield area.
26 Classification method Bl: close-up on the area west of Angel.
27 Average FSI by classes of urban typology.
28 Summary of the income by classes of urban typology.
29 Blocks counted by class of urban typology.
30 FSI v income by classes of urban typology.
31 FSI v median income by borough.
32 Population density by classes of urban typology.
33 Built environment density (FSI) v population density.
34 London data: building heights
35 London data: population density by LSOA
36 London data: median income by LSOA
37 Classification method Bl: London overview
38 Classification method Bl400: London overview
39 Classification method Pl: London overview
40 Classification method Pl150: London overview
41 Case study locations
42 Angel: Satellite view and building heights
43 Angel: Method Bl by block and Bl400 by block in range 400 m.
44 Angel: Method Pl by plot and Pl150 by plot in range 150 m.
45 Bank: Satellite view and building heights
46 Bank: Method Bl by block and Bl400 by block in range 400 m.
47 Bank: Method Pl by plot and Pl150 by plot in range 150 m.
48 East Croydon: Satellite view and building heights
49 East Croydon: Method Bl by block and Bl400 by block in range 400 m.
50 East Croydon: Method Pl by plot and Pl150 by plot in range 150 m.
51 Emerson Park: Satellite view and building heights
52 Emerson Park: Method Bl by block and Bl400 by block in range 400 m.
53 Emerson Park: Method Pl by plot and Pl150 by plot in range 150 m.
54 Swiss Cottage: Satellite view and building heights
55 Swiss Cottage: Method Bl by block and Bl400 by block in range 400 m.
56 Swiss Cottage: Method Pl by plot and Pl150 by plot in range 150 m.
List of Tables
1 Class definition summary.
Listings
1 Data import OS TopographicLayer.py
2 Data import InspireCadastralPlots.py
3 Data import OS ITN.py
4 Data import BuildingHeights.py
5 Data import LondonAdministrativeBoundaries.py
6 Data createTopology.sql
7 Data JoinBuildingHeights.sql
8 Data createBuildingShapesTable.sql
9 Data joinBlocks.sql
10 Data joinBlockRange400.sql
11 Data classification blocks.py
12 Data joinIncomeTable.sql
13 Data classification income.py
1 Overview
1.1 Motivations and research goal
The present study is set in the context of urban planning and design, with
reference to the field of typo-morphology. Because cities are complex
systems, it is debatable whether a city is any more designable than a society
or an ecosystem. Marshall (2009) talks about cities as both “designed”
and “organic”. This means that cities simultaneously evolve in an
uncontrolled way and are shaped by conscious effort: they emerge from local
dynamics but also from deliberate acts of creation. In both cases, the
question of the optimal urban form is more about finding a balance between
different types of neighbourhoods than about one single unifying solution. It
is also a question of understanding the specificity of local cultures and
existing situations. To do so, whether for prescription or description, it is
useful to classify the urban typo-morphology, all the more so with a
quantitative approach that tries to bring objectivity into a field often
dominated by qualitative approaches and diverging schools of thought (Moudon 1994).
Image taken from the webpage http://www.masterplanningthefuture.org/?p=2700
Figure 1: What are the specificities of the local culture?
Simplicity should also be paramount, yet there is a need to overcome the
excessive simplifications in the practice of urban studies that derive from
the omnipresent use of a catch-all index such as urban density, which is
used for its immediacy but often imprecisely and without a shared
definition (Churchman 1999).
In the following pages, I present the results of a typo-morphological
classification of neighbourhoods across the boundaries of Greater London, based
on the multi-dimensional density measure developed by Berghauser Pont &
Haupt (2010). The aim of the work is to critically assess the validity of the
method in delivering a functional classification of different urban landscapes
and to understand how these are associated with population density and
income. Another major aim is to develop an automated method that can
cope with the storage, analysis and visualisation of large amounts of data, so
that the method can be applied over large geographical extents.
1.2 Structure of the study
Section 2 provides an introduction to the context by describing the main
references that form the basis of this work. It starts by schematically
outlining the debate around the urban form, then describes the basic concepts
of urban typo–morphology, which is the discipline that deals with the de-
scription and classification of the urban form. It also summarises the recent
surge of interest in generative processes and computational methods, and
finally defines the concept of urban density, some of the debate surrounding
it and some proposed alternatives.
The first half of Section 3 describes the rationale behind this study and
the methodology of investigation. The second part deals with the technical-
ities: the data sources, the tools used for the analysis, the workflow and the
manipulation of the data.
In Section 4 I apply the classification to Greater London. The section
first discusses the different scales of aggregation, their merits and problems.
It then describes five case studies in detail and interprets the results,
summarises the results by class and finally analyses the relationship
between the classification and two demographic measures: population
density and median income.
Section 5 summarises the results and describes the limitations.
The Appendix contains the maps that were not included in the main text and
the Python and SQL code used to perform the storage, cleaning and analysis
of the data.
2 The context
2.1 Introduction
There is a growing consensus on defining the city by the processes happening
within its location rather than by the physical environment. Batty (2013)
suggests that “instead of thinking of cities as sets of spaces, places, locations,
we need to think them as sets of actions, interactions and transactions” and
that “location is, in effect, a synthesis of interactions”.
In the city, multiple interacting systems transfer flows of information and
goods back and forth at different time scales, leading to constant changes
in the environment. Depending on the system, changes happen more or less
rapidly: information and people follow fast dynamics, land uses change less
frequently and infrastructure follows even slower dynamics. A key
element in understanding the city is the interaction between these slow and
fast dynamics, that is, the mutual relationship between the processes and the
built environment. To respond to the optimisation question posed
by the process of managing, planning and designing parts of cities, it is
important to understand how the built environment shapes and is shaped
by the flow of actions: in other words, the interaction between form and
function.
This is, of course, an old question which, according to Schumacher (2011),
defines the very essence of architecture and urban design. It is a question
that has been addressed in the past by looking at the city as a deterministic
machine, one that can be tuned and shaped around men’s actions,
following a reductionist approach well synthesised in the statement “form follows
function” of the modernist movement in architecture and urban design. Le
Corbusier and the other major protagonists of the modernist school realised
the importance of flows and actions in a city and designed grand plans
intended to respond to deterministic problems. They failed, however, to deal
with the complex nature of urban systems because they ignored the other side
of the relationship: how the form affects the function. The rigorous zoning
of residential and commercial areas, their separation and the uniformity of
the interventions led to a reduction in the complexity of the urban
environment, which in turn resulted in dormitory neighbourhoods disconnected
from daily productive activities. In trying to tame the inherent chaos
of cities, they did not realise that the chaos was exactly what gives cities
life.
“The stretch of Hudson Street where I live is each day the scene of an intricate sidewalk ballet” Jacobs (1961).
Figure 2: The White Horse Tavern,
located in New York City’s borough
of Manhattan at Hudson Street and
11th Street, in the 60s.
In complete opposition to the sanitising plans of the modernist era,
Jacobs (1961) describes a city where street life is the catalyst of the
activities that make a city successful and diversity is what fuels the
process. Against the mainstream idea of urban “hygiene” of her time, which
wanted a city defined by rigorous zones of uniform land use
(Sir Howard 1898, Corbusier 1964), she argues that diversity and
self-organising complexity are the defining features of the urban environment
and its ultimate goal. She carefully draws a pragmatic picture of street life,
where small decisions affected by apparently meaningless constraints multiply
their effect by aggregation and interaction to give form to what she calls
the urban ballet. Her work gradually gained momentum and her influence can be
seen in several movements that called for a city modelled on the human scale
and on small pedestrian movements, such as the Compact City (Dantzig & Saaty
1973, Williams et al. 1996), Smart Growth (Maryland Department of
Planning 1997) and New Urbanism (Leccese & McCormick 2000).
Figure 3: Le Corbusier’s vision for the city of the future (Corbusier 1964).
These two radically opposing views of what the city is also differ in
terms of the ingredients that contribute to its success. The form of the
built environment, and the relationship between its elements and the context,
is one key aspect of these differences. The modernist city is based on
fast transportation, where the street is a means to reach the destination
rather than a place in its own right. In its more extreme version, the
modernist city is spread out to give room to green parks scattered with
semi-autonomous residential units provided with all the basic features and
services. It is also a city strictly designed through a top-down process with
clear pre-determined zoning. The image in which Le Corbusier illustrates the
plan “ville radieuse” for Paris (figure 4) seems to suggest the gesture
of a man literally single-handedly donating a new future to the city. In
Jacobs’ view, instead, the path to a successful city goes through a healthy
street life made of a constant movement of people incentivised by mixed
and multiple activities happening along the street and inside the buildings.
Figure 4: Scale model of Le Corbus-
ier’s Plan Voisin (Corbusier 1964).
For street life to happen, cities must achieve a certain density threshold
that can sustain the diversity of activities and the interaction between
people. The city has to be walkable, be composed of short blocks, have
a hierarchy of services (from local to primary functions for the whole
city) and contain buildings of different ages and conditions. This view,
though, is not presented as a unique, unifying model
but as guidelines whose main purpose is to achieve multiplicity of choice for
the users and in the built environment as well.
2.2 Urban morphology and typo-morphology
“Typo-morphological studies reveal the physical and spatial structure of cities.
They are typological and morphological because they describe urban form
(morphology) based on a detailed classification of buildings and open spaces
by type (typology)” (Moudon 1994).
Moudon (1994) identifies three founding schools of the discipline centred in
Italy, England and France, each addressing different issues with different
methodologies. The main protagonists of the three schools are, respectively,
S. Muratori (Muratori 1960), M.P. Conzen (Conzen 1978) and the
school of Versailles with the architect Jean Castex, the urbanist Philippe
Panerai and the sociologist Charles Depaule.
From Patricios (2002), the hierarchical structure of the city. (a) Enclave, (b) block, (c) superblock, (d) neighbourhood.
Figure 5: The hierarchy of components in the urban fabric.
The first element of typo-morphology is the open space of the city, the
non-built one (the road, the park, the square, etc.). The second element is
the building with its geometric properties. These two elements combine in
different types of urban landscapes. The link between the two is the parcel
or plot, which is the basic element of the urban fabric. The urban fabric is
the set of properties that define these elements and their relationship with
each other. The urban fabric links the building scale and the city scale,
through a hierarchical aggregation of the basic units into enclaves, blocks,
super–blocks and neighbourhoods. These elements and their layout describe
the morphology (form) of the city.
On the other hand, the typology is defined by the buildings. A schematic
view of the building typologies is given by L. Martin and L. March in their
seminal work “Urban space and structure” (Martin & March 1972). They
define three broad types of built form (figure 6) that differ in the shape
of the building and its relationship with the plot. The pavilion represents an
isolated building with a small footprint that achieves high density through a
higher number of storeys. The court is the type that, given the constraint of
access to natural ventilation and daylight, achieves the highest coverage of
the plot. The street is a typology in between. These three categories have
been the basis of several studies in urban typo-morphology and in other
disciplines (see, for example, Alexander (1993) on density and Ratti et al.
(2005) on the environmental performance of different typologies).
The three dispositions are described respectively as: the pavilion or tower, the street or slab, the generating cruciform in a continuous pattern of courts.
Figure 6: The hierarchy of components in the urban fabric.
2.3 Urban science and generative design
The other important factor that sets the context of this research is the
recently revived focus on a quantitative approach to urban planning and
architectural design. In general, the last decade saw a spike in the
computational power cheaply available, thanks to improved chips but also to the
Internet and cloud services. There has also been a flood of available data
coming from mobile devices, which turned attention to the possibilities offered
by data analysis techniques. On the side of urban planning, for example,
Moudon & Lee (2009) call for an “Urbanism with numbers” and produce a
case study where socio-behavioural data can be linked with built environment
parameters to make a case for the use of numbers in urban planning.
Another example is space syntax, a set of theories and methods for
the analysis of spatial configurations that has been around since the 1970s
(Hillier 2007) and recently found new vigour with the spike in computational
capacity. In architectural theory, the last decade saw the
flourishing of parametric design (Schumacher 2009), developed initially by
the architect Zaha Hadid and academics at the Architectural Association
in London. Parametric or generative design is a design process that
relies on algorithms to generate geometries from the definition of a family of
initial parameters and the formal relations they keep with each other. It
is a design process that deals with the parameters that generate the form
rather than the form itself. It was originally developed in the aerospace and
automotive industries, where the focus is very much on performance.
In professional practice, architectural offices, together with all the other
agents involved in the planning, design and management of buildings and
infrastructure, use a rather new tool for delivering and sharing information:
Building Information Modelling (BIM), a family of database systems that store
and share digital representations of buildings (Azhar 2011), including the
geometry as well as materials, technical specifications, etc., all in one single
model.
If urban morphology is to be of support to urban planning and urban design,
there is a need to link different types of city form to their socio-economic
performance and to wellbeing statistics: this means that there is a need to
quantify urban forms in a way that can be scaled up in order to study large
systems.
2.4 Urban density
The debate about the urban form has intensified in recent decades,
partly in response to uncontrolled urban expansion and the concurrent
realisation of the environmental impact of the urban lifestyle. Since the
phenomenon of urban sprawl was recognised and defined (Real Estate
Research Corporation 1974, Burchell et al. 1998, Jackson 1985, Ewing
1997), several studies have tried to understand the relation
between urban form on one side and socio-economic and environmental
parameters on the other. One constant of this debate is the use of density
as an all-encompassing summary of the urban form.
Density concerns the urban form because it is a first, coarse approximation
commonly used for its simplicity as a prescription in the planning process,
mostly as a useful measure to describe the load on services and infrastruc-
tures but also because of its supposed relationship with the urban land-
scape. Churchman (1999), in her review of the use of urban density, found
that many disciplines make use of the concept. These include “planning,
urban design, architecture, environment-behavioral studies, transportation,
economics, sociology, psychology, anthropology, and ecology”.
Alexander (1993) identifies four main fields where urban density is exten-
sively used:
Psychological and Behavioural Studies: Mainly concerned with the perception of
crowding, privacy, territoriality and so on (see Krupat (1985) for a literature review).
Land and Urban Economics: How density affects the efficiency of the city in terms of
transportation, land consumption, business, etc. For a brief description, see
Section ??
Planning: Normative regulations involving density to control urban development.
Density and Urban Form: In urban morphology, several studies address the question
of the relation between density and urban form.
In land and urban economics, for example, transportation is one of the most
discussed topics. On economies of scale for transportation systems, Priest
(1977), Frank (1989) and Burchell & Listokin (1995) link higher density to less
expensive infrastructure. CENTRO (2012) explore the impact of density
on the use of public transportation, suggesting that compact forms reduce
the need for travelling. Cervero & Guerra (2011) analyse the density
thresholds required to make urban transport economically viable. On the use of
private transportation, Newman and Kenworthy have an extensive body of
work linking higher density with lower gasoline consumption (Newman &
Kenworthy 1989a,b, Newman 1992, ECOTEC 1993), but their conclusions
remain controversial (Gordon & Richardson 1997, Stretton 1996). Melia
et al. (2011) suggest that, although compact cities incentivise the use of
public transport, they are also prone to higher traffic congestion.
In terms of environment, there is unanimous consensus that
high densities reduce the need for agricultural and wild land (Burton &
Matson 1996, Alterman 1997), although land protection has been criticised
by Gordon & Richardson (1997) on market-economy grounds,
because restrictions on development force land into lower-valued uses. They
also argue that land is not currently scarce in the U.S. or in the
world and that food production has been rising steadily in the past
decades, so that, at the moment, there is no need to restrict land use.
Overall, the current prevalent view is that a certain level of population
density is required for various services to be cost-effective, and this is
reflected in the work of several institutions (Commission of the European
Communities 1990, United Nations 1992, National Research Council 1999, American
Planning Association 1999, EEA 1999) and in various planning guidance documents
across the world (Greater London Authority 2006, UK Government 1994,
OECD 2012).
2.4.1 Definitions of physical density measures
Density is a point measure of a quantity normalised by the area or volume
it occupies. Within the urban context, density is used to describe the
relationship between the surface of an areal unit A, such as a city block, a
neighbourhood or a whole city, and a quantity q such as building floor area,
population or residential units. Mathematically, it is simply described as:
$$d = \frac{q}{A}$$
When comparing different measures of density, the numerators can be
converted into each other, sometimes explicitly, sometimes by making
assumptions (e.g. building floor area can be converted to residential units by
assuming an average dwelling size). Depending on the assumptions, the
conversion can be more or less accurate.
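As a minimal sketch of the ratio d = q/A and of the conversion between numerators described above, the following Python snippet computes a dwelling density from residential floor area under two assumed average dwelling sizes. The function names, the site and both dwelling sizes are hypothetical and chosen only for illustration; they are not taken from the data used in this study.

```python
def density(quantity: float, area_ha: float) -> float:
    """Return d = q / A for a quantity q measured over an area A in hectares."""
    if area_ha <= 0:
        raise ValueError("area must be positive")
    return quantity / area_ha


def floor_area_to_dwellings(floor_area_m2: float, mean_dwelling_m2: float) -> float:
    """Estimate the number of dwellings from residential floor area.

    mean_dwelling_m2 is an assumed average dwelling size, not a measured value.
    """
    if mean_dwelling_m2 <= 0:
        raise ValueError("mean dwelling size must be positive")
    return floor_area_m2 / mean_dwelling_m2


# Hypothetical block: a 2 ha site with 15,000 m2 of residential floor area.
block_area_ha = 2.0
floor_area_m2 = 15_000.0

# The same floor area converted under two assumed average dwelling sizes.
dwellings_small = floor_area_to_dwellings(floor_area_m2, 75.0)
dwellings_large = floor_area_to_dwellings(floor_area_m2, 100.0)

# Dwelling density in dwelling units per hectare for each assumption.
density_small = density(dwellings_small, block_area_ha)
density_large = density(dwellings_large, block_area_ha)

print(density_small, density_large)
```

With these illustrative numbers the same block reads as 100 or 75 dwelling units per hectare depending only on the assumed dwelling size, which is why the accuracy of a converted density depends on the assumption behind it.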
The denominator may vary in two respects: the unit of measurement and
the definition of boundaries. The first case is just a matter of arithmetical
conversion. The second case, also known as the modifiable areal unit problem,
abbreviated as MAUP (section 2.4.3), can cause a great deal of
ambiguity and lead to incomparability between the results of different studies.
It is itself a subfield of study related to various disciplines (Dark & Bram 2007).
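The effect of the boundary definition can be illustrated with a toy example. The sketch below aggregates the same four cells of land into zones in two different ways and compares the resulting densities. The cell populations, the zonings and the helper `zone_densities` are all hypothetical and serve only to show the mechanism, not any result of this study.

```python
# Hypothetical strip of four cells of 1 ha each, with their resident counts.
cell_populations = [400, 0, 100, 100]


def zone_densities(populations, zoning, cell_area_ha=1.0):
    """Return residents per hectare for each zone.

    zoning is a list of zones, each zone being a list of cell indices.
    """
    result = []
    for zone in zoning:
        zone_population = sum(populations[i] for i in zone)
        zone_area_ha = cell_area_ha * len(zone)
        result.append(zone_population / zone_area_ha)
    return result


# Two ways of drawing the boundary between two zones over the same cells.
zoning_a = [[0, 1], [2, 3]]
zoning_b = [[0], [1, 2, 3]]

densities_a = zone_densities(cell_populations, zoning_a)
densities_b = zone_densities(cell_populations, zoning_b)

# The overall density is identical in both cases: 600 residents over 4 ha.
overall = sum(cell_populations) / len(cell_populations)

print(densities_a, densities_b, overall)
```

Under the first zoning the two zones have 200 and 100 residents per hectare; moving the boundary by one cell gives 400 and about 67, although nothing on the ground has changed and the overall density remains 150.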
Density measures can be categorised by their numerator (Berghauser Pont
& Haupt 2010):
Population and dwelling density: The first is expressed as the number of people
living in an area, the second as the number of dwelling units. As social
transformations are usually faster than transformations to the built
environment, population density is subject to higher variability over time.
Both are used to plan for services and infrastructure.
Land Use Intensity: Measures the ratio between the total floor space
(on all floors) in an area and the total surface of the area. It is known as
Floor Space Index (FSI) in Europe and Floor Area Ratio (FAR) in the U.S.A.
As the definition shows, land use intensity does not make use of population
and it includes all sorts of land use, not only residential. For this reason
it gives a better estimate of the urban form and is more suitable for
describing mixed-use areas than the previous measures.
Coverage: Or Ground Space Index (GSI), the ratio between the area of the
footprint of the buildings and the area of the site.
Height: Building height, expressed in total number of floors or in length
units. Although not a proper measure of density, building height affects the
way the environment is perceived and is also related to cultural factors (in
the UK, more than in other countries, there is a strong stereotype of the home
as a two-storey terraced house).
Spaciousness: Or Open Space Ratio (OSR), the ratio between open space area
and total floor surface.
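The indicators defined in the list above can be computed from very little data. The following Python sketch derives FSI, GSI, OSR and a mean number of floors for a site from building footprints and floor counts. It assumes that every floor of a building has the same area as its footprint, which is a simplification; the `Building` class, the function name and the two sites are hypothetical and are not the code or data used in this study.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Building:
    footprint_m2: float
    floors: int


def density_indicators(site_area_m2: float, buildings: List[Building]) -> Dict[str, float]:
    """Compute FSI, GSI, OSR and mean floors for one site.

    Assumption: each floor has the same area as the building footprint.
    """
    footprint = sum(b.footprint_m2 for b in buildings)
    floor_area = sum(b.footprint_m2 * b.floors for b in buildings)
    if site_area_m2 <= 0 or footprint <= 0 or floor_area <= 0:
        raise ValueError("site, footprint and floor area must be positive")
    return {
        "FSI": floor_area / site_area_m2,                # total floor space / site area
        "GSI": footprint / site_area_m2,                 # building footprint / site area
        "OSR": (site_area_m2 - footprint) / floor_area,  # open space / total floor space
        "mean_floors": floor_area / footprint,           # floor space per unit of footprint
    }


# Two hypothetical 1 ha sites with the same total floor space.
site_area_m2 = 10_000.0
tower_site = density_indicators(site_area_m2, [Building(1_000.0, 10)])
low_rise_site = density_indicators(site_area_m2, [Building(5_000.0, 2)])

print(tower_site)
print(low_rise_site)
```

Both sites have an FSI of 1.0, but the first covers 10% of the ground with ten floors and the second covers 50% with two, which illustrates why a single intensity figure does not identify the built form.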
A further categorisation can be made depending on the denominator (Alexan-
der 1993):
Net Dwelling Density (NDD) The nominator may be any quantity of population
or residential units suggested above. The denomi-
nator is ”the total land area devoted to residential
facilities”. This includes any private amenities area,
parking and access driveways but excluded are any
commercial activities, parking and local businesses
not directly below the dwelling structure, public
parks, institutions, schools and public streets.
Gross Residential Density (GRD) The numerator is the same as above; the
denominator is “the gross residential site area”. This
includes “the net residential area + half the area of
the perimeter roads + one quarter of the area of the
intersections”.
Neighbourhood Density (ND) The number of people, dwelling units, etc. per
unit of area of the total neighbourhood land. It
includes all the major services related to the
neighbourhood but excludes services and institutions
that serve the whole city or a wider area.
City Density (CD) This is the ratio between a chosen quantity as above
and the area of the whole city. One of the main
sources of ambiguity that can affect comparability
across cities and the interpretation of the index is
the definition of the boundaries. These can be defined
as the administrative boundaries, but these are often
arbitrary and can include rural and semi-rural areas.
2.4.2 Perceived density
Alexander (1993) extends the concept of density to qualitative attributes in
order to describe the perceived density, that is, how the built environment
is perceived by its users. The rationale behind this extended definition of
density is that, ultimately, the goal of disciplines such as urban planning,
architecture and urban morphology is to shape, analyse and predict the built
environment so that it is at the same time functional to all urban
activities (e.g. commerce, services, transport, leisure) and perceived as pleasant
and enjoyable by its users. Therefore it is necessary to quantify not only
physical measures but also qualitative attributes.
Figure 7: Perceived density. Contributing factors Alexander (1993).
Perceived density is a definition that tries to capture the physical elements
of the built environment as well as the way these affect the citizens.
Perceived density results from the interaction of three factors (figure 7, p.25):
physical density, individual cognitive factors and socio-cultural ones.
Individual factors are hard to measure on a large scale and can hardly be
incorporated in a quantitative descriptor that aims at summarising the
characteristics of a city. Socio-cultural factors can incorporate norms and standards
of a specific location and depend on the “homogeneity or heterogeneity of
the users of the environment, presence or absence of socio-culturally
regulated norms of interaction, levels of social interaction and the character of
activities in relevant setting”.
Physical density itself contains measured density and “qualitative density”.
The latter includes “those aspects of physical density that cannot be
measured and is generated by other relevant physical factors such as design
diversity, scale, etc.” (Alexander 1993). Perceived density, therefore, can
constitute the broader framework in which a classification of the built
environment based on physical density could be linked to all the other parameters
that affect the perception and the use of the built environment.
2.4.3 The problem with density measures
In view of this widespread use of density, the lack of a common
definition is certainly surprising. Churchman (1999), in her overview of the subject,
found no shared definition across studies, even within the same discipline.
It is also surprising that some of these studies, which have been the basis
of years of debate, have paid little attention to the definition of one of the
key parameters, to the point that seminal works such as “The cost of
sprawl” (Real Estate Research Corporation 1974, Burchell et al. 1998) have
been criticised for the loose use of density by Windsor (1979), who suggested
that the results are actually the opposite.
The two main problems affecting density measures are the previously
mentioned modifiable areal unit problem (MAUP) and the fact that density is
an average measure. The first problem raises serious questions about the
validity of various results because, depending on the definition of the
boundaries, results can differ wildly. Boundaries also vary over time, which
makes comparing results across long time spans difficult. A recent attempt
to address the arbitrariness of the boundaries and their variation in time
was made by Arcaute et al. (2013), who define a universal law for city
boundaries using the properties of the road network. This approach has proven
valid in dynamically defining the boundaries of whole cities. A similar way of
addressing the MAUP for the areal unit in density measures has
been tested in Pont & Marcus (2014) by using a type of location measure.
Location measures define the boundaries by setting the position of a
location and then calculating the amount of accessibility to a parameter of choice
(this can be the number of services, number of jobs, degree of a network,
etc.). The process can be reversed by setting the accessibility threshold and
calculating the radius necessary to achieve that accessibility. The result is
a dynamic areal unit for each part of the city. The second problem is
inextricably connected with the need for simplification: an all-encompassing
index that can describe a complex landscape at a glance is very appealing,
but the price is the loss of detail in favour of averages. Other problems are
related to the temporal dimension: cities show huge variations in
population, and therefore in density, across different times of the day, due to people
commuting to work. Because density is a static index, it completely misses this
type of dynamics, which is crucial to the understanding of the city.
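The reversed location measure described above (fix an accessibility threshold, then find the radius that reaches it) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the sample coordinates and the use of straight-line distance are hypothetical, and the cited studies may measure distance along the street network instead.

```python
import math

def radius_for_accessibility(origin, facilities, threshold):
    """Smallest radius around `origin` that contains `threshold` facilities.

    Straight-line distance is an assumption made for brevity.
    """
    if threshold < 1 or threshold > len(facilities):
        raise ValueError("threshold must be between 1 and the number of facilities")
    # Sort distances so that the k-th nearest facility gives the radius.
    distances = sorted(math.dist(origin, f) for f in facilities)
    return distances[threshold - 1]

# Hypothetical facility coordinates in metres.
facilities = [(0, 100), (300, 0), (0, -400), (500, 0)]
radius_two = radius_for_accessibility((0, 0), facilities, 2)
radius_four = radius_for_accessibility((0, 0), facilities, 4)
```

A location surrounded by many facilities yields a small radius and a sparse one a large radius, which is what makes the resulting areal unit dynamic.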
Figure 8: Three blocks with the same density of 75 dwelling units per hectare (Per
2008)
In urban morphology, one main issue is that a mere measure of physical
density is not enough to encompass all the different variations and
possibilities. The same single value of a density measure can be obtained with very
different building configurations. Figure 8 (p.28) illustrates the concept by
presenting three residential schemes that achieve the same dwelling density:
one with a neighbourhood of terraced houses, one with apartment blocks
around a courtyard and another with a tower within a park and a parking
lot.
2.5 Multi-dimensional approaches to urban form
In their book “Space, density and urban form”, Berghauser Pont & Haupt
(2010) compare the different methods of measuring density defined in
Section 2.4.1 and analyse how they perform in describing the urban form. They
find that population and dwelling density are poor performers and that, although
FSI is a better indicator, none of the methods above maintains a one-to-one
relation with a type of form: the same value is always associated with
several forms. They propose the use of a multi-variable density concept
composed of measures of physical density of the built environment in order
to be able to classify different urban typologies. The variables included in
the index are: Floor Space Index (FSI), Ground Space Index (GSI), Open
Space Ratio (OSR) and number of storeys (L). Martin &
March (1972) were amongst the first to try to quantify the geometrical and
relational characteristics of different spatial layouts for the purpose of
classification. Another example is the systematic gathering and analysis of
a set of parameters of several neighbourhoods in Switzerland by the CETAT
(1986). Amongst the parameters, they collect FSI, GSI, volumes, ratios of
sidewalks, pedestrian areas, parking, green areas, etc. More recently Gil et al.
(2012) presented a systematic and multi-dimensional method of description
and classification based on typo-morphology that makes use of data mining
techniques. The types of parameters used are similar to those of the Swiss study, to
which they add street network centrality measures (degree, closeness and
betweenness) and block orientation. These parameters are then classified
using k-means analysis to obtain six classes of blocks and four of streets.
3 Methodology
3.1 Conceptual framework
The rationale behind the present analysis is to explore an effective way
of using parameters that can be derived from the geometry of the city,
hence available as GIS databases, in order to understand the urban typo–
morphology. Using as a starting point the work done by Berghauser Pont
& Haupt (2005) on their space-matrix, I aim at defining the geometrical
properties of the buildings and their relationship with the adjacent environ-
ment that are functional to a typological classification of neighbourhoods.
The key questions that this study tries to address are: what are the
different types of neighbourhoods, what are their defining geometrical features
and what building density, that is FSI, do they deliver? At the end of this
study, I also analyse the relation between neighbourhood typologies and
income and population density. As described in detail in section 3.2, I use
the Ground Space Index (GSI) and the building height (L) to assign a class
to each areal unit (defined below) across the territory of Greater London.
The classes are manually defined by setting thresholds along the GSI
and the number of storeys: the classes are derived from the combination of
three intervals on GSI and four intervals on L, for a total of twelve classes.
Figure 9: Distribution of building count aggregated by block.
Four different classifications are proposed, each differing in the definition
of the areal unit. The first method, referred to as Bl, uses the city block,
defined as the smallest area enclosed by a loop of roads. A second method
Pl uses the cadastral parcel, or plot. A third way, Bl400 and Pl150, consists
in using a range r to aggregate the parameters. As shown in figure 9 for
method Bl, areal units often contain more than one building, therefore the
parameters are calculated by aggregating the buildings contained in each
unit. The advantage of this aggregation is that the character of an area is
given by the sum of its components, rather than the single element. On
the other hand, as always when generalising, some important details might
get lost. The use of plots in method Pl achieves a more granular approach
because plots generally contain just one or very few buildings. The third
method helps overcome a problem that affects both the previous ones. The
experience of the environment perceived by a user moving across the city
is defined by the buildings along the street. It should therefore be an average
of the characteristics of the two sides. Both of the first two methods look at
the two sides of the road separately, whereas using a range to aggregate
the parameters makes it possible to take into account area averages that
include both sides. To summarise, the four methods of aggregation will be
referred to as below:
Bl aggregation by street block.
Pl aggregation by cadastral plot.
Bl400 aggregation by street block in a range of 400 m. This length has
been chosen as the standard walkable catchment for day-to-day
facilities used in the literature.
Pl150 aggregation by cadastral plot in a range of 150 m. In this case the length
has been chosen by trial-and-error refinement to find the right
balance between aggregation and detail.
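As an illustration of aggregation in a range, the sketch below sums the footprints of the buildings whose centroid falls within a radius r of a centre point and divides by the area of the circle. The centre point, the centroid-based assignment, the function name and the sample data are assumptions made for the example, not a description of the scripts used in this study.

```python
import math

def range_coverage(centre, buildings, radius_m):
    """Share of a circle of radius `radius_m` covered by building footprints.

    `buildings` is an iterable of (x, y, footprint_m2) tuples, where (x, y)
    is the building centroid. A building counts as inside the range when
    its centroid lies within the circle (an assumption of this sketch).
    """
    circle_area = math.pi * radius_m ** 2
    covered = sum(
        footprint
        for x, y, footprint in buildings
        if math.dist(centre, (x, y)) <= radius_m
    )
    return covered / circle_area

# Hypothetical buildings: two close to the centre, one 500 m away.
buildings = [(0, 50, 2000.0), (100, 0, 3000.0), (0, -500, 4000.0)]
coverage_150 = range_coverage((0, 0), buildings, 150)
coverage_400 = range_coverage((0, 0), buildings, 400)
```

The larger range covers a wider area, so the same buildings produce a lower, smoother value than the 150 m range, which is the trade-off between aggregation and detail mentioned above.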
In Berghauser Pont & Haupt (2005) the approach has been to select a few
different types of neighbourhoods, already clearly defined in terms of
typology, and then analyse where they fit within the Spacematrix and how they
cluster. My approach is similar: I first argue for the choice of parameters
for the classification, then define the thresholds for each class and run the
classification, and then verify what typology is identified by each class. In
general, in the following description, I will be using method Bl and refer to
the results of the others when these are relevant.
3.2 Definitions of parameters
As described in section 2.4.3, a mono-dimensional index such as density is
not enough to distinguish different urban typologies from each other, as
the same density value can be associated with very different spatial layouts.
Berghauser Pont & Haupt (2005) propose the use of a multi-dimensional
index that takes into account other geometric properties, such as the ratio
of the built area of a plot, its ratio of total floor space built, etc. Below
I list the basic components of these properties as they come from the
data source (described in section 3.4, p.41), the definition of the principal
measures used for the classification and the assumptions made in the present
study:
u Areal unit. It is defined, depending on the method, as the street
block (method Bl), the cadastral parcel or plot (method Pl) or
the circle of radius r (methods Bl400 and
Pl150).
L Number of storeys of a building. In the present study, as the
available OS database gives the building height H in metres, I
assume the average floor-to-floor height in London to be
$\bar{H} = 3$ m and therefore:
$$L = \frac{H}{\bar{H}}$$
My assumption, of course, is questionable, but I was not able
to find any study of average floor heights in London. In
my experience, 3 m is a good approximation of the standard
floor-to-floor height in the construction industry, but in practice this
can vary drastically: from the 2.4 m standard of new residential
buildings and old working-class terraced houses, to 3-3.2 m in
wealthier period houses, up to 4 or 5 m for the ground floors of large
commercial premises or office buildings.
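To make the effect of this assumption concrete, the following sketch converts a building height into a number of storeys for different floor-to-floor heights. The function name and the 12 m example building are hypothetical, and the result is left unrounded because the text does not state whether storeys are rounded.

```python
def storeys_from_height(height_m, floor_height_m=3.0):
    """Estimate the number of storeys as building height / floor-to-floor height."""
    if height_m <= 0 or floor_height_m <= 0:
        raise ValueError("heights must be positive")
    return height_m / floor_height_m

# Sensitivity of the estimate for a hypothetical 12 m high building,
# using the range of floor heights mentioned in the text.
estimates = {h: storeys_from_height(12.0, h) for h in (2.4, 3.0, 4.0)}
```

The same building is read as roughly five, four or three storeys depending on the assumed floor height, which matters most near the class thresholds.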
Lw Because most of the areal units, whether blocks, plots or
ranges, include several buildings, the value L of the areal unit
is the weighted average of the buildings using the total floor area
AT as weight. Therefore, the weighted average of the number of
storeys of an areal unit u with n buildings can be written as:
$$L^{w}_{u} = \frac{\sum_{i=1}^{n} L_i A_{T,i}}{\sum_{i=1}^{n} A_{T,i}}$$
Au Surface of the areal unit. In the case of blocks and plots, it is the
area of the geometric shape. In the case of the range it is the area
of the circle, $A_u = \pi r^2$, with r = 400 m for method Bl400
and r = 150 m for method Pl150.
Af Area of the footprint of a building. In the OS data, the footprint
of the building is the building shape in the Topographic Layer.
AT Total floor area of a building. It is calculated by multiplying the
footprint area by the number of floors:
$$A_T = A_f L$$
GSI Ground Space Index. It represents the share of built area of
the areal unit u and is calculated by adding up the footprint area
of each of the n buildings within the areal unit and dividing it by
the area of the latter:
$$GSI_u = \frac{\sum_{i=1}^{n} A_{f,i}}{A_u}$$
FSI Floor Space Index. It represents the ratio of total floor area
to the size of the areal unit. It is calculated as the
sum of the footprints times the number of floors of each of the n
buildings within the areal unit, divided by the area of the latter:
$$FSI_u = \frac{\sum_{i=1}^{n} A_{f,i} L_i}{A_u}$$
OSR Open Space Ratio or spaciousness. It is a measure of the amount
of non-built space at ground level per square metre of gross floor
area:
$$OSR_u = \frac{1 - GSI_u}{FSI_u}$$
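The definitions above can be combined into a single calculation per areal unit. The sketch below is a minimal illustration: the class and function names and the two-building block are hypothetical, and the handling of empty units (zero floor area) is an assumption, since the text does not say how they are treated.

```python
from dataclasses import dataclass

@dataclass
class Building:
    footprint_m2: float  # A_f
    storeys: float       # L

def unit_indices(buildings, unit_area_m2):
    """Return (GSI, FSI, OSR, Lw) for one areal unit of area A_u."""
    footprint = sum(b.footprint_m2 for b in buildings)
    # Total floor area of each building: A_T = A_f * L
    floor_areas = [b.footprint_m2 * b.storeys for b in buildings]
    total_floor = sum(floor_areas)
    gsi = footprint / unit_area_m2
    fsi = total_floor / unit_area_m2
    # Assumption: units without floor area get an infinite OSR and Lw = 0.
    osr = (1 - gsi) / fsi if fsi > 0 else float("inf")
    lw = (
        sum(b.storeys * a for b, a in zip(buildings, floor_areas)) / total_floor
        if total_floor > 0
        else 0.0
    )
    return gsi, fsi, osr, lw

# Hypothetical block of 10,000 m2 with a low building and a taller one.
block = [Building(1000.0, 2.0), Building(500.0, 8.0)]
gsi, fsi, osr, lw = unit_indices(block, 10_000.0)
```

Because the weights are floor areas, the taller building dominates the weighted number of storeys even though its footprint is smaller.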
The space-matrix (figure 10) is the visual aid to the classification used in
Berghauser Pont & Haupt (2005) and Berghauser Pont & Haupt (2010). The
x and y axes represent GSI and FSI respectively. OSR and
L are gradients that complete the visual information, but for the purpose of
the classification, and therefore the placement of an area within the graph,
they are superfluous. In fact, FSI is also redundant and can be replaced with L.
If we look at figure 11 (p.36), GSI and FSI are closely correlated.
Figure 10: Space matrix diagram.
This is by definition, because FSI in the case of a single building on an
areal unit, for example, is equal to GSI multiplied by L. The fact that in
this context FSI and L are interchangeable makes their use dependent on
the type of dataset available and on the context in which the information
is used, i.e. to whom the results are aimed. In some cases, FSI might be a
known quantity as part of the records kept by the planning authority. In
other contexts, such as the one of this study, it is easier to access the building
height database. Because the building height is part of the raw data, for
the purpose of this study I will be using Lw, the weighted average number
of storeys of an areal unit, instead of FSI.
Figure 11: Correlation plot between FSI and GSI.
3.3 Definition of classes
The left plot in figure 12 (p.37) refers to the result of the aggregation Bl and
shows the GSI values by block against the weighted average of the number
of storeys. In the right plot the same values are overlaid with
the thresholds of the classes, which are used for the classification of the data
that come from each of the four aggregation methods.
Figure 12: GSI v number of floors and class definition.
The grid divides the space into twelve cells that represent twelve classes
of urban typologies. The bottom-left cell represents 1-2 storey high buildings
that are scattered in open land, a sort of rural context. The top-left one
has the same dispersion but very tall buildings, probably indicating
residential towers surrounded by green areas. The top-right cell, instead, probably
contains office towers in a highly exploited area such as the City. Just
below, in the mid-high rise - high coverage cell, we can probably expect a
Paris-style neighbourhood, with dense blocks and relatively high buildings.
Figure 13: GSI distribution of blocks.
The definition of the thresholds is crucial to the classification work but
also debatable. Urban typologies have a strong subjective component, can
vary amongst different subjects and cultures and their definition is subject
to a certain degree of arbitrariness. For the definition of the building height
thresholds (figure 12, right graph), I rely on a mix of facts and observation.
The first class is up to 2 storeys high, as this is a clear feature of most of
the residential units in the UK (Muthesius 1982). The second class, up to 6
storeys, corresponds to the upper limit of medium-size office buildings and
residential estates. Buildings up to 12 storeys high correspond to residential
towers, high-density offices or apartment blocks. Above 12, we are in
the realm of towers. The building height thresholds fall halfway between
the integer value marks because they are applied to weighted averages, so in
order to identify the typology up to two storeys high, for example, the
threshold is set to 2.5.
Figure 14: Count of the classes by GSI thresholds.
For the GSI thresholds, I tested two methods (figure 13, p.38): the one
in blue is based on three equal-length intervals within [0:1], the one in red
is based on three equal quantiles (cut at 0.33 and 0.66). The resulting
classification of the two options can be observed in figure 14: the division in
equal-length intervals returns a very high number of low-rise - low coverage
units with little variation in the low-rise category, whereas the division in
quantiles produces a much richer and more diverse classification and is
therefore the one selected for the rest of the study. Table 1 summarises the
brackets of the twelve classes.
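The difference between the two options can be reproduced on a small scale. The sketch below compares equal-length intervals with tercile cut points on a hypothetical, right-skewed list of GSI values; the helper names and the sample values are illustrative and are not the data of this study.

```python
from bisect import bisect_right
from statistics import quantiles

def equal_length_cuts(low=0.0, high=1.0, bins=3):
    """Cut points that split [low, high] into `bins` intervals of equal length."""
    step = (high - low) / bins
    return [low + step * i for i in range(1, bins)]

def tercile_cuts(values):
    """Cut points that split `values` into three groups of roughly equal size."""
    return quantiles(values, n=3)

def counts_per_bin(values, cuts):
    """Number of values falling in each interval delimited by `cuts`."""
    counts = [0] * (len(cuts) + 1)
    for v in values:
        counts[bisect_right(cuts, v)] += 1
    return counts

# Hypothetical GSI values, skewed towards low coverage.
gsi_values = [0.05, 0.08, 0.10, 0.12, 0.15, 0.18,
              0.20, 0.22, 0.25, 0.28, 0.35, 0.60]
equal_counts = counts_per_bin(gsi_values, equal_length_cuts())
tercile_counts = counts_per_bin(gsi_values, tercile_cuts(gsi_values))
```

With skewed data most units fall into the first equal-length interval, whereas the quantile cuts spread them evenly, which mirrors the behaviour described for figure 14.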
Table 1: Class definition summary.
                0 < GSI < 0.18                0.18 < GSI < 0.28             0.28 < GSI < 1
1-2 storeys     low rise - low coverage       low rise - mid coverage       low rise - high coverage
3-6 storeys     mid-low rise - low coverage   mid-low rise - mid coverage   mid-low rise - high coverage
7-12 storeys    mid-high rise - low coverage  mid-high rise - mid coverage  mid-high rise - high coverage
above 12        high rise - low coverage      high rise - mid coverage      high rise - high coverage
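The class definition in Table 1 can be expressed as a small lookup. The sketch below is illustrative only: the GSI cut points are taken from the table, the storey cut points 6.5 and 12.5 are inferred from the statement that thresholds fall halfway between integer marks (only 2.5 is given explicitly), and the treatment of values lying exactly on a threshold is an assumption.

```python
from bisect import bisect_right

GSI_CUTS = (0.18, 0.28)          # from Table 1
STOREY_CUTS = (2.5, 6.5, 12.5)   # 2.5 from the text; 6.5 and 12.5 inferred
COVERAGE_LABELS = ("low coverage", "mid coverage", "high coverage")
RISE_LABELS = ("low rise", "mid-low rise", "mid-high rise", "high rise")

def classify(gsi, weighted_storeys):
    """Return the class label of an areal unit from its GSI and Lw.

    Values exactly on a threshold are assigned to the upper class,
    which is an assumption of this sketch.
    """
    coverage = COVERAGE_LABELS[bisect_right(GSI_CUTS, gsi)]
    rise = RISE_LABELS[bisect_right(STOREY_CUTS, weighted_storeys)]
    return f"{rise} - {coverage}"

# Hypothetical areal units: (GSI, weighted number of storeys).
suburban = classify(0.12, 1.9)
central = classify(0.45, 9.0)
tower_in_park = classify(0.10, 20.0)
```

Keeping the thresholds as module-level constants makes it easy to rerun the classification with alternative brackets, for instance the equal-length GSI intervals that were tested and discarded.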
3.4 Data sources
Most of the data comes from the Ordnance Survey (OS), accessed through the
website of Edina, digimap.edina.ac.uk. OS is the national mapping agency
for Great Britain and Edina is a UK-based data provider for
educational staff and students. From their website, it is
possible for students and academics to access OS MasterMap, OS’s
most detailed product. The map has different layers, corresponding to
different types of information. The Integrated Transport Network (ITN) Layer
contains the map of the entire road network of England, Wales and Scotland
and has been used to calculate the polygons of the street blocks (figure 15,
p.42). The Topographic Layer provides the detailed shapes of the footprints
of the buildings and this is integrated with the Building Heights Layer.
The right diagram represents a street network, the left one its negative, the street blocks. They can also be described as the void (the street) and the volume (the built environment), although this description tends to be less meaningful outside the very dense city centres.
Figure 15: Street network and street blocks.
The database INSPIRE Index Polygons provides the shape of the
cadastral plots and the files of each borough are available on the website
https://www.gov.uk/government/collections/download-inspire-index-polygons.
From the same source, I also use the Network Rail data,
which contains the location of each railway station in the UK and the centre
line of the rail tracks. For the definition of the boundaries of Greater Lon-
don, the statistical unit areas and the related data, I use the combined data
of the Office for National Statistics and the OS, as provided by the London
Datastore at data.london.gov.uk.
3.5 Data storage
The data was stored in PostgreSQL and manipulated in PostGIS.
PostgreSQL is an open-source database supporting SQL constructs and PostGIS
is a powerful geometric and geographic extension that provides PostgreSQL
with the possibility of storing geometry and performing spatial operations.
Although the same process of storing and manipulating could have been done
using just ESRI shape files, the advantage of the database environment is
that the same data can be accessed by different environments via
standardised SQL queries and used in web visualisation or other applications. It is
also possible, in case of lack of local resources, to transfer the storage and
the processing to cloud services, making the whole process flexible and
easily scalable. Another important tool used is GDAL, a translation library
for vector and raster geospatial data formats, and in particular its
command-line utility ogr2ogr, used to import GML files. To import ESRI shape
files, I use shp2pgsql from the command line, which is the PostGIS standard
tool for importing this type of file.
The data is imported from raw GML and CSV files or from ESRI files, using
Python scripts to control a mix of terminal commands and SQL queries.
The main workflow consists in using Psycopg2 and SQLAlchemy, two
Python libraries for connecting to PostgreSQL, to create a table in the
database and then the subprocess module, which allows Python to run terminal
commands, to run ogr2ogr to copy the GML into the table. The import process
is executed in the following steps, each corresponding to a Python script
(see appendix 6.4.1, p.99, for the code):
1. The OS Topographic Layer is available in cells of 10x10 km, each
containing several smaller cells. Data_import_TopographicLayer.py
loops through the folders, fetches the names of the files and
builds a command statement to be run in the terminal. SQL
statements then add primary keys and geometric indexes to the
tables. The created table contains the geometry of the footprint
of the buildings and a building ID that can be connected to the
BuildingHeight database. Data_import_InspireCadastralPlots.py and
Data_import_OS_ITN.py work on the same principles to import
the cadastral parcels and the road network respectively.
2. Data_import_BuildingHeights.py fetches the folder path for each CSV,
creates a database table and runs an SQL statement to import the
files into the table.
3. Data_import_LondonAdministrativeBoundaries.py uses the Python
module subprocess to run shp2pgsql in the command line to import
the ESRI shape files of the administrative boundaries.
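The import steps above can be sketched as two small command builders. This is a hedged illustration, not the code of the appendix: the function names, file names, table names and database name are hypothetical, and the coordinate system EPSG:27700 (British National Grid) is an assumption. The ogr2ogr options `-f`, `-nln`, `-a_srs` and `-append` and the shp2pgsql options `-s` and `-I` are standard options of those tools.

```python
import shlex

def ogr2ogr_command(gml_path, table, dbname, srid=27700):
    """Argument list that loads one GML file into a PostGIS table."""
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",        # output driver
        f"PG:dbname={dbname}",     # target database (hypothetical name)
        gml_path,                  # source file
        "-nln", table,             # name of the destination table
        "-a_srs", f"EPSG:{srid}",  # assumed coordinate system
        "-append",                 # add to the table if it already exists
    ]

def shp2pgsql_pipeline(shp_path, table, dbname, srid=27700):
    """Shell pipeline that loads an ESRI shape file through psql."""
    return (
        f"shp2pgsql -s {srid} -I {shlex.quote(shp_path)} {shlex.quote(table)}"
        f" | psql -d {shlex.quote(dbname)}"
    )

# Hypothetical file, table and database names.
gml_cmd = ogr2ogr_command("tile_0001.gml", "topo_buildings", "london")
shp_cmd = shp2pgsql_pipeline("boroughs.shp", "boroughs", "london")

# With GDAL, PostGIS and a running database available, the commands could be
# executed with subprocess.run(gml_cmd, check=True) and
# subprocess.run(shp_cmd, shell=True, check=True).
```

Building the commands separately from running them makes it easy to log or inspect each statement while looping over many tiles.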
3.6 Data cleaning and manipulation
A last step in creating the necessary geometry is to calculate the polygons
from the road network geometry. For this purpose, I use the topology
extension of PostGIS. In the representation of the morphology of the city
described above, made of streets and blocks, we can use lines to represent
the streets. But once the road network is laid out, drawing the polygons
separately is a redundant exercise, because the information is already there.
Topological representation takes into account the fact that geometric features
rarely exist independently of each other and keeps track not only of the
geometry of the elements, but also of their relationships. For any given
geometry, the topology extension creates four tables to store nodes, edges,
faces and their relationships. Below I describe the steps of the cleaning and
manipulation process (see appendix 6.4.2, p.113, for the code):
4. Data_createTopology.sql creates the topology tables from the ITN
street network. The import process is broken down by borough
because it is computationally demanding and this made it easier to control
possible errors. For each borough, the network is intersected with the
administrative boundaries and then the topology is created. When
a new borough is added, the PostGIS topology automatically merges its
geometry with that already imported by removing duplicate edges
and nodes. Once the geometry has been converted into topological
tables, the faces are extracted by looping through the IDs of the face table
with the function ST_GetFaceGeometry.
5. Data_JoinBuildingHeights.sql merges the building footprint shapes
imported from the OS MasterMap with the corresponding building heights
table.
6. Data_createBuildingShapesTable.sql creates the table shapes, which is
the main table that stores the buildings’ geometry and data.
7. Data_joinBlocks.sql, Data_joinPlots.sql, Data_joinBlockRange400.sql
and Data_joinPlotRange400.sql perform a spatial join between the
areal units and the centroids of the buildings to count the buildings
falling within the area and calculate aggregate measures and weighted
averages. In particular, they calculate the count of buildings, the
total footprint, the total floor surface, the GSI and its standard
deviation, the FSI and its standard deviation, and the average number of floors
of the buildings weighted by their total floor surface, with its
standard deviation. This process creates four tables, block_index, plot_index,
block400_index and plot150_index, with the areal units and the relative
attributes of methods Bl, Pl, Bl400 and Pl150.
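The topology and spatial-join steps above can be sketched as SQL statements generated from Python. This is an assumption-laden illustration rather than the code of the appendix: the topology name, the table and column names (other than the table shapes) and the SRID are hypothetical. The functions used (topology.CreateTopology, topology.TopoGeo_AddLineString, topology.ST_GetFaceGeometry, ST_Contains, ST_Centroid, ST_Area) belong to PostGIS and its topology extension.

```python
def topology_sql(topology="blocks_topo", roads="itn_roads", srid=27700):
    """SQL statements that turn road centre lines into block polygons."""
    return [
        # 1. Create an empty topology schema.
        f"SELECT topology.CreateTopology('{topology}', {srid});",
        # 2. Add every road line; shared nodes and edges are merged.
        f"SELECT topology.TopoGeo_AddLineString('{topology}', geom) "
        f"FROM {roads};",
        # 3. Extract each face as a polygon (face 0 is the outer, unbounded face).
        f"SELECT face_id, topology.ST_GetFaceGeometry('{topology}', face_id) "
        f"AS geom FROM {topology}.face WHERE face_id > 0;",
    ]

def unit_join_sql(units="blocks", buildings="shapes"):
    """SQL that aggregates buildings by the areal unit containing their centroid."""
    return (
        "SELECT u.id, count(b.id) AS n_buildings, "
        "sum(ST_Area(b.geom)) / ST_Area(u.geom) AS gsi, "
        "sum(ST_Area(b.geom) * b.storeys) / ST_Area(u.geom) AS fsi "
        f"FROM {units} u "
        f"JOIN {buildings} b ON ST_Contains(u.geom, ST_Centroid(b.geom)) "
        "GROUP BY u.id, u.geom;"
    )

statements = topology_sql()
join_statement = unit_join_sql()
```

Assigning each building to a unit by its centroid ensures that a building straddling two units is counted only once.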
The four tables contain the attributes for each areal unit. The following
Python files use GSI and the weighted number of storeys from these tables
to classify the elements based on the classes defined in section 3.2 (p.32):
9. Data_classification_blocks.py and Data_classification_plots.py loop
through the areal units to compare GSI and number of storeys and
assign a class of urban landscape to each unit.
10. Data_joinIncomeTable.sql and Data_classification_income.py perform
the join between the LSOA boundaries and the classification tables in
order to be able to relate different urban landscapes to income and
population data.
4 Analysis
4.1 The scale of aggregation
Figure 16: Classification method Bl: London overview.
An overview of the results is shown in section 6.2 (p.79) and a detailed vi-
sualisation of the different case studies can be found in section 6.3 (p. 83).
The four overview maps show the classification over the whole Greater Lon-
don administrative boundaries: method Bl in figure 37 (p.79), also shown
in figure 16, method Bl400 in figure 38 (p.80), method Pl in figure 39 (p.81)
and method Pl150 in figure 40 (p.82). In the detailed maps, for each case
study a satellite view of the area is provided, together with maps showing
the building height and the classification by different types of aggregation.
Figure 17: Classification method Bl: close-up on the city centre.
The different types of aggregation clearly convey different scales of
information. Pl gives a very detailed level of information that can be used to
describe single buildings but needs aggregation to a larger level to be able
to describe an area. At the global scale, although the red colours are
visible and therefore it is possible to spot areas with taller buildings, it is not
possible to distinguish between different categories. Bl400 averages far too
much for a detailed view but at the global scale returns a very immediate
overview of the distribution of a few clusters of different typologies within a
vast majority of low-rise buildings. There is little detail, though: for
example, the whole borough of Islington is composed of just two classes. Bl
and Pl150 return detailed enough information for the close reading of the
case studies, while remaining readable at a larger scale.
The overview in figure 16 (p.47) shows quite clearly the hierarchy of the
morphology of the city: a well-defined core of high-density environment
between the City of London and Westminster, with a ring of medium-high
density around it and a progressive reduction of the density towards the
fringes of the city, punctuated by clusters of mid-high density. The core
alternates mid-high rise buildings with areas dominated by towers, such as
south of Liverpool Street Station, Old Street, around the Museum of London
and the north end of Tottenham Court Road (figure 17, p.48). Outside the
core, the other high-density areas are Canary Wharf, with a very high
concentration of towers, Stratford, recently developed to high densities, part
of the Southbank and Nine Elms. Outside this second ring, we can still find
some clusters of medium density such as Richmond, Wembley, East Croydon
and Bromley, but these are rather small compared to the vast surrounding
areas of low rise - low coverage, which make up most of the peripheral boroughs.
4.2 Case studies
Figure 18: Case study locations.
The areas taken as case studies (figure 18) have been selected to
include different possible scenarios: Angel is a central residential area with an
active high street, Bank is a prime central location for offices, Emerson Park
is a suburb-style residential neighbourhood on the outskirts, East Croydon
is a dense core around a transportation node and Swiss Cottage is a
residential scheme designed on the modernist principles of dispersed medium and
high buildings. For detailed satellite views and maps, see appendix 6.3 (p.83).
4.2.1 Case study I - Angel
Credits Microsoft Bing Maps 2016.
Figure 19: Bird’s eye view of Angel.
Angel station, in the borough of Islington, sits in an area at the junction
between three important roads: Upper Street coming from Highbury &
Islington, Pentonville Road coming from Kings Cross and City Road coming
from Old Street. Upper Street is a commercial street, whereas Pentonville
Road has commerce on just the north side and City Road has only residential
buildings and offices. It is a centrally located commercial strip surrounded by
residential areas. As we can see in figure 19, the two sides of Upper Street,
next to the junction, are compact and densely built, while the south side
of the junction and the area beyond those compact blocks are primarily
composed of small residential buildings. The aggregation by block Bl returns a
reasonable picture of the typology of the environment, as it is able to separate
the compact areas (mid-low rise - high coverage) along the three main commercial
axes from the more residential areas (mid-low rise - low coverage). In the
bottom-left corner, it discerns between two high-rise towers, one occupying
most of the plot and the other set in a little park. The finer classification
of method Pl reveals three mid-high rise - high coverage plots at the junction,
typical of highly exploited plots where commercial activities and offices are
concentrated. The classification does not distinguish between old terraced
houses and relatively recent mid-rise council blocks. This, of course, is due
to the fact that they share the same height and the same coverage, although
the character of the neighbourhood they define and the experience they
produce at the pedestrian level are quite different. The former face the
road and have individual access to residential units with gardens at the back.
The latter are often composed of long bars built with wide setbacks.
4.2.2 Case study II - Bank
Credits Microsoft Bing Maps 2016.
Figure 20: Bird’s eye view of Bank.
Bank is part of the Square Mile, the old part of the city, in the borough of
the City of London. It is a prime business district, with few residents and
a primary function of high-end offices. Because the land is very valuable,
it is one of the most densely built areas, with buildings mostly 7-12 storeys
high, but with the lowest population density (figure 35, p.77).
The classification identifies the area as mid-high rise - medium coverage and
mid-high rise - high coverage, with the peak of density south of Liverpool
Street Station, where there is a high concentration of towers (high-rise -
medium coverage and high-rise - high coverage). This particular area is a very
lively neighbourhood, rich in high-end offices and the activities that serve
them. It is also a completely deserted place during the weekends, when offices
are closed. This is an important feature that defines the character of the
area and it is not, of course, detected by the classification, which is purely
based on geometric features. This proves to be an important limitation of
this type of classification.
4.2.3 Case study III - East Croydon
Credits Microsoft Bing Maps 2016.
Figure 21: Bird’s eye view of East Croydon.
East Croydon is part of the borough of Croydon, in the south of the city.
It is a mid-high density cluster around a transportation node, within an
area that is dominated by low-rise residential typologies. If we look at the
Bl aggregation, two areas of mid-high density sit on the two sides of the
railway, with mid-low rise - high coverage blocks around, which sharply
change to the low-rise residential areas of terraced houses. If we look at the
results around the tower east of the railway, method Pl150 extends the mid-high
rise class to small plots of two-storey buildings, supporting the
intuition that, rather than giving an exact result, the aggregation by range
can be interpreted as the perception of the character of the area from the
street.
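The effect of aggregating by range can be illustrated with a minimal sketch. It is not the implementation used in this study: the plot records, the use of centroid distances and the function name `fsi_in_range` are assumptions made for illustration; only the 150 m radius is taken from method Pl150.

```python
from math import hypot

# Hypothetical plots: centroid (x, y) in metres, plot area and gross floor area in m2.
plots = [
    {"id": "tower", "x": 0.0, "y": 0.0, "area": 1000.0, "floor_area": 12000.0},
    {"id": "house_a", "x": 60.0, "y": 0.0, "area": 200.0, "floor_area": 160.0},
    {"id": "house_b", "x": 400.0, "y": 0.0, "area": 200.0, "floor_area": 160.0},
]


def fsi_in_range(target, all_plots, radius):
    """FSI of all plots whose centroid lies within `radius` of the target centroid."""
    near = [
        p for p in all_plots
        if hypot(p["x"] - target["x"], p["y"] - target["y"]) <= radius
    ]
    return sum(p["floor_area"] for p in near) / sum(p["area"] for p in near)


for p in plots:
    own_fsi = p["floor_area"] / p["area"]
    print(p["id"], round(own_fsi, 2), round(fsi_in_range(p, plots, 150.0), 2))
```

In this toy example the two-storey house next to the tower inherits a high FSI from its surroundings, while the identical house 400 m away keeps its own value, which is the behaviour observed around the East Croydon tower.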
4.2.4 Case study IV - Emerson Park
Credits Microsoft Bing Maps 2016.
Figure 22: Bird’s eye view of Emerson Park.
Emerson Park is an area in the borough of Havering, at the eastern limit
of Greater London. It is a uniform residential neighbourhood composed
solely of detached or semi-detached houses with large private gardens and
green public areas. The whole area falls consistently in the lowest density
category, low rise - low coverage, although at the plot level (Pl) we can see
minor variations up to mid-low rise.
4.2.5 Case study V - Swiss Cottage
Credits Microsoft Bing Maps 2016.
Figure 23: Bird’s eye view of Swiss Cottage.
The area considered is next to Swiss Cottage station, along Adelaide Road.
It is predominantly residential, with large green surfaces and without a
commercial high street. The vast majority of buildings are free-standing
apartment blocks between 3 and 6 storeys high with setbacks.
Most of the plots fall into the mid-low rise - medium coverage class, with the
notable exception of the central block, where there are 5 tower blocks more
than 12 storeys high. If we compare this typology with the residential blocks
west of Angel, which are composed primarily of terraced houses and private
gardens, the spatial configuration of Adelaide Road overall delivers less floor
space per unit of land than the traditional terraced house neighbourhoods.
4.2.6 Case study VI - Other observations
Although the present classification returns a realistic picture of the built
environment, some limitations of the methodology emerge from detailed
observation. First of all, how relevant or functional the classes are depends
on the scope of the classification, on the audience the results are aimed at and
on the cultural context. These factors affect the choice of the thresholds and
also the number of categories. For example, in order to avoid a proliferation
of classes, in this study GSI was divided into three bands, with the result
that completely rural areas are not differentiated from suburban residential
neighbourhoods.
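A threshold-based classification of this kind can be sketched in a few lines. The cut-off values below are hypothetical and chosen only for illustration; they are not the thresholds adopted in this study, and the function name `classify` is likewise an assumption.

```python
from bisect import bisect_right

# Hypothetical cut-offs, for illustration only (not the thresholds of the study).
HEIGHT_BREAKS = [3, 6, 12]  # average number of storeys
HEIGHT_LABELS = ["low rise", "mid-low rise", "mid-high rise", "high rise"]
GSI_BREAKS = [0.2, 0.4]  # ground space index (coverage), three bands
GSI_LABELS = ["low coverage", "medium coverage", "high coverage"]


def classify(storeys, gsi):
    """Combine a height band and a coverage band into one class label."""
    height = HEIGHT_LABELS[bisect_right(HEIGHT_BREAKS, storeys)]
    coverage = GSI_LABELS[bisect_right(GSI_BREAKS, gsi)]
    return f"{height} - {coverage}"


print(classify(8, 0.5))   # a compact central block
print(classify(2, 0.01))  # an almost rural area
print(classify(2, 0.15))  # a suburban residential street
```

With only three coverage bands, the almost rural area and the suburban street receive the same label, which is the loss of differentiation described above.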
The problem of different spatial layouts achieving the same
dwelling density also holds with a two-dimensional index, where the possible
combinations are more numerous but still limited. In figure 24 (p.57), the
area around Wembley
Stadium is classified as mid-low rise - high coverage, suggesting a dense
urban environment similar to Angel, whereas it is an area of big warehouses
and car parks. The same problem is visible in Enfield, shown in figure 25
(p.57). If we look at figure 26 (p.58), the area between Angel and King’s
Cross poses another challenge: the zones marked A and B
share the same class but they are clearly different types of urban landscape:
A is composed of recently built long blocks, while B is mainly composed of
Victorian terraced houses. These problems could be addressed by adding
more geometrical features to the classification, such as the compactness of
the buildings, footprint shapes or setbacks.
Figure 24: Classification method Bl: close-up on the Wembley area.
Figure 25: Classification method Bl: close-up on the Enfield area.
Figure 26: Classification method Bl: close-up on the area west of Angel.
4.3 Classification summary
Figure 27: Average FSI by classes of urban typology.
FSI describes the level of exploitation of the land in terms of quantity of floor
space per unit of land. Figure 27 (p.59) shows the average FSI by class.
Apart from the class high rise - high coverage, which corresponds to high towers
in a very dense configuration, the highest density of built environment is
obtained in the mid-high rise - high coverage class, which corresponds to
a Paris-like configuration, where blocks are fully built with buildings six to
twelve storeys high. The level of coverage, GSI, is crucial in delivering
high density neighbourhoods.
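As a worked illustration of how coverage and height combine, the sketch below computes FSI (floor area over land area), GSI (built footprint over land area) and the average number of storeys (FSI divided by GSI) for two invented blocks on the same amount of land. The figures and the function name `density_indices` are assumptions made for the example, not data from this study.

```python
def density_indices(land_area, footprint, floor_area):
    """Return (FSI, GSI, average storeys) for one unit of aggregation."""
    fsi = floor_area / land_area  # floor space per unit of land
    gsi = footprint / land_area   # share of land covered by buildings
    storeys = floor_area / footprint  # equal to FSI / GSI
    return fsi, gsi, storeys


# Invented examples, both on 10000 m2 of land.
compact_block = density_indices(10000, 6000, 48000)  # 8 storeys, 60% coverage
sparse_towers = density_indices(10000, 1500, 30000)  # 20 storeys, 15% coverage

print(compact_block)
print(sparse_towers)
```

In this example the eight-storey compact block reaches a higher FSI than the twenty-storey towers set in open space, because the higher coverage more than compensates for the lower height.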
Figure 28: Summary of the income by classes of urban typology.
As shown in the graph, the high-coverage categories are always
denser than the low-coverage class of the next higher category, indicating
that denser areas are obtained with mid-high rise buildings in a compact
configuration, rather than with high-rise towers in a sparse layout. This also
confirms the observation about terraced houses and modern buildings in
section 4.2.6 (p.56).
If we look at the count of classes across Greater London (figure 29, p. 61),
there is a striking prevalence of low buildings. As can also be seen in
figure 16 (p.47), the prevalence of green areas is quite clear, but mostly
concentrated in the outer ring. Amongst these areas, there is a very high
number of low coverage areas, which correspond to an almost or fully rural
environment. Although this is a common trait of most cities, in this
case it is likely a consequence of the Green Belt policy (Thomas 1963), which
greatly limits the expansion of the fringe of the city.
Figure 29: Blocks counted by class of urban typology.
4.4 Classes of neighbourhoods and income distribution
Figure 30: FSI vs income by classes of urban typology.
Figure 28 (p.60) shows the median income by class. There is a progression
from lower to higher income across the classes and, if aggregated by coverage
(the different tones of colour in the graph), it shows that higher income
groups have a slight preference for low-coverage areas. These have an average
income of £41400, whereas medium-coverage classes have about £38600 and
high-coverage classes about £39000. The differences are small but they are
all significant at a t-test confidence level of 95%. When aggregated by
height, there is no significant difference between the four classes. This is
probably explained by the fact that central areas are generally more in
demand for their centrality and can be afforded only by higher income groups,
and density is just the consequence of more people wanting to live there.
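The comparison of average incomes between coverage groups can be sketched as a two-sample test. The variant used in this study is not specified here, so the example assumes Welch's t statistic with a normal approximation for the critical value, which is reasonable only for large samples. The income values are invented and merely centred on the averages quoted above.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def is_significant(a, b, confidence=0.95):
    """Two-sided test using a normal approximation of the critical value."""
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return abs(welch_t(a, b)) > critical


# Invented incomes centred on the averages quoted in the text (41400 and 38600).
spread = [100 * ((i % 21) - 10) for i in range(210)]
low_coverage = [41400 + s for s in spread]
medium_coverage = [38600 + s for s in spread]

print(round(welch_t(low_coverage, medium_coverage), 1))
print(is_significant(low_coverage, medium_coverage))
```

With real data, an exact p-value would require the t distribution and the Welch degrees of freedom, which are left out of this sketch.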
Figure 30 (p.62) looks at the relationship between the density of the built
environment and income, by comparing FSI and median income by Lower
Layer Super Output Area (LSOA). The graph shows a small but significant
positive correlation, but the linear model explains just 4% of the variation.
It is interesting, though, to look at local dynamics, to understand whether,
within a relatively homogeneous area such as a borough, there is a clear
trend that is not visible at the global scale. Figure 31 (p.64) looks at the
same relationship by borough. For clarity of visualisation, it shows only
statistically significant results of linear models that can explain at least
5% of the variation. Here there is a clear difference between the central
boroughs, where the correlation is positive and relatively stronger, and the
outer ones, where the correlation is inverted and also weaker. It appears that
within central and more affluent areas there is a trend for higher income
individuals to move towards denser areas, whereas in peripheral areas the
opposite is true. In general, though, the concentration of the x values within
a small range of FSI and the presence of outliers make the interpretation of the
relationship problematic.
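The borough-level analysis can be sketched as one least-squares fit per group, keeping only the models that explain at least 5% of the variation. The records and the helper name `linear_fit` are invented for illustration, and the check of statistical significance applied in the study is left out of this sketch.

```python
from collections import defaultdict


def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (slope, r_squared)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


# Hypothetical LSOA records: (borough, FSI, median income in pounds).
records = [
    ("Central", 1.0, 40000), ("Central", 2.0, 45000), ("Central", 3.0, 52000),
    ("Outer", 0.2, 42000), ("Outer", 0.4, 39000), ("Outer", 0.6, 37000),
]

by_borough = defaultdict(list)
for borough, fsi, income in records:
    by_borough[borough].append((fsi, income))

results = {}
for borough, pairs in by_borough.items():
    slope, r_squared = linear_fit([p[0] for p in pairs], [p[1] for p in pairs])
    if r_squared >= 0.05:  # keep models explaining at least 5% of the variation
        results[borough] = (slope, r_squared)

for borough, (slope, r_squared) in results.items():
    print(borough, round(slope), round(r_squared, 2))
```

The sign of the slope is what distinguishes the central boroughs from the outer ones in figure 31; the invented data above reproduce that contrast only by construction.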
Figure 31: FSI v median income by borough.
4.5 Classes of neighbourhoods and population density
Figure 32: Population density by classes of urban typology.
Although the built environment density increases by class of coverage (figure
27, p.59), the population density hardly follows this trend (figure 32, p.65).
The graph shows a very mild positive trend and FSI explains just 15%
of the variation in population density (figure 33, p.66), which is a rather
counter-intuitive result. This is evident when we compare the population density
in figure 35 (p.77) with the classification in figure 37 (p.79): the central
boroughs of the City of London and Westminster, which have the highest
concentration of high-density classes, also have the lowest population density.
One possible explanation is that high-density blocks in London mostly have a
business and commercial character, whereas the preferred neighbourhood type for
living in the UK is by far a low-density one.
Figure 33: Built environment density (FSI) v population density.
5 Conclusions
In these pages, I have briefly summarised two radically different views of the
city, which stem from different cultural backgrounds and result in different
approaches towards the city. Although I gave a very limited picture of the
current debate and practice around urban planning and design, it
served to introduce the question of the urban form and its classification.
Most of the studies that I have found in the literature, including the one I
took as a reference, have been applied to relatively small portions of cities
without much automation. The work presented here, which stemmed
from these considerations, had the aim of building a workflow characterised
by effectiveness in managing readily available data, scalability in
extending the analysis to large geographical extents and simplicity in the use
of intuitive quantities and indexes. The results of the classification have an
immediate descriptive power but they need further refinement to be able to
capture the vast array of nuances that the urban landscape offers. In
improving the descriptive power, though, it is useful to keep in mind the need
for a balance between general and detailed description and the necessity of
disseminating results to different disciplines.
At one point in the development of this study, I faced the question
of whether to use a manual classification, setting the thresholds by relying
mostly on my cultural experience, or to use clustering techniques such
as k-means or DBSCAN to extract classes from geometrical features. After
some tests, I decided to go for the manual approach, because I was not
convinced of the relevance of the classes obtained by running the
algorithms on only two parameters. I think an interesting
approach would have been to run such algorithms on a much wider
set of geometrical parameters and then try to interpret the results based
on urban typo-morphology, a method tested on a small neighbourhood, for
example, in Gil et al. (2012). Unless already available as datasets, the
features used in Gil et al. (2012), such as pavement width, street width,
building orientation and pedestrian area, each require a manual definition or
specific GIS techniques to calculate them from a normal city survey. The first
way is not applicable to large extents such as Greater London and the second
would require resources outside the scope of this work. The upside of the approach
described in these pages lies in the use of widely available datasets such as
the city survey or the street network and in the simplicity and immediacy of
the results. The result is a tiny step towards the development of a process
easily understandable and replicable by a non-specialised audience.
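For reference, the alternative that was set aside can be sketched as a plain k-means run on two parameters. This is not a record of the tests mentioned above: the points (GSI, average storeys), the starting centroids and the function name `kmeans` are invented, and in practice the two features would need to be rescaled so that height does not dominate the distance.

```python
def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm on 2-D points, starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Invented blocks described by (GSI, average number of storeys).
points = [(0.15, 2), (0.20, 2), (0.18, 3), (0.60, 8), (0.55, 7), (0.65, 9)]
centroids, clusters = kmeans(points, [points[0], points[3]])

for centroid, cluster in zip(centroids, clusters):
    print(tuple(round(v, 2) for v in centroid), len(cluster))
```

The resulting groups still have to be named and interpreted by hand, which is where a wider set of geometrical parameters would become useful.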
References
Alexander, E. R. (1993), ‘Density measures: A review and analysis’, Journal
of Architectural and Planning Research pp. 181–202.
Alterman, R. (1997), ‘The challenge of farmland preservation: lessons from
a six-nation comparison’, Journal of the American Planning Association
63(2), 220–243.
American Planning Association (1999), Planning communities for the 21st
century, The Association.
Arcaute, E., Hatna, E., Ferguson, P., Youn, H., Johansson, A. & Batty, M.
(2013), ‘Constructing cities, deconstructing scaling laws’, ArXiv e-prints.
Azhar, S. (2011), ‘Building information modeling (bim): Trends, benefits,
risks, and challenges for the aec industry’, Leadership and Management
in Engineering 11(3), 241–252.
Batty, M. (2013), The new science of cities, MIT press.
Berghauser Pont, M. & Haupt, P. (2005), ‘The Spacemate: Density and the
Typomorphology of the Urban Fabric’, Urbanism Laboratory for Cities
and Regions Progress of Research Issues in Urbanism 2007 4(4), 55–68.
Berghauser Pont, M. & Haupt, P. (2010), Space, density and urban form,
NAi Publishers Rotterdam.
Burchell, R. W. & Listokin, D. (1995), Land, infrastructure, housing costs
and fiscal impacts associated with growth: The literature on the impacts
of sprawl versus managed growth, Technical report, Lincoln Institute of
Land Policy.
Burchell, R. W., Shad, N. A., Listokin, D., Phillips, H., Downs, A., Seskin,
S., Davis, J. S., Moore, T., Helton, D. & Gall, M. (1998), The costs of
sprawl-revisited, Transit Cooperative Research Program.
Burton, T. & Matson, L. (1996), ‘Urban footprints: making best use of
urban land and resources - a rural perspective’, The compact city: A
sustainable urban form pp. 298–301.
CENTRO (2012), Annual Statistical Report, Technical report, West Mid-
lands Integrated Transport Authority.
Cervero, R. & Guerra, E. (2011), Urban densities and transit: A multi-
dimensional perspective, Institute of Transportation Studies, University
of California, Berkeley.
CETAT (1986), Indicateurs morphologiques pour l’amenagement: analyse de
50 perimetres batis situes sur le Canton de Geneve. Presentation generale.
Vol. 1, Departement des traveaux publics de Geneve.
URL: https://books.google.co.uk/books?id=s4fWZwEACAAJ
Churchman, A. (1999), ‘Disentangling the concept of density’, Journal of
Planning Literature 13(4), 389–411.
Commission of the European Communities (1990), ‘Green paper on the ur-
ban environment’.
Conzen, M. P. (1978), ‘Analytical approaches to the urban landscape’, Di-
mensions of human geography. Chicago: University of Chicago pp. 128–65.
Corbusier, L. (1964), La ville radieuse, Vincent, Freal & Cie.
Dantzig, G. B. & Saaty, T. L. (1973), Compact city: a plan for liveable urban
environment, Freeman.
Dark, S. J. & Bram, D. (2007), ‘The modifiable areal unit problem (maup)
in physical geography’, Progress in Physical Geography 31(5), 471–479.
ECOTEC (1993), ‘Reducing transport emissions through planning’.
EEA (1999), Environment in the European Union at the turn of the century,
Technical report, European Environment Agency (EEA).
Ewing, R. (1997), ‘Is Los Angeles-style sprawl desirable?’, Journal of the
American planning association 63(1), 107–126.
Frank, J. E. (1989), The costs of alternative development patterns: A review
of the literature, Urban Land Inst.
Gil, J., Beirao, J. N., Montenegro, N. & Duarte, J. P. (2012), ‘On the
discovery of urban typologies: data mining the many dimensions of urban
form’, Urban morphology 16(1), 27.
Gordon, P. & Richardson, H. W. (1997), ‘Are Compact Cities a Desir-
able Planning Goal?’, Journal of the American Planning Association
63(1), 95–106.
Greater London Authority (2006), ‘London plan’.
Hillier, B. (2007), ‘Space is the machine: a configurational theory of archi-
tecture’.
Jackson, K. T. (1985), Crabgrass frontier: The suburbanization of the United
States, Oxford University Press.
Jacobs, J. (1961), The death and life of great American cities, Vintage.
Krupat, E. (1985), People in cities: The urban environment and its effects,
number 6, Cambridge University Press.
Leccese, M. & McCormick, K. (2000), Charter of the new urbanism,
McGraw-Hill Professional.
Marshall, S. (2009), ‘Cities, design and evolution’.
Martin, L. & March, L. (1972), Urban space and structures, number 1, Cam-
bridge University Press.
Maryland Department of Planning (1997).
Melia, S., Parkhurst, G. & Barton, H. (2011), ‘The paradox of intensifica-
tion’, Transport Policy 18(1), 46–52.
Moudon, A. V. (1994), Getting to know the building landscape: typomor-
phology, in K. Franck & L. Schneekloth, eds, ‘Ordering space: types in
architecture and design’, Van Nostrand Reinhold, New York, pp. 289–311.
Moudon, A. V. & Lee, C. (2009), ‘Urbanism by numbers’, Making the
Metropolitan Landscape: Standing Firm on Middle Ground p. 57.
Muratori, S. (1960), Studi per un’operante storia urbana di Venezia, Istituto
poligrafico dello Stato, Roma.
Muthesius, S. (1982), The English terraced house, Vol. 140, Yale University
Press New Haven.
National Research Council (1999), Our common journey: a transition to-
ward sustainability, National Academies Press.
Newman, P. (1992), ‘The compact city: an australian perspective’, Built
Environment (1978-) pp. 285–300.
Newman, P. G. & Kenworthy, J. R. (1989a), Cities and automobile depen-
dence: An international sourcebook, Gower Publishing.
Newman, P. W. & Kenworthy, J. R. (1989b), ‘Gasoline consumption and
cities: a comparison of us cities with a global survey’, Journal of the
american planning association 55(1), 24–37.
OECD (2012), Compact City Policies: A Comparative Assessment, OECD
Publishing.
Patricios, N. (2002), ‘Urban design principles of the original neighborhood
concepts’, Urban morphology 6(1).
Per, A. F. (2008), D Book: Density, Data, Diagrams, Dwellings, a+ t edi-
ciones.
Pont, M. B. & Marcus, L. (2014), ‘Innovations in measuring density: From
area and location density to accessible and perceived density’, Nordic
Journal of Architectural Research 26(2).
Priest, D. (1977), Large-scale development: benefits, constraints, and state
and local policy incentives, Urban Land Institute.
Ratti, C., Baker, N. & Steemers, K. (2005), ‘Energy consumption and urban
texture’, Energy and Buildings 37(7), 762–776.
Real Estate Research Corporation (1974), The costs of sprawl: Environmen-
tal and economic costs of alternative residential development patterns at
the urban fringe, Technical report, Council on Environmental Quality.
Schumacher, P. (2009), ‘Parametricism: A new global style for architecture
and urban design’, Architectural Design 79(4), 14–23.
Schumacher, P. (2011), The Autopoiesis of Architecture: a new framework
for Architecture, Vol. 1, John Wiley & Sons.
Sir Howard, E. (1898), To-morrow: A Peaceful Path to Real Reform,
Routledge.
Stretton, H. (1996), Density, Efficacy and Equality in Australian Cities, in
M. Jenks, E. Burton & K. Williams, eds, ‘The Compact City: a Sus-
tainable Urban Form?’, E&FN SPON, London and New York, chapter
Compact City Theory.
Thomas, D. (1963), ‘London’s green belt: the evolution of an idea’, The
Geographical Journal 129(1), 14–24.
UK Government (1994), ‘Sustainable development: The UK strategy’.
United Nations (1992), Agenda 21, Technical report, United Nations.
Williams, K., Burton, E. & Jenks, M. (1996), ‘Achieving the compact city
through intensification: an acceptable option’, The compact city: A sus-
tainable urban form pp. 83–96.
Windsor, D. (1979), ‘A critique of the costs of sprawl’, Journal of the Amer-
ican Planning Association 45(3), 279–292.
6 Appendix
6.1 London data: overview maps
Figure 34: London data: building heights
Figure 35: London data: population density by LSOA
Figure 36: London data: median income by LSOA
6.2 Classification results: overview maps
Figure 37: Classification method Bl: London overview
Figure 38: Classification method Bl400: London overview
Figure 39: Classification method Pl: London overview
Figure 40: Classification method Pl150: London overview
6.3 Case studies: detailed maps
Figure 41: Case study locations
6.3.1 Case study I: Angel
Figure 42: Angel: Satellite view and building heights
Figure 43: Angel: Method Bl by block and Bl400 by block in range 400 m.
Figure 44: Angel: Method Pl by plot and Pl150 by plot in range 150 m.
6.3.2 Case study II: Bank
Figure 45: Bank: Satellite view and building heights
Figure 46: Bank: Method Bl by block and Bl400 by block in range 400 m.
Figure 47: Bank: Method Pl by plot and Pl150 by plot in range 150 m.
6.3.3 Case study III: East Croydon
Figure 48: East Croydon: Satellite view and building heights
Figure 49: East Croydon: Method Bl by block and Bl400 by block in range 400 m.
Figure 50: East Croydon: Method Pl by plot and Pl150 by plot in range 150 m.
6.3.4 Case study IV: Emerson Park
Figure 51: Emerson Park: Satellite view and building heights
Figure 52: Emerson Park: Method Bl by block and Bl400 by block in range 400 m.
Figure 53: Emerson Park: Method Pl by plot and Pl150 by plot in range 150 m.
6.3.5 Case study V: Swiss Cottage
Figure 54: Swiss Cottage: Satellite view and building heights
Figure 55: Swiss Cottage: Method Bl by block and Bl400 by block in range 400 m.
Figure 56: Swiss Cottage: Method Pl by plot and Pl150 by plot in range 150 m.
6.4 Code
6.4.1 Data import
# By Duccio Aiazzi as part of the MSc Smart Cities and Urban Analytics at CASA - UCL
# This script is used to import data into PostgreSQL/PostGIS tables
from subprocess import run
import os
import datetime
import psycopg2


#### OS Mastermap - Topographic layer ####
## Building footprint shapes
# In order to import the gml files of the OS Mastermap topographic layer, the script
# navigates through the data folder, retrieves the names of the files with a specific
# extension and runs ogr2ogr in the command shell
path = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/OS/'
# SPATIAL_INDEX is a layer creation option, so it is passed with -lco;
# the spatial indexes are created explicitly at the end of the script
ogr_statement = ('ogr2ogr -f "PostgreSQL" PG:"host=localhost dbname=msc user=postgres '
                 'password=postgres schemas=london_buildings" -lco SPATIAL_INDEX=FALSE ')
filenames = []
fileroots = []
# Retrieve file names and paths
# See http://stackoverflow.com/questions/3964681/find-all-files-in-directory-with-extension-txt-in-python
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith(".gml.gz"):
            fileroots.append(root)
            filenames.append(file)

terminal_command = 'cd ' + fileroots[0] + '; ' + ogr_statement + filenames[0]
# Start importing process
dt = datetime.datetime.now()
print("Now running on " + fileroots[0] + "/" + filenames[0])
print('Starting Time: ' + dt.strftime('%H:%M'))
# Import the first file, which creates the table
return_code = run(terminal_command, shell=True)
dt = datetime.datetime.now()
print('End Time: ' + dt.strftime('%H:%M'))
# All the others are appended to the newly created table
return_msg = []
for i in range(1, len(filenames)):
    filename = filenames[i]
    fileroot = fileroots[i]
    terminal_command = 'cd ' + fileroot + '; ' + ogr_statement + filename
    dt = datetime.datetime.now()
    print("Now running on " + fileroot + "/" + filename)
    print('Starting Time: ' + dt.strftime('%H:%M'))
    return_msg.append(run(terminal_command + ' -append', shell=True))
    dt = datetime.datetime.now()
    print('End Time: ' + dt.strftime('%H:%M'))
print("Job done")
# Create geometric index
conn = psycopg2.connect(database="msc", user="postgres", password="postgres",
                        host="localhost", port="5432")
print("Open connection: successful")
cur = conn.cursor()
cur.execute('''
CREATE INDEX topographicarea_wkb_geometry_geom_idx
ON london_buildings.topographicarea
USING gist
(wkb_geometry);
CREATE INDEX topographicline_wkb_geometry_geom_idx
ON london_buildings.topographicline
USING gist
(wkb_geometry);
CREATE INDEX cartographictext_wkb_geometry_geom_idx
ON london_buildings.cartographictext
USING gist
(wkb_geometry);
''')
conn.commit()
conn.close()
Listing 1: Data import OS TopographicLayer.py
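The file discovery and command construction in the listing above could also be written without concatenating shell strings. The following is a sketch under stated assumptions, not the script used for this work: the helper names `find_files` and `ogr2ogr_args` are hypothetical, and passing the spatial index setting as `-lco SPATIAL_INDEX=FALSE` is my reading of the option named in the listing.

```python
from pathlib import Path


def find_files(folder, suffix):
    """Return all files under `folder` whose name ends with `suffix`, sorted."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and p.name.endswith(suffix)
    )


def ogr2ogr_args(source, pg_connection, append):
    """Build the argument list for one ogr2ogr call, options first."""
    args = ["ogr2ogr", "-f", "PostgreSQL"]
    if append:
        # Later files are appended to the table created by the first import.
        args.append("-append")
    else:
        # The first import creates the table; indexes are built afterwards.
        args += ["-lco", "SPATIAL_INDEX=FALSE"]
    args += [pg_connection, str(source)]
    return args


# Hypothetical connection string and file name, for illustration only.
connection = "PG:host=localhost dbname=msc schemas=london_buildings"
print(ogr2ogr_args("tile_0001.gml.gz", connection, append=False))
print(ogr2ogr_args("tile_0002.gml.gz", connection, append=True))
```

Each list can then be passed to `subprocess.run(args, check=True)`, which avoids `shell=True` and stops the import as soon as one call fails.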
# By Duccio Aiazzi as part of the MSc Smart Cities and Urban Analytics at CASA - UCL
# This script is used to import data into PostgreSQL/PostGIS tables
from subprocess import run
import os
import datetime
import psycopg2

#### INSPIRE ####
## Cadastral parcels
path = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/INSPIRE/'
# SPATIAL_INDEX is a layer creation option, so it is passed with -lco;
# the spatial index is created explicitly at the end of the script
ogr_statement = ('ogr2ogr -f "PostgreSQL" PG:"host=localhost dbname=msc user=postgres '
                 'password=postgres schemas=london_plots" -lco SPATIAL_INDEX=FALSE ')
filenames = []
fileroots = []
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith(".gml"):
            fileroots.append(root)
            filenames.append(file)
# The import creates the table
terminal_command = 'cd ' + fileroots[0] + '; ' + ogr_statement + filenames[0]
# Start importing process
dt = datetime.datetime.now()
print("Now running on " + fileroots[0] + "/" + filenames[0])
print('Starting Time: ' + dt.strftime('%H:%M'))
# Import the first file to create the table
return_code = run(terminal_command, shell=True)
dt = datetime.datetime.now()
print('End Time: ' + dt.strftime('%H:%M'))
# All the others are appended to the newly created table
return_msg = []
for i in range(1, len(filenames)):
    filename = filenames[i]
    fileroot = fileroots[i]
    terminal_command = 'cd ' + fileroot + '; ' + ogr_statement + filename
    dt = datetime.datetime.now()
    print("Now running on " + fileroot + "/" + filename)
    print('Starting Time: ' + dt.strftime('%H:%M'))
    return_msg.append(run(terminal_command + ' -append', shell=True))
    dt = datetime.datetime.now()
print("Job done")
# Clean geometry and create geometry index
conn = psycopg2.connect(database="msc", user="postgres", password="postgres",
                        host="localhost", port="5432")
print("Open connection: successful")
cur = conn.cursor()
cur.execute('''
CREATE TABLE london_plots.plots
AS
SELECT * FROM london_plots.predefined;
UPDATE london_plots.plots
SET wkb_geometry = cleanGeometry(wkb_geometry);
ALTER TABLE london_plots.plots ADD PRIMARY KEY (ogc_fid);
CREATE INDEX plots_geom_idx
ON london_plots.plots
USING gist
(wkb_geometry);
''')
conn.commit()
conn.close()
Listing 2: Data import InspireCadastralPlots.py
# By Duccio Aiazzi as part of the MSc Smart Cities and Urban Analytics at CASA - UCL
# This script is used to import data into PostgreSQL/PostGIS tables
from subprocess import run
import os
import datetime
import psycopg2


#### OS Mastermap - ITN ####
## ITN file of the transportation network of England
path = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/OS_ITN-Full_England/'
# SPATIAL_INDEX is a layer creation option, so it is passed with -lco;
# the spatial indexes are created explicitly at the end of the script
ogr_statement = ('ogr2ogr -f "PostgreSQL" PG:"host=localhost dbname=msc user=postgres '
                 'password=postgres schemas=england_itn" -lco SPATIAL_INDEX=FALSE ')
filenames = []
fileroots = []
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith(".gml"):
            fileroots.append(root)
            filenames.append(file)
# The import creates the table (http://www.gdal.org/drv_pg.html)
terminal_command = 'cd ' + fileroots[0] + '; ' + ogr_statement + filenames[0]
# Start importing process
dt = datetime.datetime.now()
print("Now running on " + fileroots[0] + "/" + filenames[0])
print('Starting Time: ' + dt.strftime('%H:%M'))
# Import the first file, which creates the table
return_code = run(terminal_command, shell=True)
dt = datetime.datetime.now()
print('End Time: ' + dt.strftime('%H:%M'))
# All the others are appended to the newly created table
# NOTE: as written, the loop starts at index 2, so filenames[1] is not imported
return_msg_itn = []
for i in range(2, len(filenames)):
    filename = filenames[i]
    fileroot = fileroots[i]
    terminal_command = 'cd ' + fileroot + '; ' + ogr_statement + filename
    dt = datetime.datetime.now()
    print("Now running on " + fileroot + "/" + filename)
    print('Starting Time: ' + dt.strftime('%H:%M'))
    return_msg_itn.append(run(terminal_command + " -skipfailures -append", shell=True))
    dt = datetime.datetime.now()
    print('End Time: ' + dt.strftime('%H:%M'))
print("Job done")
# Create geometric index
conn = psycopg2.connect(database="msc", user="postgres", password="postgres",
                        host="localhost", port="5432")
print("Open connection: successful")
cur = conn.cursor()
cur.execute('''
CREATE INDEX ferrynode_wkb_geometry_geom_idx
ON england_itn.ferrynode
USING gist
(wkb_geometry);
CREATE INDEX informationpoint_wkb_geometry_geom_idx
ON england_itn.informationpoint
USING gist
(wkb_geometry);
CREATE INDEX roadlink_wkb_geometry_geom_idx
ON england_itn.roadlink
USING gist
(wkb_geometry);
CREATE INDEX roadlinkinformation_wkb_geometry_geom_idx
ON england_itn.roadlinkinformation
USING gist
(wkb_geometry);
CREATE INDEX roadnode_wkb_geometry_geom_idx
ON england_itn.roadnode
USING gist
(wkb_geometry);
CREATE INDEX roadrouteinformation_wkb_geometry_geom_idx
ON england_itn.roadrouteinformation
USING gist
(wkb_geometry);
''')
conn.commit()
conn.close()
Listing 3: Data import OS ITN.py
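The import script prints a start and end time around every long-running command. A minimal sketch of how that repeated pattern could be factored out is given below. The helper names format_hhmm and timed are hypothetical and are not part of the original script, and the wrapped computation is only a stand-in for an import command.

```python
import datetime
from contextlib import contextmanager


def format_hhmm(dt):
    """Return the time of day as zero-padded HH:MM, as printed in the listing."""
    return str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2)


@contextmanager
def timed(label):
    """Print start and end times around the wrapped block (hypothetical helper)."""
    print('Starting Time (' + label + '): ' + format_hhmm(datetime.datetime.now()))
    try:
        yield
    finally:
        # Runs even if the wrapped step raises, so the end time is always logged.
        print('End Time (' + label + '): ' + format_hhmm(datetime.datetime.now()))


# Usage: wrap any long-running step; here a stand-in computation.
with timed('demo step'):
    total = sum(range(1000))
```

With such a helper, each import command would need a single with-statement instead of four timing lines.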
# By Duccio Aiazzi as part of the MSc Smart Cities and Urban Analytics at CASA - UCL
# This script is used to import data into PostgreSQL/PostGIS tables
import os
import datetime
import psycopg2
import pandas as pd  # needed for the de-duplication step below


#### OS MasterMap ####
## Building Heights
# Create table
conn = psycopg2.connect(database="msc", user="postgres",
                        password="postgres", host="localhost", port="5432")
print("Open connection: successful")
cur = conn.cursor()
cur.execute('''
DROP TABLE london_buildings.building_heights CASCADE;
CREATE TABLE london_buildings.building_heights (
    os_topo_toid_digimap VARCHAR(48),
    os_topo_toid VARCHAR(48) NOT NULL,
    os_topo_version VARCHAR(48),
    bha_processdate VARCHAR(24),
    tile_ref VARCHAR(24),
    abshmin NUMERIC(5, 2),
    absh2 NUMERIC(5, 2),
    abshmax NUMERIC(5, 2),
    rel_h2 NUMERIC(5, 2),
    relmax NUMERIC(5, 2),
    bha_conf VARCHAR(24),
    PRIMARY KEY (os_topo_toid)
);
''')
conn.commit()
conn.close()
# Clean the data from duplicates
path = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/OS/'
file_paths = []
file_names = []
building_heights = pd.DataFrame()
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith(".csv"):
            file_paths.append(root + '/' + file)
            file_names.append(file)
for file in file_paths:
    bh_temp = pd.read_csv(file)
    # pd.concat replaces DataFrame.append, which recent pandas versions no longer provide
    building_heights = pd.concat([building_heights, bh_temp])
duplicated = building_heights.duplicated()  # There are a total of 7 rows completely duplicated
building_heights.drop_duplicates(keep='first', inplace=True)
duplicated_ids = building_heights.duplicated('os_topo_toid')
duplicated_ids_digimap = building_heights.duplicated('os_topo_toid_digimap')  # No other duplicates
building_heights.to_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/OS/building_heights.csv',
                        index=False, index_label=False)
# Import to PostgreSQL
conn = psycopg2.connect(database="msc", user="postgres",
                        password="postgres", host="localhost", port="5432")
print("Open connection: successful")
cur = conn.cursor()
file = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/OS/building_heights.csv'
dt = datetime.datetime.now()
print("START: " + file)
print('Starting Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
sql_statement = 'COPY london_buildings.building_heights FROM \'' + file + '\' CSV HEADER;'
cur.execute(sql_statement)
dt = datetime.datetime.now()
print('End Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
conn.commit()
conn.close()
Listing 4: Data import BuildingHeights.py
# By Duccio Aiazzi as part of the MSc Smart Cities and Urban Analytics at CASA - UCL
# This script is used to import data into PostgreSQL/PostGIS tables
from subprocess import run
import datetime
import psycopg2
from sqlalchemy import create_engine
import pandas as pd
#### London's administrative boundaries ####
## London's boroughs
# Create SCHEMA
conn = psycopg2.connect(database="msc", user="postgres",
                        password="postgres", host="localhost", port="5432")
print("Open connection: successful")
cur = conn.cursor()
cur.execute('''
DROP SCHEMA london CASCADE;
CREATE SCHEMA london
    AUTHORIZATION postgres;
''')
conn.commit()
conn.close()
# Import
path = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/04_SpatialDataCapture/00_Coursework/LondonGentrification/Data/ESRI/Boroughs/'
filename = 'england_lad_2011Polygon.shp'
shapefile = path + filename
schema = 'london.'
table = 'boroughs'
options = '-I -s 27700'
server = '| psql -d msc -U postgres -W'
terminal_command = 'shp2pgsql %s %s %s%s %s' % (options, shapefile, schema, table, server)
dt = datetime.datetime.now()
print('Importing "%s"' % (filename))
print('Starting Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
return_code = run(terminal_command, shell=True)  # run the import, which creates the table
dt = datetime.datetime.now()
print('End Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))

## London's wards
# Import
path = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/04_SpatialDataCapture/00_Coursework/LondonGentrification/Data/ESRI/'
filename = 'London_Ward_CityMerged.shp'
shapefile = path + filename
schema = 'london.'
table = 'wards'
options = '-I -s 27700'
server = '| psql -d msc -U postgres -W'
terminal_command = 'shp2pgsql %s %s %s%s %s' % (options, shapefile, schema, table, server)
dt = datetime.datetime.now()
print('Importing "%s"' % (filename))
print('Starting Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
return_code = run(terminal_command, shell=True)  # run the import, which creates the table
dt = datetime.datetime.now()
print('End Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))

## Greater London
conn = psycopg2.connect(database="msc", user="postgres",
                        password="postgres", host="localhost", port="5432")
print("Open connection: successful")
cur = conn.cursor()
cur.execute('''
DROP TABLE london.greater_london CASCADE;
CREATE TABLE london.greater_london
AS
SELECT ST_Union(geom) geom FROM london.boroughs;
ALTER TABLE london.greater_london
    ADD COLUMN id BIGSERIAL PRIMARY KEY;
CREATE INDEX greater_london_geom_idx
    ON london.greater_london
    USING gist
    (geom);
''')
conn.commit()
conn.close()

## River Thames - Remove river from blocks
# The original command also passed -expolodecollections; that is an ogr2ogr option,
# not a shp2pgsql one, so it is dropped here. Multi-part geometries are split
# with ST_Dump in the SQL below.
terminal_command = ("shp2pgsql -I -s 27700 /Users/duccioa/CLOUD/C07_UCL_SmartCities/"
                    "08_Dissertation/03_Data/London/River_Thames/Simplified/river_thames.shp "
                    "support.river_thames | psql -d msc -U postgres -W")
return_code = run(terminal_command, shell=True)
conn = psycopg2.connect(database="msc", user="postgres",
                        password="postgres", host="localhost", port="5432")
print("Open connection: successful")
cur = conn.cursor()
cur.execute('''
CREATE TABLE london.river_thames (
    id serial,
    geom geometry
);
INSERT INTO london.river_thames (geom)
    (SELECT (ST_Dump(geom)).geom FROM support.river_thames);
ALTER TABLE london.river_thames
    ADD PRIMARY KEY (id);
CREATE INDEX river_thames_geom_idx
    ON london.river_thames
    USING gist
    (geom);
-- Find blocks that fall within the river's shape
CREATE TABLE support.river_blocks AS
(
    SELECT p.block_id, p.wkb_geometry
    FROM london_blocks.blocks AS p
    INNER JOIN london.river_thames AS n
    ON ST_Within(p.wkb_geometry, ST_Buffer(n.geom, 20))
);
ALTER TABLE support.river_blocks
    ADD PRIMARY KEY (block_id);
CREATE INDEX river_blocks_geom_idx
    ON support.river_blocks
    USING gist
    (wkb_geometry);
-- Blocks added by hand to the list of river blocks
INSERT INTO support.river_blocks
    (SELECT block_id, wkb_geometry FROM london_blocks.blocks
     WHERE block_id IN (60874, 47766, 48147, 48111));
-- Remove all river blocks from the block table
DELETE FROM london_blocks.blocks
WHERE block_id IN (SELECT b.block_id FROM support.river_blocks b);
''')
conn.commit()
conn.close()
Listing 5: Data import LondonAdministrativeBoundaries.py
6.4.2 Data cleaning
-- DROP SCHEMA london_itn_topology CASCADE;
-- DELETE FROM topology.topology WHERE name = 'london_itn_topology';
SELECT CreateTopology('london_itn_topology', 27700, 0.1);
CREATE SCHEMA temp_itn
    AUTHORIZATION postgres;
-- Barking and Dagenham 22
BEGIN;
SET LOCAL work_mem = '96MB';
CREATE TABLE temp_itn.roadlink22 AS (
    SELECT * FROM london_itn.roadlink
    WHERE ST_Intersects(wkb_geometry,
        (SELECT geom FROM london.boroughs WHERE name = 'Barking and Dagenham'))
);
ALTER TABLE temp_itn.roadlink22
    ADD PRIMARY KEY (ogc_fid);
CREATE INDEX roadlink22_spatial_idx
    ON temp_itn.roadlink22
    USING gist
    (wkb_geometry);
COMMIT;

BEGIN;
SET LOCAL work_mem = '512MB';
SELECT
    ogc_fid,
    TopoGeo_AddLineString(
        'london_itn_topology', wkb_geometry
    ) AS edge_id
FROM (
    SELECT ogc_fid, wkb_geometry FROM temp_itn.roadlink22
) AS f;
COMMIT;
-- The process has to be repeated for each of the 33 boroughs
-- Railway
-- The statement below has no select list, so it is left commented out
-- CREATE TABLE london_itn.railway AS (
--     SELECT
-- );
-- Westminster 30
BEGIN;
SET LOCAL work_mem = '96MB';
-- The original created temp_itn.railway here, while every following statement
-- refers to temp_itn.roadlink30; the table name is aligned with them
CREATE TABLE temp_itn.roadlink30 AS (
    SELECT * FROM london_itn.roadlink
    WHERE ST_Intersects(wkb_geometry,
        (SELECT geom FROM london.boroughs WHERE name = 'Westminster'))
);
ALTER TABLE temp_itn.roadlink30
    ADD PRIMARY KEY (ogc_fid);
CREATE INDEX roadlink30_spatial_idx
    ON temp_itn.roadlink30
    USING gist
    (wkb_geometry);
COMMIT;

BEGIN;
SET LOCAL work_mem = '512MB';
SELECT
    ogc_fid,
    TopoGeo_AddLineString(
        'london_itn_topology', wkb_geometry
    ) AS edge_id
FROM (
    SELECT ogc_fid, wkb_geometry FROM temp_itn.roadlink30
) AS f;
COMMIT;


-- RAILWAY
BEGIN;
SET LOCAL work_mem = '512MB';
SELECT
    gid,
    TopoGeo_AddLineString(
        'london_itn_topology', ST_LineMerge(geom)
    ) AS edge_id
FROM (
    SELECT gid, geom FROM london_itn.overground_rail
) AS f;
COMMIT;

-- RIVER THAMES
BEGIN;
SET LOCAL work_mem = '512MB';
SELECT
    id,
    TopoGeo_AddLineString(
        'london_itn_topology', ST_ExteriorRing(geom)
    ) AS edge_id
FROM (
    SELECT id, geom FROM london.river_thames
) AS f;
COMMIT;



---- Create block polygons ----
-- DROP SCHEMA london_blocks CASCADE;
-- DROP TABLE london_blocks.blocks CASCADE;
-- CREATE SCHEMA london_blocks
--     AUTHORIZATION postgres;
CREATE TABLE london_blocks.blocks (
    block_id int,
    wkb_geometry geometry,
    area_block real,
    compact_block real,
    perimeter_block real,
    borough_code character varying,
    borough_name character varying
);

-- Each face of the topology (except the universal face 0) becomes a block
DO
$do$
DECLARE i int;
BEGIN
    FOR i IN SELECT face_id FROM london_itn_topology.face WHERE face_id != 0 LOOP
        INSERT INTO london_blocks.blocks (block_id, wkb_geometry)
            SELECT i, ST_GetFaceGeometry('london_itn_topology', i);
    END LOOP;
END
$do$;
ALTER TABLE london_blocks.blocks
    ADD PRIMARY KEY (block_id);
CREATE INDEX london_blocks
    ON london_blocks.blocks
    USING gist
    (wkb_geometry);


-- Remove the blocks whose centroid falls within the river.
-- The original compared two multi-row subqueries, which fails at run time;
-- the join below tests each block against each river polygon.
DELETE FROM london_blocks.blocks AS b
USING london.river_thames AS r
WHERE ST_Within(ST_Centroid(b.wkb_geometry), r.geom);
UPDATE london_blocks.blocks SET area_block = ST_Area(wkb_geometry);
UPDATE london_blocks.blocks SET compact_block = area_block / (ST_Area(ST_MinimumBoundingCircle(wkb_geometry)));
UPDATE london_blocks.blocks SET perimeter_block = ST_Perimeter(wkb_geometry);
Listing 6: Data createTopology.sql
-- Duplicate topographic table as backup measure
CREATE TABLE london_buildings.building_shapes
AS
SELECT * FROM london_buildings.topographicarea;
-- Add new columns
ALTER TABLE london_buildings.building_shapes -- done
    ADD COLUMN rel_h real DEFAULT 0,
    ADD COLUMN area real DEFAULT 0,
    ADD COLUMN compactness real DEFAULT 0,
    ADD COLUMN n_floors integer DEFAULT 0;


-- Join with the building_heights table
UPDATE london_buildings.building_shapes s SET rel_h = h.rel_h2
    FROM london_buildings.building_heights h
    WHERE s.fid = h.os_topo_toid; -- done
-- The updates below work row by row. The original joined the table to itself
-- without a join condition, which does not pair each row with its own values.
-- Add area
UPDATE london_buildings.building_shapes SET area = ST_Area(wkb_geometry);
-- Add n_floors (average floor height of 3 m)
UPDATE london_buildings.building_shapes SET n_floors = rel_h / 3;
-- Add compactness
UPDATE london_buildings.building_shapes
    SET compactness = area / (ST_Area(ST_MinimumBoundingCircle(wkb_geometry)));
-- Add n_floors with average 3.5
-- (the original wrote this value into n_floors rather than the new column)
ALTER TABLE london_buildings.building_shapes
    ADD COLUMN n_floors_350 integer DEFAULT 0;
UPDATE london_buildings.building_shapes SET n_floors_350 = rel_h / 3.5;
Listing 7: Data JoinBuildingHeights.sql
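The two derived attributes in the listing above can be illustrated with a small worked example. The sketch below estimates the number of floors from the relative height and computes the compactness of a rectangular footprint as its area divided by the area of its minimum bounding circle. It assumes that the integer column rounds the quotient to the nearest whole number, and it only handles rectangles, for which the bounding circle's diameter equals the diagonal. The function names are illustrative and do not appear in the original scripts.

```python
import math

FLOOR_HEIGHT_M = 3.0  # assumed average storey height, as in the listing


def estimate_floors(rel_h, floor_height=FLOOR_HEIGHT_M):
    """Estimate the number of floors from the relative building height in metres."""
    # Assumption: the database rounds to the nearest integer on assignment.
    return int(round(rel_h / floor_height))


def rectangle_compactness(width, depth):
    """Footprint area divided by the area of its minimum bounding circle.

    For a rectangle, the bounding circle's diameter is the diagonal.
    """
    area = width * depth
    radius = math.hypot(width, depth) / 2.0
    return area / (math.pi * radius ** 2)


square = rectangle_compactness(10.0, 10.0)  # equals 2 / pi for any square
slab = rectangle_compactness(40.0, 10.0)    # elongated shapes score lower
```

A circle would score 1, a square about 0.64, and the value falls towards 0 as the footprint becomes more elongated.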
-- Subset the building_shapes table
CREATE TABLE london_buildings.shapes AS
SELECT
    building_shapes.ogc_fid,
    building_shapes.wkb_geometry,
    building_shapes.fid,
    building_shapes.physicallevel,
    building_shapes.rel_h,
    building_shapes.area,
    building_shapes.compactness,
    building_shapes.n_floors
FROM london_buildings.building_shapes
WHERE ((building_shapes.n_floors > 0) AND (building_shapes.area > (6)::double precision));
-- Create indexes
ALTER TABLE london_buildings.shapes
    ADD PRIMARY KEY (ogc_fid);
CREATE INDEX shapes_wkb_geometry_spatial_idx
    ON london_buildings.shapes
    USING gist
    (wkb_geometry);
-- Add shapes' centroids
ALTER TABLE london_buildings.shapes
    ADD COLUMN geom_centroids geometry;
CREATE INDEX shapes_centroids_spatial_idx
    ON london_buildings.shapes
    USING gist
    (geom_centroids);
UPDATE london_buildings.shapes SET geom_centroids = ST_Centroid(wkb_geometry); -- done
ALTER TABLE london_plots.plots
    ADD COLUMN geom_plot_centroids geometry;
UPDATE london_plots.plots SET geom_plot_centroids = ST_Centroid(wkb_geometry);
CREATE INDEX plot_centroids_spatial_idx
    ON london_plots.plots
    USING gist
    (geom_plot_centroids);
Listing 8: Data createBuildingShapesTable.sql
-- Spatial JOIN blocks-buildings
DROP TABLE london_index.block_index CASCADE;
CREATE TABLE london_index.block_index AS
(
    SELECT a.block_id,
        a.wkb_geometry, a.area_block, count(b.area) AS building_count,
        SUM(b.area) AS total_footprint,
        SUM(b.area * n_floors) AS total_floor_surface,
        SUM(b.area) / a.area_block AS gsi,
        stddev_samp(b.area) / avg(b.area) AS gsi_sd,
        SUM(b.area * n_floors) / a.area_block AS fsi,
        stddev_samp(b.area * b.n_floors) / avg(b.area * b.n_floors) AS fsi_sd,
        SUM(b.area * b.n_floors) / SUM(b.area) AS w_avg_n_floors
    FROM london_blocks.blocks AS a
    INNER JOIN london_buildings.shapes AS b
    ON ST_Within(b.geom_centroids, a.wkb_geometry)
    GROUP BY a.block_id, a.wkb_geometry, a.area_block
);
ALTER TABLE london_index.block_index
    ADD PRIMARY KEY (block_id);
CREATE INDEX block_index_spatial_idx
    ON london_index.block_index
    USING gist (wkb_geometry);
-- Spatial JOIN plots-buildings
DROP TABLE london_index.plot_index CASCADE;
CREATE TABLE london_index.plot_index AS
(
    SELECT a.ogc_fid AS plot_id,
        a.wkb_geometry, a.area AS area_plot, count(b.area) AS building_count,
        SUM(b.area) AS total_footprint,
        SUM(b.area * n_floors) AS total_floor_surface,
        SUM(b.area) / a.area AS gsi,
        stddev_samp(b.area) / avg(b.area) AS gsi_sd,
        SUM(b.area * n_floors) / a.area AS fsi,
        stddev_samp(b.area * b.n_floors) / avg(b.area * b.n_floors) AS fsi_sd,
        SUM(b.area * b.n_floors) / SUM(b.area) AS w_avg_n_floors
    FROM london_plots.plots AS a
    INNER JOIN london_buildings.shapes AS b
    ON ST_Within(b.geom_centroids, a.wkb_geometry)
    GROUP BY a.ogc_fid, a.wkb_geometry, a.area
);
ALTER TABLE london_index.plot_index
    ADD PRIMARY KEY (plot_id);
CREATE INDEX plot_index_spatial_idx
    ON london_index.plot_index
    USING gist (wkb_geometry);
Listing 9: Data joinBlocks.sql
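The aggregation in the listing above reduces to three ratios per block: the ground space index (total footprint over block area), the floor space index (total floor surface over block area) and the footprint-weighted average number of floors. The following sketch reproduces the arithmetic on a toy block. The function name and the input layout are assumptions made for illustration, and the figures are invented.

```python
def block_indicators(block_area, buildings):
    """Compute GSI, FSI and weighted average floors for one block.

    buildings is a list of (footprint_area, n_floors) tuples for the
    buildings whose centroid falls within the block.
    """
    total_footprint = sum(area for area, _ in buildings)
    total_floor_surface = sum(area * floors for area, floors in buildings)
    return {
        'gsi': total_footprint / block_area,
        'fsi': total_floor_surface / block_area,
        'w_avg_n_floors': total_floor_surface / total_footprint,
    }


# Toy block of 1000 m2 with a 100 m2 two-storey and a 200 m2 three-storey building.
example = block_indicators(1000.0, [(100.0, 2), (200.0, 3)])
```

For this toy block the footprint is 300 m2 and the floor surface 800 m2, which gives a GSI of 0.3, an FSI of 0.8 and a weighted average of about 2.67 floors.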
-------------------------------------- Block range 400 --------------------------------------
-- DROP TABLE support.temp_block_centroids CASCADE;
CREATE TABLE support.temp_block_centroids AS (
    SELECT block_id, ST_Centroid(geom_block) AS wkb_geometry, area_block,
        building_count, total_footprint, total_floor_surface, gsi, fsi, w_avg_n_floors
    FROM london_index.block_cluster_labels
);
ALTER TABLE support.temp_block_centroids
    ADD PRIMARY KEY (block_id);
CREATE INDEX temp_centroids_blocks_spatial_idx
    ON support.temp_block_centroids
    USING gist
    (wkb_geometry);
-- DROP TABLE support.temp_block_buffers400 CASCADE;
CREATE TABLE support.temp_block_buffers400 AS (
    SELECT block_id,
        ST_Buffer(ST_Centroid(geom_block), 400, 'quad_segs=2') AS wkb_geometry_buffer,
        geom_block
    FROM london_index.block_cluster_labels
);
ALTER TABLE support.temp_block_buffers400
    ADD PRIMARY KEY (block_id);
CREATE INDEX temp_block_buffers400_spatial_idx
    ON support.temp_block_buffers400
    USING gist
    (wkb_geometry_buffer);
CREATE INDEX temp_block_buffer_blocks_spatial_idx
    ON support.temp_block_buffers400
    USING gist
    (geom_block);

-- DROP TABLE london_block_range.block_range400 CASCADE;
-- CREATE SCHEMA london_block_range
--     AUTHORIZATION postgres;

CREATE TABLE london_block_range.block_range400 AS
(
    SELECT a.block_id,
        a.geom_block AS wkb_geometry,
        SUM(building_count) AS building_count,
        SUM(total_footprint) AS total_footprint,
        SUM(total_floor_surface) AS total_floor_surface,
        SUM(b.gsi * b.total_floor_surface) / SUM(b.total_floor_surface) AS gsi,
        stddev_samp(b.gsi * b.total_floor_surface) / AVG(b.gsi * b.total_floor_surface) AS gsi_sd,
        SUM(b.fsi * b.total_floor_surface) / SUM(b.total_floor_surface) AS fsi,
        stddev_samp(b.fsi * b.total_floor_surface) / AVG(b.fsi * b.total_floor_surface) AS fsi_sd,
        SUM(b.w_avg_n_floors * b.total_floor_surface) / SUM(b.total_floor_surface) AS w_avg_n_floors,
        stddev_samp(b.w_avg_n_floors * b.total_floor_surface) / AVG(b.w_avg_n_floors * b.total_floor_surface) AS w_avg_n_floors_sd
    FROM support.temp_block_buffers400 AS a
    INNER JOIN support.temp_block_centroids AS b
    ON ST_Within(b.wkb_geometry, a.wkb_geometry_buffer)
    GROUP BY a.block_id
);
ALTER TABLE london_block_range.block_range400
    ADD PRIMARY KEY (block_id);
CREATE INDEX plot_range400_block_spatial_idx
    ON london_block_range.block_range400
    USING gist (wkb_geometry);
Listing 10: Data joinBlockRange400.sql
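The query above replaces each block's indicators with an average over all blocks whose centroid lies within 400 m of its own centroid, weighted by total floor surface. A minimal sketch of that idea for the GSI is given below. It uses a plain Euclidean distance test rather than the coarse polygonal buffer of the listing, so results may differ slightly for centroids close to the 400 m limit. The function name, the input layout and the coordinates are illustrative assumptions.

```python
import math


def range_gsi(blocks, radius=400.0):
    """Floor-surface-weighted mean GSI of the blocks around each block.

    blocks maps block_id to a tuple (x, y, gsi, total_floor_surface),
    with x and y the centroid coordinates in metres.
    """
    result = {}
    for block_id, (x, y, _, _) in blocks.items():
        weighted_sum = 0.0
        weight_total = 0.0
        for bx, by, gsi, floor_surface in blocks.values():
            # A block always counts as its own neighbour (distance 0).
            if math.hypot(bx - x, by - y) <= radius:
                weighted_sum += gsi * floor_surface
                weight_total += floor_surface
        result[block_id] = weighted_sum / weight_total
    return result


# Three invented blocks on a line: 1 and 2 are 300 m apart, 3 is isolated.
toy_blocks = {
    1: (0.0, 0.0, 0.2, 1000.0),
    2: (300.0, 0.0, 0.4, 3000.0),
    3: (1000.0, 0.0, 0.6, 500.0),
}
smoothed = range_gsi(toy_blocks)
```

Blocks 1 and 2 share the same neighbourhood and both receive (0.2 x 1000 + 0.4 x 3000) / 4000 = 0.35, while block 3 keeps its own value of 0.6.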
6.4.3 Classification
# By Duccio Aiazzi as part of the MSc Smart Cities and Urban Analytics at CASA - UCL
# This script is used to fetch the data from the database and run the analysis
import pandas as pd
from sqlalchemy import create_engine
import pandas.io.sql as psql
pd.set_option('display.width', 640)

#### Create common variables ####
root_dir = '/Users/duccioa/CLOUD/01_Cloud/01_Work/02_DataScience/UCL_SmartCities/08_Dissertation'
engine = create_engine('postgresql://postgres:postgres@localhost:5432/msc')
gsi_legend = {'low coverage': (0, 0.188), 'medium coverage': (0.1881, 0.277),
              'high coverage': (0.2771, 1)}  # quantiles
building_height_legend = {'low rise': (0, 2.5), 'mid-low rise': (2.5, 6.5),
                          'mid-high rise': (6.5, 12.5), 'high rise': (12.5, 200)}

#### Load Data ####
# Create csv from sql query (reading the query takes too long, easier to write/read csv)
psql.execute("copy (Select * From london_index.block_index) To '" + root_dir +
             "/03_Data/DbDump/block_index.csv' HEADER CSV;",
             engine)
df = pd.read_csv(root_dir + '/03_Data/DbDump/block_index.csv')
########################## BLOCK CLASSIFICATION ##########################
####### Single blocks #######
index = df.loc[(df.gsi <= 1) & (df.fsi > 0)]


# Label each block by average building height
building_height_labels = []
for i in index.w_avg_n_floors:
    for key, values in building_height_legend.items():
        if i >= min(values) and i < max(values):
            building_height_labels.append(key)
print(len(building_height_labels))
# Label each block by ground coverage
gsi_labels = []
for i in index.gsi:
    for key, values in gsi_legend.items():
        if i >= min(values) and i < max(values) + 0.0001:
            gsi_labels.append(key)
print(len(gsi_labels))
# Combine the two labels into one class
classification = []
for i in range(0, len(gsi_labels)):
    st = building_height_labels[i] + ' - ' + gsi_labels[i]
    classification.append(st)
print(len(classification))

classification = pd.DataFrame({'block_id': index.block_id, 'label': classification},
                              index=index.index)
index = pd.merge(index, classification)
index.replace('NaN', 0, inplace=True)
index = index[~index.block_id.isin([60762, 30571, 60769, 30497])]
summary = index.groupby('label').describe()
summary.to_csv(root_dir + '/03_Data/DbDump/block_summary.csv')
index.to_csv(root_dir + '/03_Data/DbDump/block_classification.csv',
             index=False, index_label=False)

psql.execute("DROP TABLE london_index.block_cluster_labels CASCADE;"
             "CREATE TABLE london_index.block_cluster_labels ("
             "block_id text,"
             "geom_block geometry,"
             "area_block float,"
             "building_count int,"
             "total_footprint float,"
             "total_floor_surface float,"
             "gsi float,"
             "gsi_sd float,"
             "fsi float,"
             "fsi_sd float,"
             "w_avg_n_floors float,"
             "label varchar(255));", engine)
psql.execute("copy london_index.block_cluster_labels from '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/block_classification.csv' CSV HEADER;", engine)
print('done')
psql.execute("ALTER TABLE london_index.block_cluster_labels ADD PRIMARY KEY (block_id);", engine)
print('done')
psql.execute("CREATE INDEX block_cluster_geom_idx ON london_index.block_cluster_labels USING gist (geom_block);", engine)
print('done')

####### Range 400 #######
#### Load Data ####
# Create csv from sql query (reading the query takes too long, easier to write/read csv)
psql.execute("copy (Select * From london_block_range.block_range400) To '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/block_range400.csv' HEADER CSV;",
             engine)
df400 = pd.read_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/block_range400.csv')
index400 = df400.loc[(df400.gsi <= 1) & (df400.fsi > 0)]
## Manual classification
building_height_labels400 = []
for i in index400.w_avg_n_floors:
    for key, values in building_height_legend.items():
        if i >= min(values) and i < max(values):
            building_height_labels400.append(key)
print(len(building_height_labels400))
gsi_labels400 = []
for i in index400.gsi:
    for key, values in gsi_legend.items():
        if i >= min(values) and i < max(values) + 0.0001:
            gsi_labels400.append(key)
print(len(gsi_labels400))
classification400 = []
for i in range(0, len(gsi_labels400)):
    st = building_height_labels400[i] + ' - ' + gsi_labels400[i]
    classification400.append(st)
print(len(classification400))

classification400 = pd.DataFrame({'block_id': index400.block_id, 'label': classification400},
                                 index=index400.index)
index400 = pd.merge(index400, classification400)
index400.replace('NaN', 0, inplace=True)
summary = index400.groupby('label').describe()
summary.to_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/block400_summary.csv')
index400.to_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/block_classification400.csv',
                index=False, index_label=False)
psql.execute("DROP TABLE london_index.block_cluster_labels400;"
             "CREATE TABLE london_index.block_cluster_labels400 ("
             "block_id text,"
             "geom_block geometry,"
             "building_count int,"
             "total_footprint float,"
             "total_floor_surface float,"
             "gsi float,"
             "gsi_sd float,"
             "fsi float,"
             "fsi_sd float,"
             "w_avg_n_floors float,"
             "w_avg_n_floors_sd float,"
             "label varchar(255));", engine)
psql.execute("copy london_index.block_cluster_labels400 from '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/block_classification400.csv' CSV HEADER;", engine)
print('done')
psql.execute("ALTER TABLE london_index.block_cluster_labels400 ADD PRIMARY KEY (block_id);", engine)
print('done')
psql.execute("CREATE INDEX block_cluster400_geom_idx ON london_index.block_cluster_labels400 USING gist (geom_block);", engine)
print('done')
Listing 11: Data classification blocks.py
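The labelling loops in the listing above amount to looking up two values in ordered sets of class breaks and joining the resulting labels. The sketch below expresses that lookup with the standard bisect module, using the thresholds of the two legends. It is an alternative formulation rather than the script's own code: the function name is hypothetical, and values above the last break are assigned to the top class, whereas the original loops skip values outside the legend ranges.

```python
from bisect import bisect_right

# Upper class breaks taken from the legends in the listing.
HEIGHT_BREAKS = [2.5, 6.5, 12.5]
HEIGHT_LABELS = ['low rise', 'mid-low rise', 'mid-high rise', 'high rise']
GSI_BREAKS = [0.1881, 0.2771]
GSI_LABELS = ['low coverage', 'medium coverage', 'high coverage']


def classify_block(w_avg_n_floors, gsi):
    """Return the combined 'height - coverage' label of a block."""
    # bisect_right counts the breaks that are <= the value, so a value
    # equal to a break falls into the upper class, as in the listing.
    height = HEIGHT_LABELS[bisect_right(HEIGHT_BREAKS, w_avg_n_floors)]
    coverage = GSI_LABELS[bisect_right(GSI_BREAKS, gsi)]
    return height + ' - ' + coverage


labels = [classify_block(f, g) for f, g in [(1.0, 0.1), (3.0, 0.2), (20.0, 0.5)]]
```

Because every input receives exactly one label, the list of labels always has the same length as the list of blocks, which is what the subsequent merge on block_id relies on.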
6.4.4 Income and population density
-- DROP TABLE london_index.msoa11_index CASCADE;
CREATE TABLE london_index.msoa11_index AS
(
    SELECT a.gid, a.msoa11cd,
        a.geom AS wkb_geometry,
        a.median_income_2012_13, a.people_per_sq_km, a.mid_2014_population,
        SUM(b.gsi * b.total_floorspace) / SUM(b.total_floorspace) AS gsi,
        SUM(b.fsi * b.total_floorspace) / SUM(b.total_floorspace) AS fsi,
        SUM(b.w_avg_n_floors * b.total_floorspace) / SUM(b.total_floorspace) AS w_avg_n_floors
    FROM london.msoa11_pop AS a
    INNER JOIN support.temp_plot_centroids AS b
    ON ST_Within(b.wkb_geometry, a.geom)
    WHERE b.gsi < 1
    GROUP BY a.gid, a.msoa11cd, a.geom, a.median_income_2012_13,
        a.people_per_sq_km, a.mid_2014_population
);
ALTER TABLE london_index.msoa11_index
    ADD PRIMARY KEY (gid);
CREATE INDEX msoa11_index_spatial_idx
    ON london_index.msoa11_index
    USING gist (wkb_geometry);

-- DROP TABLE london_index.lsoa11_index CASCADE;
CREATE TABLE london_index.lsoa11_index AS
(
    SELECT a.gid, a.lsoa11cd,
        a.geom AS wkb_geometry,
        a.median_income_2012_13, a.people_per_sq_km, a.mid_2014_population,
        SUM(b.gsi * b.total_floorspace) / SUM(b.total_floorspace) AS gsi,
        SUM(b.fsi * b.total_floorspace) / SUM(b.total_floorspace) AS fsi,
        SUM(b.w_avg_n_floors * b.total_floorspace) / SUM(b.total_floorspace) AS w_avg_n_floors
    FROM london.lsoa11_pop AS a
    INNER JOIN support.temp_plot_centroids AS b
    ON ST_Within(b.wkb_geometry, a.geom)
    WHERE b.gsi < 1
    GROUP BY a.gid, a.lsoa11cd, a.geom, a.median_income_2012_13,
        a.people_per_sq_km, a.mid_2014_population
);
ALTER TABLE london_index.lsoa11_index
    ADD PRIMARY KEY (gid);
CREATE INDEX lsoa11_index_spatial_idx
    ON london_index.lsoa11_index
    USING gist (wkb_geometry);
Listing 12: Data_joinIncomeTable.sql
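The aggregation in Listing 12 assigns each census zone the floor-space-weighted mean of the plot-level indices, after discarding plots whose GSI is 1 or more. The following is a minimal pandas sketch of that arithmetic only, not the author's code: the function name `weighted_zone_means`, the `zone` column and the three sample plots are hypothetical, and the spatial join (`ST_Within`) is assumed to have been done already.

```python
import pandas as pd


def weighted_zone_means(plots, value_cols, weight_col="total_floorspace", zone_col="zone"):
    """Return the weighted mean of each value column per zone.

    For every zone: sum(value * weight) / sum(weight), mirroring the
    SUM(b.x * b.total_floorspace) / SUM(b.total_floorspace) pattern.
    """
    # Multiply every value column by the weight, row by row.
    weighted = plots[value_cols].multiply(plots[weight_col], axis=0)
    weighted[zone_col] = plots[zone_col]
    sums = weighted.groupby(zone_col)[value_cols].sum()
    totals = plots.groupby(zone_col)[weight_col].sum()
    # Divide each zone's weighted sums by that zone's total weight.
    return sums.div(totals, axis=0)


# Hypothetical sample data: three plots already assigned to two zones.
plots = pd.DataFrame({
    "zone": ["A", "A", "B"],
    "gsi": [0.2, 0.4, 0.5],
    "fsi": [1.0, 2.0, 3.0],
    "total_floorspace": [100.0, 300.0, 50.0],
})
plots = plots[plots["gsi"] < 1]  # same filter as the WHERE clause
result = weighted_zone_means(plots, ["gsi", "fsi"])
print(result)
```

In zone A the larger plot dominates: the weighted GSI is (0.2 × 100 + 0.4 × 300) / 400 = 0.35, not the unweighted mean of 0.30.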
# By Duccio Aiazzi as part of the MSc Smart Cities and Urban Analytics at CASA - UCL
# This script is used to fetch the data from the database and run the analysis
# First run Data_import_LondonAdministrativeBoundaries.py
import pandas.io.sql as psql
from subprocess import run
import datetime
import psycopg2
from sqlalchemy import create_engine
import pandas as pd
pd.set_option('display.width', 640)

#### Create common variables ####
engine = create_engine('postgresql://postgres:postgres@localhost:5432/msc')
gsi_legend = {'low coverage': (0, 0.33), 'medium coverage': (0.330001, 0.66), 'high coverage': (0.660001, 1)}
building_height_legend = {'low rise': (0, 2.5), 'mid-low rise': (2.5, 6.5), 'mid-high rise': (6.5, 12.5), 'high rise': (12.5, 200)}

## London LSOA11
path = '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/Administrative_Census_Boundaries/statistical-gis-boundaries-london/ESRI/'
filename = 'LSOA_2011_London_gen_MHW.shp'
shapefile = path + filename
schema = 'london.'
table = 'lsoa11'
options = '-I -s 27700'
server = '| psql -d msc -U postgres -W'
terminal_command = 'shp2pgsql %s %s %s%s %s' % (options, shapefile, schema, table, server)
dt = datetime.datetime.now()
print('Importing "%s"' % (filename))
print('Starting Time: ' + str(dt.hour).zfill(2) + ':' + str(dt.minute).zfill(2))
return_code = run(terminal_command, shell=True)  # import the first file to create the table
dt = datetime.datetime.now()

## Population data
# Import to Postgresql
df = pd.read_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/Administrative_Census_Boundaries/land-area-population-density-lsoa11.csv')
df.columns = [c.lower() for c in df.columns]
df.columns = [c.replace(' ', '_') for c in df.columns]
df.columns = [c.replace('-', '_') for c in df.columns]
engine = create_engine('postgresql://postgres:postgres@localhost:5432/msc')
df.to_sql("lsoa11_population", engine, schema='support')
df = pd.read_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/London/Office_National_Statistics/lsoa-household-income-estimates.csv')
df.columns = [c.lower() for c in df.columns]
df.columns = [c.replace(' ', '_') for c in df.columns]
df.columns = [c.replace('-', '_') for c in df.columns]
engine = create_engine('postgresql://postgres:postgres@localhost:5432/msc')
df.to_sql("lsoa11_income", engine, schema='support')

conn = psycopg2.connect(database="msc", user="postgres", password="postgres", host="localhost", port="5432")
cur = conn.cursor()
cur.execute('''
    DROP TABLE IF EXISTS london.lsoa11_pop CASCADE;
    CREATE TABLE london.lsoa11_pop AS (
        SELECT t1.gid, t1.lsoa11cd, t2.lsoa_name, t1.geom, t2.median_income_2012_13, t2.people_per_sq_km, t2.mid_2014_population
        FROM london.lsoa11 t1
        JOIN (SELECT t1.index, t2.lsoa11_code, t2.lsoa_name, t2.median_income_2012_13,
              t1.people_per_sq_km, t1.mid_2014_population FROM support.lsoa11_population t1
              JOIN support.lsoa11_income t2
              ON t1.lsoa11_code=t2.lsoa11_code) t2
        ON t1.lsoa11cd=t2.lsoa11_code
    );
    ALTER TABLE london.lsoa11_pop
        ADD PRIMARY KEY (gid);
    CREATE INDEX lsoa11_pop_geom_idx
        ON london.lsoa11_pop
        USING gist(geom);
    DROP TABLE support.lsoa11_population CASCADE;
    DROP TABLE support.lsoa11_income CASCADE;
    DROP TABLE london.lsoa11 CASCADE;
''')
conn.commit()
conn.close()
# Spatial join
conn = psycopg2.connect(database="msc", user="postgres", password="postgres", host="localhost", port="5432")
cur = conn.cursor()
cur.execute('''
    DROP TABLE IF EXISTS london_index.lsoa11_index CASCADE;
    CREATE TABLE london_index.lsoa11_index AS
    (
        SELECT a.gid, a.lsoa11cd,
            a.geom AS wkb_geometry, a.lsoa_name,
            a.median_income_2012_13, a.people_per_sq_km, a.mid_2014_population,
            SUM(b.gsi*b.total_floor_surface)/SUM(b.total_floor_surface) AS gsi,
            SUM(b.fsi*b.total_floor_surface)/SUM(b.total_floor_surface) AS fsi,
            SUM(b.w_avg_n_floors*b.total_floor_surface)/SUM(b.total_floor_surface) AS w_avg_n_floors
        FROM london.lsoa11_pop AS a
        INNER JOIN support.temp_plot_centroids AS b
        ON ST_Within(b.wkb_geometry, a.geom)
        WHERE b.gsi < 1
        GROUP BY a.gid, a.lsoa11cd, a.geom, a.median_income_2012_13,
            a.people_per_sq_km, a.mid_2014_population, a.lsoa_name
    );
    ALTER TABLE london_index.lsoa11_index
        ADD PRIMARY KEY (gid);
    CREATE INDEX lsoa11_index_spatial_idx
        ON london_index.lsoa11_index
        USING gist(wkb_geometry);
''')
conn.commit()
conn.close()
print('done')
# Create csv from sql query (reading the query takes too long, easier to write/read csv)
psql.execute("copy (Select * From london_index.lsoa11_index) To '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/lsoa11_index.csv' HEADER CSV;",
             engine)
index = pd.read_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/lsoa11_index.csv')

## Manual classification
gsi_legend = {'low coverage': (0, 0.188), 'medium coverage': (0.188001, 0.277), 'high coverage': (0.277001, 1)}  # quantiles
building_height_legend = {'low rise': (0, 2.5), 'mid-low rise': (2.5, 6.5), 'mid-high rise': (6.5, 12.5), 'high rise': (12.5, 200)}

building_height_labels = []
for i in index.w_avg_n_floors:
    for key, values in building_height_legend.items():
        if i >= min(values) and i < max(values):
            building_height_labels.append(key)
print(len(building_height_labels))
gsi_labels = []
for i in index.gsi:
    for key, values in gsi_legend.items():
        if i >= min(values) and i < max(values) + 0.0000001:
            gsi_labels.append(key)
print(len(gsi_labels))
classification = []
for i in range(0, len(gsi_labels)):
    st = building_height_labels[i] + ' - ' + gsi_labels[i]
    classification.append(st)
print(len(classification))

classification = pd.DataFrame({'gid': index.gid, 'label': classification}, index=index.index)
index = pd.merge(index, classification)
summary = index.groupby('label').describe()
summary.to_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/lasoa_summary.csv')
index.to_csv('/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/lsoa_classification.csv',
             index=False, index_label=False)
psql.execute("DROP TABLE IF EXISTS london_index.lsoa_labels CASCADE;"
             "CREATE TABLE london_index.lsoa_labels ("
             "gid int,"
             "lsoa11cd text,"
             "wkb_geometry geometry,"
             "lsoa_name text,"
             "median_income_2012_13 int,"
             "people_per_sq_km float,"
             "mid_2014_population int,"
             "gsi float,"
             "fsi float,"
             "w_avg_n_floors float,"
             "label varchar(255));"
             "DROP TABLE IF EXISTS london_index.lsoa11_index CASCADE;", engine)
psql.execute("copy london_index.lsoa_labels from '/Users/duccioa/CLOUD/C07_UCL_SmartCities/08_Dissertation/03_Data/DbDump/lsoa_classification.csv' CSV HEADER;",
             engine)
print('done')
psql.execute("ALTER TABLE london_index.lsoa_labels ADD PRIMARY KEY (gid);", engine)
print('done')
psql.execute("CREATE INDEX lsoa_labels_spatial_idx ON london_index.lsoa_labels USING gist(wkb_geometry);", engine)
print('done')
Listing 13: Data_classification_income.py
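The manual classification in Listing 13 loops over each value and tests it against every interval in the legend. Because the GSI intervals leave tiny gaps between them (for example between 0.1880001 and 0.188001), a value falling in a gap gets no label and the label lists can end up misaligned. The sketch below shows one possible alternative using `pandas.cut`, which assigns every in-range value to exactly one contiguous bin. It is not the author's method: the function name `classify` and the sample rows are hypothetical, and the bin edges are taken from the legends in the listing.

```python
import pandas as pd

# Bin edges taken from the legends in Listing 13 (the GSI edges are the quantiles).
HEIGHT_BINS = [0, 2.5, 6.5, 12.5, 200]
HEIGHT_LABELS = ["low rise", "mid-low rise", "mid-high rise", "high rise"]
GSI_BINS = [0, 0.188, 0.277, 1]
GSI_LABELS = ["low coverage", "medium coverage", "high coverage"]


def classify(df):
    """Return a 'height - coverage' label for every row of df."""
    # Height intervals are closed on the left: [0, 2.5), [2.5, 6.5), ...
    height = pd.cut(df["w_avg_n_floors"], bins=HEIGHT_BINS,
                    labels=HEIGHT_LABELS, right=False)
    # GSI intervals are closed on the right: [0, 0.188], (0.188, 0.277], ...
    coverage = pd.cut(df["gsi"], bins=GSI_BINS, labels=GSI_LABELS,
                      right=True, include_lowest=True)
    return height.astype(str) + " - " + coverage.astype(str)


# Hypothetical sample rows standing in for the LSOA index table.
zones = pd.DataFrame({
    "gid": [1, 2, 3],
    "w_avg_n_floors": [2.0, 2.5, 13.0],
    "gsi": [0.10, 0.188, 0.30],
})
zones["label"] = classify(zones)
print(zones[["gid", "label"]])
```

Because the labels are computed column-wise, they stay aligned with `gid` and no separate merge step is needed.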