On the Discovery of Urban Typologies: Data Mining the Multi dimensional Character of Neighbourhoods

May 17, 2023


CITTA 3rd Annual Conference on Planning Research
Bringing City Form Back Into Planning

On the discovery of urban typologies

Jorge Gil, Nuno Montenegro, José Nuno Beirão, José Pinto Duarte

Faculty of Architecture

TU Delft / TU Lisbon, [email protected]

Phone number: 00 31 152783885

When pursuing a more sustainable and integrative urban development, the first stage of the urban design process should consist of a pre-design phase where the context of the site is analysed both qualitatively and quantitatively. This information provides a base line for the contextualisation of the urban programme, of the design solutions and of the evaluation benchmarks proposed for the site. Our research project aims to develop an urban design system using an urban ontology that can be applied to the formulation, generation and evaluation of urban plans. The purpose of this urban design system is: (1) formulation - to read data from the site context on a GIS platform and then generate adequate program descriptions, given the contextual conditions; (2) generation - to generate alternative design solutions that match the program, and (3) evaluation - to evaluate evolving design solutions against the program to obtain satisfactory results. In this paper we present a methodology for data mining an urban Geographic Information System (GIS) data set, consisting of three main phases: representation, analysis and description. The process reveals a series of block and street typologies that highlight the different character of two neighbourhoods. This methodology is demanding in the preparation phase and requires a high level of GIS and statistics expertise in the analysis phase. However, it successfully addresses the complex multi-scale and multi-level nature of cities in a systematic way, providing a tool for systematic profiling of neighbourhoods, which is site and problem specific.

Keywords: Sustainable urban development; GIS; data mining; urban typologies; neighbourhood profiling

1 Introduction

When pursuing a more sustainable and integrative urban development, the first stage of the urban design process should consist of a pre-design phase (Montenegro and Duarte, 2008) where the context of the site is analysed both qualitatively and quantitatively. This information provides a base-line for the contextualisation of the urban programme and of the design solutions proposed for the site. The definition and description of urban patterns appears to be a useful way to translate the urban design requirements into a format adequate for flexible, parametric, rule-based design processes.

We have recently seen statistical classification techniques applied to building typology (Reffat, 2008) and urban block form (Laskari, 2007), where archetypes are identified. Can similar techniques provide an efficient method for the classification and description of urban typologies from the base-line information gathered in the pre-design phase?

In this paper we present a methodology for data mining a Geographic Information System (GIS)

data set of two neighbourhoods in the city of Lisbon, Portugal, in order to identify urban typologies

of blocks and streets.

Firstly, we review previous work on urban analysis and urban patterns and introduce the concept of

data mining as a tool for multivariate emergent classification that can be applied to architecture and

planning. The objective is to develop and test a data mining methodology for urban environment

data sets to extract site specific typologies and archetypes based on a variety of attributes from

different disciplines of urban morphological studies to obtain a more integrated perspective.


We then present the overall stages of the methodology, describe the operations and the outcomes

of each stage, and at the end present the results obtained from the case study. We conclude with

an appraisal of the process, highlighting its strengths, shortcomings and future work.

2 Spatial Assessment of Neighbourhoods

Urban space is perceived, apprehended, and seized by humans. Lynch (1960) wrote that users understand their surroundings in consistent and predictable ways by forming mental maps. The classic method of site analysis, deeply embedded in Lynch's principles, is elaborated through a collection of visual annotations. Nevertheless, such a process has a variety of flaws due to the cognitive and cultural constraints of the observer. To overcome these limitations in the planning process, it is necessary to implement a collection of quantitative analysis and assessment tools to assist the planner.

With the introduction of the concept of pattern languages, Alexander, Ishikawa and Silverstein (1977) offer 'patterns' as a tool for the systematic description of urban entities. Their catalogue, which aims to achieve a certain ideal of urbanity, is an explicit attempt to address the multi-scale, multifaceted and relational complexity of urban environments. The urban design community has not adopted this catalogue because the design codes seem outdated and suited only to a very specific geographic and socio-cultural context. The concept of pattern as a best practice tool has nevertheless flourished in other fields, like computer science. We consider that this concept can still be useful in the urban development process if we find efficient ways to update the available patterns to be problem and context specific.

An important step in this approach comes from Rapoport (1990) through Environment and

Behaviour Studies (EBS). He suggests that grain, texture and complexity are qualities of the

environment that can be derived or read from data and quantities. His work introduces the

systematic analysis of the environment, looking at the object and the context around the object,

taking data from the past (historical data) and from the present, to construct concepts, models and

theories. The derived precedents or archetypes must not be applied directly in a historicist fashion,

but serve as patterns and process.

2.1 Description and classification of urban typologies

There are several studies that advance into the detailed analysis of urban environments, offering

different methods to describe and classify urban entities to obtain urban typologies.

Urhahn and Bobic (1994) identify principles of good city life and catalogue urban neighbourhoods through a quantitative and qualitative description. They cover several scales, from city to district, block and building, and use different classification themes for description, including form, density, land use, and mobility infrastructure. The final presentation is textual for complex dimensions such as context and accessibility, quantitative for the built form, and highly visual, displaying the various attributes of each area in a disaggregate format. Interestingly, it formally ignores the street as a classification entity, although the street receives a brief mention in some descriptions.

Streets receive full attention from Stephen Marshall (2004), who underlines the importance of urban layout and configuration for urban quality. He criticises the misuse of typologies to mimic the appearance and structure of urban neighbourhoods, missing the essence of streets in the process. Furthermore, he exposes the limitations of classifications and catalogues, as they offer a univariate interpretation of a theme, resulting in a fragmented vision of the phenomenon. Marshall uses quantitative attributes relating to the configuration, composition, complexity, and constitution of streets, combined in triangular multivariate charts, to define street typologies.

A similar multivariate approach is taken by Berghauser Pont & Haupt (2004) in relation to

neighbourhoods and urban blocks, around the theme of development density using a set of four

built up area indices. A novel aspect is that they create an interactive on-line tool so that users can

systematically compare existing or planned neighbourhoods against the ones on their catalogue for

a characterisation through precedent (http://www.permeta.nl/spacemate/index2.html: April 2010).

Marshall and Berghauser Pont & Haupt have to restrict their themes to three or four variables in

order to achieve a way of visualising and defining their typologies. But the sustainable urban

development process has a degree of complexity that requires the consideration of many

morphological, socio-economic, environmental and cultural attributes.

2.2 Data mining

Data mining is an iterative process of extracting emergent patterns from data by statistical means, and is commonly used to perform three alternative tasks (Fayyad, 1996):

1) Classification - arranging the data into predefined groups,

2) Clustering - where the groups are not predefined and the algorithm tries to group similar

items together,

3) Regression - to find a function that models the data with the least error.

Technically data mining is the process of finding data correlations or data patterns amongst dozens

of fields in large relational databases. In this paper we perform data mining using a clustering

technique.

The relevance of these techniques to the planning process is that they allow users to analyse the environment from different angles simultaneously, categorise it, and summarise the relationships identified. Data mining seems to facilitate the discovery of data patterns that would otherwise be difficult to reveal in a complex urban space, which today is shaped by a fast-changing mix of economic and social phenomena.

Some examples of the use of data mining in architecture research can be found for buildings,

defining archetypal office building layouts (Hannah, 2007) and Arabic house typologies (Reffat,

2008), and for urban block morphology, in terms of shape and density (Laskari, 2007). These

examples demonstrate that using methods of semi-automatic classification according to multiple

variables reveals typologies in a systematic way and may be used to better understand typologies

of the urban space and the relationships amongst their variables.

3 Research objectives


The “City Induction” research project aims to develop an urban design tool using an urban ontology

that can be applied to the formulation, generation and evaluation of urban plans. The purpose of

this urban design tool is:

1) formulation - to read data from the site context on a Geographic Information System (GIS)

platform and then generate adequate program descriptions, given the contextual

conditions;

2) generation - to generate alternative design solutions that match the program;

3) evaluation - to evaluate evolving design solutions against the program to obtain satisfactory

results.

Within this framework, this paper's aim is to develop and test a context analysis methodology for urban environment GIS data sets through the application of data mining techniques to two levels of the urban ontology, streets and blocks. We follow the recommendations in Witten and Frank (2005) in a process of reverse engineering: from the existing environment we extract descriptions of street and block typologies to be used as precedents in formulation, and obtain sets of rule constraints that can support a parametric rule-based design process in generation.

4 Data mining methodology

The proposed methodology has three main phases, namely representation, analysis and

description (1 – 3). It involves the work of all three modules of the tool, formulation, generation and

evaluation, in the following tasks:

1) Representation

a. Selection of classification attributes

b. Preparation of the plans

c. Integration in the GIS, when required

2) Analysis

a. Spatial analysis of plans

b. Statistical analysis and clustering of attributes

3) Description

a. Statistical profiling of clusters

b. Semantic description of typologies

c. Extraction of design rule constraints

Next we go through the three phases as applied to our case study, describing the key steps and

lessons learned.

4.1 Two Lisbon neighbourhoods

The case study consists of two adjacent neighbourhoods of different character in Lisbon, Portugal (Figure 1). The first is the Expo 98 PP4, the northernmost detail plan for the 1998 world exhibition site, which is a contemporary neighbourhood, planned from scratch on a brownfield site and developed over the last 10 years. The adjacent Moscavide is a neighbourhood founded in 1928 and developed more slowly over the following decades, suffering from densification, in particular inside the urban blocks, due to a strongly bounded geographic location without room for expansion.

Can this method identify different typologies between the two sites? Are there elements in

common?

5 Representation

The representation phase involves the selection of classification attributes and the preparation of

the geometric data that constitutes the plan, according to the urban ontology elements, keeping

their topological relations. All the information is gathered in a GIS to build an urban morphology

data base.

5.1 Selection of classification attributes

“The best way to select relevant attributes is manually, based on a deep understanding of the

learning problem and what the attributes actually mean.” (Witten and Frank, 2005)

Since the attributes must be meaningful in relation to the sustainable urban development problem, the urban program includes an extensive list of attributes that are linked to sustainable urban form, covering bioclimatic, morphological, configurational, socio-economic and cultural aspects (Higueras, 2006; Urhahn and Bobic, 1994; Berghauser Pont and Haupt, 2004; Marshall, 2004; Hillier and Iida, 2005). To develop the methodology, we select from this list block and street attributes focusing on aspects of morphology and land use for both entities, density for blocks, and configuration for streets (Table 1). We combine a set of attributes that cover different domains of urban form research to demonstrate the potential of this method for cross-disciplinary studies.

Table 1. List of the selected block and street attributes

Attribute             Entity         Code  Unit / calculation
Length                Street, Block  LEN   m
Width                 Street, Block  W     m
Orientation           Street, Block  DIR   degrees
Solar Orientation     Street, Block  SOLO  N, S, E, W
Number of Buildings   Street, Block  BLDN  integer
Area                  Block          TA    m²
Built-up area         Block          BA    m²
Perimeter             Block          PER   m
Proportion            Block          PROP  LEN / W
Compactness           Block          CMP   TA / PER
Floor Area Ratio      Block          FAR   GFA / TA
Ground Space Index    Block          GSI   BA / TA
Layers                Block          L     GFA / BA
Open Space Ratio      Block          OSR   (TA - BA) / GFA
Private space area    Block          PRVA  m²
Public space area     Block          PUBA  m²
Pavement width        Street         PAVW  m
Pedestrian area       Street         PEDA  m²
Continuity            Street         CNT   Links
Connectivity          Street         CON   Degree
Global accessibility  Street         ACCG  Closeness
Local accessibility   Street         ACCL  Closeness
Global movement flow  Street         MOVG  Betweenness
Local movement flow   Street         MOVL  Betweenness

GFA = gross floor area.
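The ratio attributes in Table 1 are simple arithmetic on the raw block measurements. The following is a minimal sketch, not the authors' implementation: the `Block` class, its field names and the example numbers are assumptions for illustration, and the area in the compactness formula is taken to be the block area TA.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Raw block measurements (hypothetical container, codes as in Table 1)."""
    ta: float      # TA, block area (m2)
    ba: float      # BA, built-up area (m2)
    gfa: float     # GFA, gross floor area (m2), assumed to be known per block
    per: float     # PER, perimeter (m)
    length: float  # LEN (m)
    width: float   # W (m)


def block_indices(b: Block) -> dict:
    """Derive the ratio attributes of Table 1 from the raw measurements.

    L and OSR are undefined when BA or GFA is zero (e.g. an unbuilt block);
    this sketch does not handle that case.
    """
    return {
        "PROP": b.length / b.width,
        "CMP": b.ta / b.per,
        "FAR": b.gfa / b.ta,
        "GSI": b.ba / b.ta,
        "L": b.gfa / b.ba,
        "OSR": (b.ta - b.ba) / b.gfa,
    }


# Illustrative numbers only, not case-study data.
example = Block(ta=5000.0, ba=2000.0, gfa=8000.0, per=300.0, length=100.0, width=50.0)
indices = block_indices(example)
```

For this invented block the four density indices come out as FAR 1.6, GSI 0.4, L 4.0 and OSR 0.375.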

5.2 Preparation of the plans

The proposed methodology requires the use of GIS vector features representing urban entities

linked to descriptive information like number of floors, number of dwellings and land use. This is

different from a conventional geometric representation used in urban design where the urban

elements are not whole topological entities but are composed of lines that define shared

boundaries, and their attributes are encoded in graphic formatting or layers.

In our case study we used CAD software to create layers that reflect the urban ontology and organise the levels of information that need to be captured, e.g. building type, and then edited the available plans to match these criteria. The other important task was to correct and complete those plans to define polygonal objects for buildings, plots, blocks, streets and pavements that correspond to correct topological entities. The typical operation is closing polylines and joining the edges of adjacent polygons; the task can be more elaborate depending on the method used to represent limits and boundaries in the original drawings. For this manual editing purpose CAD software seems to be more flexible.

The street network was represented in two ways: as a linear model based on road centre lines, which had to be verified for connectedness, and as a space syntax model based on an axial map of the city.
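One way to verify the connectedness of a centre-line model is a breadth-first search over shared end points. This is a minimal sketch under the assumption that segments meet at exactly identical coordinates; real CAD data would usually need a snapping tolerance, which is not shown. The function name and the sample segments are hypothetical.

```python
from collections import defaultdict, deque


def is_connected(segments):
    """Return True if all centre-line segments form one connected network.

    Each segment is a pair of (x, y) end points; two segments are linked
    when they share an end point exactly.
    """
    if not segments:
        return True
    adjacency = defaultdict(set)
    for a, b in segments:
        adjacency[a].add(b)
        adjacency[b].add(a)
    start = segments[0][0]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    # Connected when the search reached every end point in the model.
    return len(seen) == len(adjacency)


joined = [((0, 0), (1, 0)), ((1, 0), (1, 1))]
broken = joined + [((5, 5), (6, 5))]  # an isolated segment
```

Here `is_connected(joined)` holds, while the isolated segment in `broken` makes the check fail and flags a drawing error to correct.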

5.3 Integration in the GIS

When importing the geometry it is important to specify the correct coordinate system for each source so that they overlay correctly; in this case we used the Portuguese National System D73. If the coordinates in CAD don't comply with a geographic coordinate system, the plans have to be moved to a correct reference location. At this stage we have to verify the imported geometry, converting any remaining closed polylines to polygons and creating the polygonal objects with holes.

We then add columns to each entity for its descriptive information, like names, codes, land uses or

number of floors. First we transfer the information contained in text layers to the relevant entities,

the most important being any unique ID code that enables linking these entities to urban plan data

available in other formats, like CSV or XLS.

Once we complete the preparation of the features describing the two neighbourhoods, both in terms

of geometry and plan information (Figure 1), we end up with the following data layers in the GIS:


Building: any built-up object, both public and private.

Open space: empty space within blocks, both public and private.

Plot: the legal boundary of a property, containing buildings and open space.

Block: group of plots and private or public open space, forming an island surrounded by the transport network.

Pavement: the public space between the blocks and the roads.

Road centre line: linear representation of the street network.

Figure 1. Plan of the case study areas in the GIS

6 Analysis

In the analysis phase the plan is analysed spatially to extract further attributes and statistically to

evaluate the importance and relation between attributes. We then perform k-means clustering on

the resulting data set to identify typologies of streets and urban blocks.

This is where the statistical data mining occurs, which can be defined in the following terms:

Concepts – Block and street.

Instances – The collection of blocks and streets from the case study.

Attributes – The selected meaningful properties of blocks and streets, both numeric and

nominal.


6.1 Spatial analysis

A GIS is used to perform spatial analysis operations involving geometric and network calculations

on the axial map, as well as simple data filtering and mathematical calculations, to obtain all the

required attributes (Table 1).
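The network attributes in Table 1 (degree for connectivity, closeness for accessibility) can be illustrated on a small axial graph, where each axial line is a node and each intersection is an edge. This is only a sketch: it uses plain topological step distance, the graph is invented, and the measures actually used in the case study may be weighted differently. Betweenness is omitted for brevity.

```python
from collections import deque


def degree(graph, node):
    """Connectivity: number of lines directly intersecting this line."""
    return len(graph[node])


def closeness(graph, node):
    """Closeness as (reached nodes - 1) / sum of shortest step distances."""
    distance = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total else 0.0


# Hypothetical axial graph: line "A" is crossed by "B" and "C";
# "D" is a dead end reachable only from "C".
axial = {"A": {"B", "C"}, "B": {"A"}, "C": {"A", "D"}, "D": {"C"}}
```

In this example the central line "A" has degree 2 and closeness 0.75, while the dead end "D" has degree 1 and closeness 0.5.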

Once this is completed, it is important to visualise the individual attributes of blocks and streets by mapping them: this helps to detect representation mistakes and inconsistencies in the calculations, and is also a first step in becoming familiar with the data (Figure 2).

Figure 2. Maps of block area (a) and block FAR (b)

6.2 Descriptive statistics

At this stage, we explore the data through descriptive statistics and data distributions, clean errors, and examine outliers and missing values. Because most urban spatial attributes do not have a normal distribution, we transform them using log(x), as a normal distribution is expected in most statistical operations.
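The effect of the log transform on a skewed attribute can be checked with a simple skewness measure. The sketch below assumes the natural logarithm (the base is not stated in the text) and uses invented, strongly right-skewed values rather than case-study data.

```python
import math
import statistics


def skewness(values):
    """Population skewness: mean of the cubed standardised deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def log_transform(values):
    """Apply the natural logarithm; values must be strictly positive."""
    return [math.log(v) for v in values]


# Illustrative block areas (m2) spanning several orders of magnitude.
areas = [100.0, 1000.0, 10000.0, 100000.0, 1000000.0]
logged = log_transform(areas)
```

The raw values are clearly right-skewed (skewness above 1), whereas the transformed values are symmetric, so their skewness is effectively zero. Attributes that can be zero or negative would need a different treatment.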

Next, performing pair-wise correlation can help identify and exclude dependent attributes, which

would bias the study towards a specific theme of classification. For example, in our case study we

found a strong correlation between block area and the other dimensions of length, width and

perimeter, which are excluded from the classification process, but are accounted for in other ratios

like proportion and compactness.
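A pair-wise correlation screen of this kind can be sketched with Pearson's r. The threshold of 0.9 and the sample values are assumptions for illustration; the text does not state which cut-off was used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongly_correlated_pairs(table, threshold=0.9):
    """List attribute pairs whose |r| is at or above the threshold."""
    names = sorted(table)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(table[a], table[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Illustrative values only, keyed by the Table 1 codes.
blocks = {
    "TA": [1000.0, 2000.0, 3000.0, 4000.0],
    "PER": [130.0, 180.0, 220.0, 260.0],
    "GSI": [0.5, 0.2, 0.6, 0.3],
}
pairs = strongly_correlated_pairs(blocks)
```

With these invented numbers only the perimeter and area pair is flagged, which mirrors the kind of dependency between block area and its linear dimensions reported for the case study.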

6.3 Clustering

Clustering allows the classification of instances with numeric and nominal attributes in multi-dimensional space where there are no classes beforehand and their number (k) is not known. We apply a classic k-means clustering technique, which is suitable for a small database (in data mining terms) with many outliers. It also gives us the distance of every instance to the centroid of its cluster, which allows us to select the k-medoid (archetype) of each cluster. Other clustering techniques can be used depending on the data set.
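The clustering step and the selection of an archetype per cluster can be sketched as follows. This is a plain Lloyd's k-means with starting centroids supplied by the caller, not the statistical package used in the study; attributes are assumed to be numeric and already scaled to comparable ranges, and the sample points are invented.

```python
import math


def kmeans(points, centroids, iterations=100):
    """Lloyd's k-means from given starting centroids; returns (centroids, labels)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # leave an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def archetypes(points, centroids, labels):
    """Index of the instance closest to each centroid (None for empty clusters)."""
    result = []
    for j, c in enumerate(centroids):
        members = [i for i, label in enumerate(labels) if label == j]
        result.append(min(members, key=lambda i: math.dist(points[i], c)) if members else None)
    return result


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0), (8.2, 8.2)]
centres, labels = kmeans(points, [points[0], points[3]])
typical = archetypes(points, centres, labels)
```

The two groups separate cleanly, and the archetypes are the members nearest each centroid, here the points at index 0 and index 6.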


We produce sets of clusters of block and street typologies of various sizes (k values) according to the notable points in the scree plot (Figure 3), which shows the sum of squared distances in all clusters for each k. The optimum should be where the plot shows a kink; once the plot flattens, additional clusters provide more detail but lower information gain.

Figure 3. Scree plot of block (a) and street (b) clusters, where circled are the selected k numbers

for testing
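The quantity plotted in a scree plot, and one possible way of locating a kink numerically, can be sketched as below. The second-difference heuristic is an assumption made for illustration; in the study the candidate k values were chosen by inspecting the plot. The scree values are invented and assume consecutive k.

```python
def sum_of_squares(points, labels):
    """Total squared distance of every point to its own cluster mean."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    total = 0.0
    for members in clusters.values():
        mean = [sum(axis) / len(members) for axis in zip(*members)]
        total += sum((v - m) ** 2 for p in members for v, m in zip(p, mean))
    return total


def kink(scree):
    """k where the curve bends most sharply (largest second difference).

    Expects a dict {k: sum of squares} with at least three consecutive k values.
    """
    ks = sorted(scree)
    best = max(
        range(1, len(ks) - 1),
        key=lambda i: scree[ks[i - 1]] - 2 * scree[ks[i]] + scree[ks[i + 1]],
    )
    return ks[best]


# Illustrative scree values, not taken from Figure 3.
scree = {1: 100.0, 2: 70.0, 3: 20.0, 4: 16.0, 5: 13.0}
best_k = kink(scree)
```

For these values the sharpest bend is at k = 3. A real scree curve can show several notable points, which is why more than one k was tested for blocks and for streets.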

We visualise each cluster by mapping its elements on the plan (Figure 4), to observe if there are

any clear typologies or known classes being identified. We observe that some cluster sets look like

transitions in the classification not giving obvious typologies, but there are points that produce clear

separations of increasing detail. For blocks we identify 4, 6 and 12 clusters and for streets 4, 8, 10

and 14 as such demarcation points.

Figure 4. Mapping of (a) six block and (b) four street clusters

7 Description


The description phase translates the results into a format that is more useful for the urban design

process, which includes a semantic definition of the emergent clusters of urban typologies. In

addition, the cluster attributes, e.g. length, width, number of floors or density, can serve as inputs

for a parametric rule-based urban design process.

7.1 Statistical profiling of clusters

The block and street attributes, which in most cases are continuous numeric values, e.g. area or

length, should be classified to facilitate the description process. Data discretisation can be achieved

using:

quantiles

normal quantiles

equal intervals

natural breaks

domain knowledge classes when these are known

Ideally there are domain knowledge standard classes, which are more useful in practice because

they are meaningful beyond the data pattern, but in this case we have used quartiles. We then

profile each cluster according to the share that it has of each of the attributes (Figure 5).

Figure 5. Sample of the categorisation charts: profile of street clusters 1 and 2
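Quartile discretisation and cluster profiling can be sketched with the standard library. The class labels other than "very low" are assumptions, as are the sample values and the choice of the inclusive quantile method.

```python
import statistics
from collections import Counter

# Class labels are assumptions; only "very low" is named in the text.
CLASSES = ("very low", "low", "high", "very high")


def quartile_classes(values):
    """Label each value by the quartile of the whole data set it falls in."""
    cuts = statistics.quantiles(values, n=4, method="inclusive")
    # The class index is the number of cut points strictly below the value.
    return [CLASSES[sum(v > cut for cut in cuts)] for v in values]


def cluster_profile(classes, clusters, cluster):
    """Share (%) of each class among the instances of one cluster."""
    members = [c for c, k in zip(classes, clusters) if k == cluster]
    counts = Counter(members)
    return {name: 100.0 * counts[name] / len(members) for name in CLASSES}


# Illustrative attribute values and cluster labels.
values = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
clusters = [1, 1, 1, 2, 2, 2, 2, 2]
classes = quartile_classes(values)
profile_2 = cluster_profile(classes, clusters, 2)
```

In this example cluster 2 has no "very low" instances, 20% "low", and 40% each of "high" and "very high".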

7.2 Semantic description of typologies

For the semantic description of the clusters we focus only on the attributes that have dominant and

unique characteristics, in order to highlight the specificities of each typology instead of the

generalities common to both neighbourhoods. Dominant characteristics have a share of a class

above 70% within that attribute, e.g. 94% of blocks in cluster 3 have an area of public space

classified as very low. Unique characteristics have a share of a class 50% above or below the

average of that class in other clusters, e.g. cluster two has 6% of blocks with the lowest open space

ratio, which is 50% below the average of all other clusters. We present a succinct description of the

six block and four street cluster sets (Table 2), together with a sample of the “archetype” block and

street entities (Figure 6).

Table 2. Sample of block and street typology descriptions

Cluster   Description
Block 1   Closed block, medium density with private courtyard only
Block 2   High density, compactness and pressure on open space
Block 3   Low density with private open space
Block 4   Open block of medium density with privileged public space
Block 5   Open public space with no built area
Block 6   Large, low density block with equipment and associated public space
Street 1  Very low or no continuity and movement flow
Street 2  High connectivity and continuity streets
Street 3  Low continuity streets
Street 4  Long streets with wide pavements and high average of tall buildings

Figure 6. Sample of block (a) and street (b) typology archetypes for the clusters in Table 2.
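The dominant and unique characteristic rules of section 7.2 can be expressed as two small filters over cluster profiles. This sketch reads "50% above or below the average" as a relative difference from the mean share in the other clusters, which is an interpretation; the profile values are invented.

```python
def dominant(profile, threshold=70.0):
    """Classes whose share within the cluster exceeds the threshold (%)."""
    return [name for name, share in profile.items() if share > threshold]


def unique(profiles, cluster, ratio=0.5):
    """Classes whose share differs by at least `ratio` (relative) from the
    average share of that class in all other clusters."""
    others = [p for k, p in profiles.items() if k != cluster]
    result = []
    for name, share in profiles[cluster].items():
        average = sum(p[name] for p in others) / len(others)
        if share == average:
            continue  # identical shares are never unique
        if share >= average * (1 + ratio) or share <= average * (1 - ratio):
            result.append(name)
    return result


# Illustrative shares (%) of two classes of one attribute in three clusters.
profiles = {
    1: {"very low": 94.0, "low": 6.0},
    2: {"very low": 20.0, "low": 80.0},
    3: {"very low": 30.0, "low": 70.0},
}
```

With these numbers cluster 1 has "very low" as a dominant class, and cluster 3 has no dominant class because a share of exactly 70% does not exceed the threshold.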

7.3 Design rule constraints

Another output of this process is a table with the quantitative description of each cluster, in terms of

minimum, maximum, average and standard deviation values of every attribute used, which can

provide useful input to parametric rule-based design.
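Such a table of constraints can be produced by grouping the instances by cluster and summarising each attribute. The function name, the row layout and the sample values below are assumptions; the sample standard deviation is used, and a cluster with a single instance is given a deviation of zero.

```python
import statistics


def cluster_constraints(rows, labels):
    """Minimum, maximum, mean and standard deviation of every attribute, per cluster."""
    grouped = {}
    for row, label in zip(rows, labels):
        for name, value in row.items():
            grouped.setdefault(label, {}).setdefault(name, []).append(value)
    return {
        label: {
            name: {
                "min": min(values),
                "max": max(values),
                "mean": statistics.fmean(values),
                "sd": statistics.stdev(values) if len(values) > 1 else 0.0,
            }
            for name, values in attributes.items()
        }
        for label, attributes in grouped.items()
    }


# Illustrative street instances with two Table 1 attributes.
rows = [
    {"LEN": 100.0, "W": 10.0},
    {"LEN": 120.0, "W": 14.0},
    {"LEN": 300.0, "W": 20.0},
]
constraints = cluster_constraints(rows, [1, 1, 2])
```

The resulting ranges, for example a street length between 100 and 120 m for cluster 1, could then be read as parameter bounds in a rule-based design process.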

8 Results

The results of applying the data mining methodology to the case study are encouraging. By statistically correlating the instances in the clusters with their pre-defined neighbourhood, Expo 98 or Moscavide, we observe the degree to which the clusters are characteristic of a neighbourhood. We find that some clusters clearly correspond to one of the areas, possibly with a few outliers indicating the odd instances of that area (Table 3). The overall R² between clusters and neighbourhoods is 0.67 for the block clusters and 0.58 for the street clusters, where a value of 1 would correspond to complete identity between the two variables.

The clusters with an even share of instances from both areas, e.g. “Block 3” and “Street 3”, tend to get subdivided when the number of clusters is increased, as demonstrated by an R² of 0.8 between four and six block clusters, and 0.87 between four and ten street clusters (Table 3).

Table 3. The percent share of instances from the two neighbourhoods in each cluster.

Cluster     Size   Expo 98 (%)   Moscavide (%)
Block 1       45          0.00          100.00
Block 2       16         93.75            6.25
Block 3       17         52.94           47.06
Block 4       22         90.91            9.09
Block 5        2         50.00           50.00
Block 6        2        100.00            0.00
Street 1      14         28.57           71.43
Street 2      96          3.13           96.88
Street 3      44         63.64           36.36
Street 4      66         95.45            4.55
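The shares in Table 3 follow from a cross-tabulation of cluster membership against neighbourhood. A minimal sketch of that step is given below. The function name `neighbourhood_shares` is hypothetical, and the block counts are back-calculated from the sizes and percentages in Table 3 rather than taken from the original data set.

```python
from collections import Counter

NEIGHBOURHOODS = ("Expo 98", "Moscavide")


def neighbourhood_shares(pairs):
    """Return {cluster: (size, % Expo 98, % Moscavide)}.

    `pairs` is an iterable of (cluster, neighbourhood) tuples, one per instance.
    """
    counts = Counter(pairs)
    clusters = sorted({cluster for cluster, _ in counts})
    shares = {}
    for cluster in clusters:
        size = sum(counts[(cluster, n)] for n in NEIGHBOURHOODS)
        shares[cluster] = (size,) + tuple(
            round(100.0 * counts[(cluster, n)] / size, 2) for n in NEIGHBOURHOODS
        )
    return shares


# Instance counts per cluster as (Expo 98, Moscavide), back-calculated
# from the sizes and percentages in Table 3 (assumption).
block_counts = {
    "Block 1": (0, 45),
    "Block 2": (15, 1),
    "Block 3": (9, 8),
    "Block 4": (20, 2),
    "Block 5": (1, 1),
    "Block 6": (2, 0),
}

# Expand the counts into one (cluster, neighbourhood) pair per instance.
pairs = [
    (cluster, neighbourhood)
    for cluster, per_area in block_counts.items()
    for neighbourhood, count in zip(NEIGHBOURHOODS, per_area)
    for _ in range(count)
]

block_shares = neighbourhood_shares(pairs)
for cluster, (size, expo, moscavide) in block_shares.items():
    print(f"{cluster}: n={size}, Expo 98 {expo:.2f}%, Moscavide {moscavide:.2f}%")
```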

Visual inspection of the cluster instances on the plan (Figure 3) confirms the consistency of the statistics and demonstrates the extent of typological overlap between areas. Some blocks in Moscavide are more recent and correspond to the typologies found in the Expo 98 site, and some street typologies, such as the dead end, are universal and can be found in both areas. Only further clustering of those groups might identify types of dead end unique to one area or the other.

9 Discussion

The emergent classification helped us to identify a series of block and street typologies at various levels of detail that, on expert inspection, correspond to known typologies and highlight the different character of the two neighbourhoods. In doing so, the approach moves away from the one-dimensional classification on a theme criticised by Marshall (2004) and picks out instances that are typical according to different themes, an essential aspect of a less fragmented vision of complex urban environments.

9.1 Methodological issues

However, this context analysis methodology can be demanding in the preparation phase if an adequate GIS database is not available, and it requires a high level of GIS and statistics expertise in the analysis phase.

GIS vector data sets with the architectural level of detail required for urban morphology studies, which go down to the building scale, are usually not available for urban areas. They are becoming more common with the introduction of GIS in local authorities, the upgrade of national databases, as is the case with the UK Ordnance Survey’s “MasterMap” data set, and the increasing availability of public domain GIS data, for example OpenStreetMap (http://www.openstreetmap.org/, accessed April 2010). For the required level of detail one often has to resort to CAD drawings, complemented by extra survey or plan information in text format.

Many operations are needed to convert a CAD drawing into a GIS urban topology for analysis, partly due to inconsistencies in the original plan representation. These include defining the entities’ geometry clearly and grouping each type in separate layers. Even defining the entities’ topology can be problematic in certain urban areas, for example where the boundary between street, public space and private space is unclear.

Furthermore, there needs to be clear agreement on the selection of attributes and on how they should be calculated; otherwise one risks ending up with typologies that are of little use for urban regulations or design operations. For a complete appraisal of this matter we would need a more complete data set with attributes for other types of indicators relating to socio-economic conditions, in particular land use, population size and type.

Still, this methodology offers insights into the complex nature of urban environments that could not be obtained manually, by identifying multivariate typologies within large numbers of instances. Also, because it uses systematic statistical operations, the process can be applied consistently to different sites by different people, avoiding much of the subjective nature of urban environment profiling.


9.2 Further work

With such a methodology we offer a process for producing a context-sensitive sample of typologies, which can be used as precedents for an urban design program. Furthermore, when integrated in a parametric rule-based urban design system, it provides useful quantitative rule constraints that direct the system towards solutions that fit the context or an urban program of sustainable development. Both of these scenarios need to be formally tested by using these outcomes in a formulation and generation process.

Two other topics deserve consideration. In this example we assume that all attributes are of equal importance, but using domain knowledge one could assign weights to the attributes to better address a specific problem. The danger is that this biases the clustering results towards some misconception, reducing the emergent quality of the process.
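To make the idea of attribute weighting concrete, the sketch below shows one way weights could enter a distance-based clustering: as multipliers in a weighted Euclidean distance between standardised attribute vectors. This is an assumption for illustration, not the procedure used in the case study. The function name, the weights and the sample vectors are hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two attribute vectors of equal length."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("vectors and weights must have the same length")
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


# Two hypothetical instances described by two standardised attributes.
block_a = (0.0, 0.0)
block_b = (3.0, 4.0)

# Equal weights: every attribute contributes equally (the unweighted case).
equal = weighted_distance(block_a, block_b, (1.0, 1.0))

# Domain-knowledge weights (hypothetical): only the first attribute counts.
emphasised = weighted_distance(block_a, block_b, (4.0, 0.0))

print(equal, emphasised)
```

With equal weights the measure reduces to the ordinary Euclidean distance. Any other choice of weights encodes a prior judgement about which attributes matter, which is where the risk of bias noted above arises.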

On the other hand, by not filtering attributes for the site or for the specific problem, we would obtain more universal results and could consider building a catalogue of precedents. But would it actually be useful for design practice? Many criticisms exist of standard catalogues and design codes. Ultimately, by following this data mining process, we are acknowledging that urban interventions deal with complexity and are problem and site specific.

Finally, further work using a similar approach with more detailed information on other aspects of the public space and their relation to the blocks, for instance ground floor transparency, land use and permeability, may be able to give a deeper insight into urban typologies and neighbourhoods, and inform the development of new urban patterns.

10 Conclusion

In this paper we presented a data mining methodology that, applied to urban environment data sets, is capable of identifying typologies from the site context during the pre-design phase and is useful in defining values for parametric rule-based design. In doing so it addresses the complex multi-scale and multi-level nature of urban environments, binding qualitative and quantitative requirements.

While descriptive statistics facilitate the general description of a neighbourhood as a whole, according to socio-economic, morphological and network layout information, data mining techniques make it possible to classify the elements of those neighbourhoods according to multiple qualitative and quantitative attributes simultaneously. This provides a more detailed profiling of the character of a neighbourhood, which facilitates the understanding of the site context or of a plan’s design rules and constraints.

References

Alexander, C., Ishikawa, S. and Silverstein, M. (1977), A pattern language, Oxford University Press.

Berghauser Pont, M. and Haupt, P. (2004), Spacemate: the spatial logic of urban density, Delft University Press Science, Delft.


Fayyad, U.M., Piatetsky-Shapiro, G. and Smyth, P. (1996), “From data mining to knowledge discovery: an overview”, in Advances in knowledge discovery and data mining, American Association for Artificial Intelligence: 1-34.

Hanna, S. (2007), “Automated Representation of Style by Feature Space Archetypes: Distinguishing Spatial Styles from Generative Rule”, International Journal of Architectural Computing, 5(1): 2-23.

Higueras, E. (2006), Urbanismo bioclimático, Gustavo Gili, Barcelona.

Hillier, B. and Iida, S. (2005), “Network effects and psychological effects: a theory of urban movement”, in Proceedings of the 5th International Symposium on Space Syntax, TU Delft, Delft: 553-564.

Laskari, A. (2007), Urban identity through quantifiable spatial attributes: Coherence and dispersion of local identity through the comparative analysis of building block plans, MSc Thesis, UCL, London.

Lynch, K. (1960), The image of the city, MIT Press.

Marshall, S. (2004), Streets and Patterns: The Structure of Urban Geometry, Routledge, London.

Montenegro, N. and Duarte, J.P. (2008), “Towards a Computational Description of Urban Patterns: An urban formulation ontology”, in Proceedings of the 26th Conference on Education of Computer Aided Architectural Design in Europe, Antwerpen.

Rapoport, A. (1990), History and Precedent in Environmental Design, Springer.

Reffat, R. (2008), “Investigating Patterns of Contemporary Architecture using Data Mining Techniques”, in Proceedings of the 26th Conference on Education of Computer Aided Architectural Design in Europe, Antwerpen: 601-608.

Urhahn, G. and Bobic, M. (1994), A pattern image: a typological tool for quality in urban planning, Thoth Publishers, Bussum, The Netherlands.

Witten, I.H. and Frank, E. (2005), Data Mining: Practical Machine Learning Tools and Techniques, Second Edition, Morgan Kaufmann Publishers Inc, San Francisco.

Acknowledgements

The City Induction research project is supported by Fundação para a Ciência e Tecnologia (FCT), Portugal,

hosted by ICIST at the Technical University of Lisbon (PTDC/AUR/64384/2006) and coordinated by Prof. José

Pinto Duarte. J. Gil, N. Montenegro and J.N. Beirão are responsible for the evaluation, formulation and

generation modules, respectively.

J. Gil is funded by FCT with grant SFRH/BD/46709/2008. N. Montenegro is funded by FCT with grant

SFRH/BD/45520/2008. J.N. Beirão is funded by FCT with grant SFRH/BD/39034/2007.

The spatial network of the study area and its surroundings was taken from the “axial map” of the Lisbon region

with permission from its author, João Pinelo, University College London (UCL).

The following academic software was used in the described methodology: Confeego 2.0 from Space Syntax

Limited and UCL Depthmap 8.15 by Alasdair Turner, UCL.