Government 3.0 The Tools: Big Data and Open Data 1 Michael Holland February 27, 2013
Transcript
Page 1: Michael holland ppt

Government 3.0
The Tools: Big Data and Open Data

Michael Holland
February 27, 2013

Page 2: Michael holland ppt

The CUSP Partnership

• The University Partners:
– NYU, NYU-Poly, Univ. of Toronto, Warwick University, CUNY, IIT-Bombay, Carnegie Mellon University
• The Industrial Partners:
– IBM, Cisco, Xerox, ConEdison, [Lutron,] National Grid, Siemens, ARUP, IDEO, AECOM
• City and State Agency Partners:
– NYC Agencies, MTA, Port Authority
• National Laboratories:
– [Lawrence Livermore National Laboratory, Los Alamos National Laboratory, Sandia National Laboratories, Brookhaven National Laboratory]

A diverse set of other organizations has expressed interest in joining the partnership

Page 3: Michael holland ppt

Big data can be brought to bear on societal issues

• Sensing/transmission/storage/analysis capabilities growing rapidly
• How can you “instrument society”?
• What do you want to know?
• How can you find out?
• What could you do with the information?
– Descriptive, predictive
• Greenhouse Gas Treaty Verification methodology is an example of this
• Fuse surveys, direct measurements, proxies to independently verify GHG emissions
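As a rough illustration of what "fusing" several independent estimates might look like, the sketch below combines them by inverse-variance weighting and checks a reported figure against the result. The slides do not say which fusion method is used; the weighting scheme, the function names, the two-standard-error threshold, and all numbers are assumptions chosen only for illustration.

```python
from math import sqrt


def fuse_estimates(estimates):
    """Combine independent (value, standard_error) pairs by inverse-variance weighting.

    Returns the fused value and its standard error. Assumes the estimates
    are independent and unbiased, which real survey/proxy data may not be.
    """
    weights = [1.0 / (se ** 2) for _, se in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, sqrt(1.0 / total)


def consistent(reported, fused_value, fused_se, k=2.0):
    """True if a reported figure lies within k standard errors of the fused estimate."""
    return abs(reported - fused_value) <= k * fused_se


# Hypothetical emission estimates (arbitrary units) with standard errors.
sources = {
    "survey": (100.0, 10.0),
    "direct_measurement": (110.0, 5.0),
    "proxy": (90.0, 20.0),
}

fused_value, fused_se = fuse_estimates(list(sources.values()))
print(f"fused estimate: {fused_value:.1f} +/- {fused_se:.1f}")
print("reported 105 consistent?", consistent(105.0, fused_value, fused_se))
print("reported 80 consistent?", consistent(80.0, fused_value, fused_se))
```

The more precise a source is, the more it pulls the fused value toward itself, which is why the direct measurement dominates in this made-up example.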

Page 4: Michael holland ppt

What does it mean to instrument a city?

Properly acquired, integrated, and analyzed, data can
• Take government beyond imperfect understanding
– Better (and more efficient) operations, better planning, better policy
• Improve governance and citizen engagement
• Enable the private sector to develop new services for governments, firms, citizens
• Enable a revolution in the social sciences

Environment: Meteorology, pollution, noise, flora, fauna

People: Relationships, location, economic/communications activities, health, nutrition, opinions, …

Infrastructure: Condition, operations

Page 5: Michael holland ppt

Urban Data Sources

• Organic data flows
– Administrative records (census, permits, …)
– Transactions (sales, communications, …)
– Operational (traffic, transit, utilities, health system, …)
• Sensors
– Personal (location, activity, physiological)
– Fixed in situ sensors
– Crowd sourcing (mobile phones, …)
– Choke points (people, vehicles)
• Opportunities for “novel” sensor technologies
– Visible, infrared and spectral imagery
– RADAR, LIDAR
– Gravity and magnetic
– Seismic, acoustic
– Ionizing radiation, biological, chemical
– …

Page 6: Michael holland ppt

311 Noise Report Density

Page 7: Michael holland ppt

Building Energy Use

[Figure: two histograms, "Source EUI, Multi-Family Buildings" and "Source EUI, Office Buildings". x-axis: Current Weather Normalized Source Energy Intensity (kBtu/Sq. Ft.), 0 to 500; y-axis: Percent, 0 to 10.]

D. Hsu and C. Kontokosta, NYC Local Law 84 Benchmarking Report, 2012

Page 8: Michael holland ppt

Some Sensor Stats: United States

• 300 million mobile phones; 494,151 cell towers
• Approximately 400,000 ATMs record video of all transactions
• 30 million commercial surveillance cameras
• 4,214 red-light cameras; 761 speed-trap cameras
• A third of large police forces equip patrol cars with automatic license plate-readers that can check 1,000 plates per minute

Source: Wall Street Journal (January 3, 2013) – “In Privacy Wars, It’s iSpy vs. gSpy”

Page 9: Michael holland ppt

Visualization of TLC GPS Data

[Figure panels: Drop-off, Pick-up]

Most drop-offs occur on the avenues, most pick-ups on the streets

Lauro Lins, Fernando Chirigati, Nivan Ferreira, Claudio Silva and Juliana Freire - NYU-Poly (Data obtained from TLC on June 6th, 2012)

Page 10: Michael holland ppt

May 1st – 7th 2011

3.6 Million Trips

Train Stations

Airports

Studying Taxi Patterns

Page 11: Michael holland ppt

Wang, P., Hunter, T., Bayen, A.M., Schechtner, K. & Gonzalez, M.C. Understanding Road Usage Patterns in Urban Areas. Nature, Sci. Rep. 2, 1001 (2012). DOI: 10.1038/srep01001

Cell Tower Records for Traffic Analysis

Page 12: Michael holland ppt

Urban Observatory

• Provisioned urban vantage point(s)
– MetroTech (1 MT and 388 Bridge St)
– 277 Park Ave (at 47th Street)
– Governor's Island
• Suite of bore-sighted instruments
– Photometric and colorimetric optical imaging
– Broad-band IR imaging (SWIR, MWIR, and thermal?)
– Hyperspectral imaging (trace gases)
– LIDAR (building motions, pollution)
– Radar (building/street vibrations, building motion, traffic flow)
• Correlative data on the urban scenes
– Meteorology (temperature, winds, visibility)
– Scene geometry (distances, directions, identities of features visible)
– Parcel and land use data, building characteristics and activities, building utility consumptions, and real estate valuation data
– In situ pollution data and location/nature of major sources
– In situ vehicle and pedestrian traffic for the streets visible
– Demographic and economic data
• Capability to archive, process, and analyze data acquired
– Image processing chains
– Data warehouse, GIS, Visualization tools
– Software and procedures to enhance privacy protection
• Personnel and funding to create and operate the above

Page 13: Michael holland ppt

Looking South from the Empire State Building

Page 14: Michael holland ppt

Manhattan in the Thermal IR

Photo by Tyrone Turner/National Geographic

Other synoptic modalities: Hyperspectral, RADAR, LIDAR, Gravity, Magnetic, …

199 Water Street
Built 1993 :: 998,000 sq ft
electricity, natural gas, steam
LEED Certified

Page 15: Michael holland ppt

Quantified Community

• Fully instrument a slice of the city
– 10-100k people within 20 blocks of MetroTech or a new development
– Create a well-characterized test bed for technologies/policies and behavioral interventions
• What constitutes “complete instrumentation”?
– In situ vs. choke points vs. synoptic?
– Acoustic/traffic/mobile phones/video/IR/magnetic/CBRN/…
– Economic data? Physiological data? Nutrition? …
• How to fully engage people who live/work in the community to provide data, participate in citizen science, create educational opportunities, …?
– Foster improved quality of life: “cleanest/greenest/healthiest/most livable/…”
– “I’ll show you the parking spaces …”
– ???
• What might we expect to learn?

Page 16: Michael holland ppt

What can cities do with the data?

• Optimize operations
– traffic flow, utility loads, services delivery, …
• Monitor infrastructure conditions
– bridges, potholes, leaks, …
• Infrastructure planning
– zoning, public transit, utilities
• Improve regulatory compliance (“nudges”, efficient enforcement)
• Public health
– Nutrition, epidemiology, environmental impacts
• Abnormal conditions
– Hazard detection, emergency management
• Data-driven formulation of data-driven policies and investments
– Road pricing and congestion charging, time-of-day power, …
• Better inform the citizenry
• Enhance economic performance and competitiveness

Page 17: Michael holland ppt

Among the projects we’re considering

• Normalization, interoperability of city data sets
• 3D Urban GIS capability
• Multi-data correlations to improve city resource allocation
• Noise / Temperature / Pollution
• Mobility
• Novel sensing of public health
• Building efficiency
• Living Lab definition

Page 18: Michael holland ppt

Privacy Issues

• Privacy issues are structural - you can’t study society without studying people at some level
• People will voluntarily give up their data if they can see a personal or societal benefit
– Social networks, voltstats.net, …
• Norms/expectations are changing with generations
• There are technical fixes for multi-level privacy/classification
• Privacy is eroding in any event and we should do our best to ensure it is done sensibly
• We don’t yet know what the optimal level of privacy is for studies of interest

Page 19: Michael holland ppt

An Ex-Oversight Staffer’s Opinions about “Data” in an Agency Context

Page 20: Michael holland ppt

Research Program (Competitive)

Agency (Corporate)

Political (Macro)

Society

Disciplines

Societal Demands

Defense, Energy, Economic Security, Health, Environment, Food/Water, Discovery

VALUE

Scientific Opportunities

AMO, bio, nano, NP, EPP, Astro

cosmology

MERIT

Context, Context, Context

Page 21: Michael holland ppt

One Systematic Evaluation Process: OMB/OSTP R&D Investment Criteria

Criteria: Quality, Relevance, Performance

Prospective
– Quality: [1] Mechanism of Award (e.g., 10 CFR 605); [2] Justification of funding distribution among classes of performers
– Relevance: Planning & Prioritization: Strategy
– Performance: “Top N” Milestones (5 < N < 10)

Retrospective
– Quality: [1] Expert reviews of successes and failures; [2] Information on major awards
– Relevance: Evaluation of utility of R&D results to both field and broader “users”
– Performance: Report on “Top N” Milestones; GPRA-style Annual Metrics

Advisory Committees & NAS

Page 22: Michael holland ppt
Page 23: Michael holland ppt

Roles of “Data”

• Scientific Understanding: Data improves unbiased explanation of natural or social phenomena

• Administrative Action: Data ensures that Agencies transparently exercise their delegated authorities in a fashion that is not "arbitrary and capricious, an abuse of discretion, or otherwise not in accordance with the law."

• Legal or Political Action: Data as a tool for adjudicating disputes, i.e., winning contests and seeing one’s priorities implemented.

Page 24: Michael holland ppt

Is USG Robust Against “Big Data?”

[T]he median Congressional district is now about five points Republican-leaning relative to the country as a whole. Why this asymmetry? It’s partly because Republicans created boundaries efficiently in redistricting and partly because the most Democratic districts in the country, like those in urban portions of New York or Chicago, are even more Democratic than the reddest districts of the country are Republican, meaning there are fewer Democratic voters remaining to distribute to swing districts.

“As Swing Districts Dwindle, Can a Divided House Stand?” Nate Silver, NYT, Dec 27, 2012
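The arithmetic behind the quoted asymmetry can be sketched with a toy example: when one party's voters are concentrated in a few lopsided districts, the median district leans away from that party even if the overall vote is evenly split. The ten districts below, their vote shares, and the equal-population assumption are all hypothetical and are not taken from the article.

```python
from statistics import mean, median

# Hypothetical Democratic two-party vote share (percent) in ten
# equal-population districts. Three urban districts are heavily
# Democratic; the remaining seven lean modestly Republican.
dem_share = [70, 65, 60, 45, 45, 45, 45, 45, 40, 40]

# With equal-population districts, the overall share is the simple mean.
national_share = mean(dem_share)

# The median district is the one that decides control of the chamber.
median_share = median(dem_share)

# Negative lean means the median district is more Republican than the whole.
lean = median_share - national_share

# Seats the Democratic side wins outright (share above 50 percent).
dem_seats = sum(1 for share in dem_share if share > 50)

print(f"overall Democratic share: {national_share}%")
print(f"median district share:    {median_share}%")
print(f"median district lean:     {lean} points")
print(f"Democratic seats:         {dem_seats} of {len(dem_share)}")
```

In this made-up case an even overall split still yields a median district five points to the Republican side, because the surplus Democratic votes in the three lopsided districts are unavailable elsewhere.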

Page 25: Michael holland ppt

Discussion

http://cusp.nyu.edu/ NYUCUSP

@NYU-CUSP