NSF Funding of LT resources
Tanya Korelsky, Program Director
Robust Intelligence Cluster
Division of Information and Intelligent Systems
Directorate for Computer and Information Science and Engineering
National Science Foundation
[email protected]
www.nsf.gov/
How NSF is organized
Biological Sciences
Computer and Information Science and Engineering
Education and Human Resources
Engineering
Geosciences
Mathematical and Physical Sciences
Social, Behavioral and Economic Sciences
Office of the Director
How CISE is organized
Office of the Assistant Director for CISE
CCF: Computing and Communications Foundations
CNS: Computer and Network Systems
IIS: Information and Intelligent Systems
OCI: Office of Cyberinfrastructure (formerly SCI, now with NSF-wide mission, reporting to the Director of NSF)
Office of the Director
Clusters
Crosscutting Emphasis Areas
Funding Rate for Competitive Awards in CISE
[Chart: number of competitive proposal actions and competitive awards (left axis, 0 to 7,000) and funding rate (right axis, 0% to 100%), by fiscal year, 1994 to 2004]
CISE Proposal/Award Statistics
FY    Proposals  Awards  Funding Rate  CGIs   Supplements
2005  4,962      1,086   23%           1,398  581
2004  6,266      1,017   16%           1,297  400
2003  5,346      1,174   22%           1,023  354
2002  4,314      1,038   24%           918    308
2001  3,579      885     25%           768    231
2000  2,853      903     32%           547    210
1999  2,209      746     34%           493    301
1998  1,885      667     35%           476    211
1997  1,894      684     36%           527    219
1996  1,760      601     34%           610    183
1995  1,941      708     36%           631    215
*ADJUSTED
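The funding-rate column in the table above can be recomputed from the proposal and award counts. The sketch below assumes the rate is simply awards divided by proposals, which the slide does not state explicitly. The name `funding_rate` and the list `CISE_STATS` are illustrative only. Under that assumption most rows round to the listed percentage, while 2005 comes out near 21.9% rather than the listed 23%. The "*ADJUSTED" note may be related, but the slide does not say.

```python
# (fiscal year, proposals, awards), copied from the CISE table above
CISE_STATS = [
    (2005, 4962, 1086),
    (2004, 6266, 1017),
    (2003, 5346, 1174),
    (2002, 4314, 1038),
    (2001, 3579, 885),
    (2000, 2853, 903),
    (1999, 2209, 746),
    (1998, 1885, 667),
    (1997, 1894, 684),
    (1996, 1760, 601),
    (1995, 1941, 708),
]


def funding_rate(proposals: int, awards: int) -> float:
    """Awards as a percentage of proposals (assumed definition)."""
    return 100.0 * awards / proposals


if __name__ == "__main__":
    for fy, proposals, awards in CISE_STATS:
        print(f"{fy}: {funding_rate(proposals, awards):.1f}%")
```

Read this way, proposals more than tripled between 1995 and 2004 while awards grew by less than half, which is what pushes the rate from the mid-30s down to 16%.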
CISE Budget: 2003-2007
[Bar chart: CISE budget in millions of dollars by fiscal year, 2003 through the 2007 request; labeled values include $496M and $527M]
Requested 6.1% increase includes $20M for cybersecurity and $10M for GENI
The Human Language and Communication Program (HLC)
Initiated by Dr. Mary Harper
The HLC program emphasizes innovative advances in computer and information sciences relating to all forms of human communication.
High-level human communication topics:
  Text Processing
  Speech Processing
  Multimodal Communication Processing
HLC is attempting to strengthen current research while broadening future research directions of the language processing research community (e.g., multimodal communication).
HLC/ITR LT recent resource, annotation and evaluation metrics awards
ITR ’03: Collaborative effort on Interlingual Annotation
HLC ’04: Constructing an Enhanced Version of WordNet, $100K (12 months)
HLC ’05:
  Rapid Development of Frame Semantic lexicon, to ICSI, UC Berkeley, $400K (36 months)
  SGER: Learning Syntax-based Evaluation Metrics for Machine Translation, Dr. Rebecca Hwa, University of Pittsburgh, $200K (24 months)
  A Framework for Learning High Accuracy Evaluation Metrics for NLP Applications, Dr. Alon Lavie, CMU, $150K (24 months)
CISE CRI (Computing Research Infrastructure) Program
Funds community resources for IIS programs; reviewers are supplied by the technical program directors
’04 LT resource planning award to Vassar College: An Open Linguistic Infrastructure for American English, $50K (12 months)
’05 LT resource/annotation awards:
  Towards a Comprehensive Linguistic Annotation of Language (Brandeis, UColorado, Pitt, Penn, NYU), $850K (24 months); goals include achieving an international consensus on a meta-specification framework
  Another planning award ($100K) to Vassar College and Princeton University: An Open Linguistic Infrastructure for American English; goals include annotation of semantic categories using WordNet and FrameNet
Information and Intelligent Systems Reorganization into Clusters
Robust Intelligence
Artificial Intelligence, Human Language and Communication, Robotics, Computer Vision, Computational Neuroscience
Human-centered Computing
Human Computer Interaction, Social Informatics, Universal Access
Information Integration and Informatics
Data, Information, and Knowledge Management; Information Integration; Science and Engineering Informatics; Digital Libraries; Digital Government
Information and Intelligent Systems
New Cluster-oriented Solicitation
Scheduled to be published in May, with a submission deadline in late October to early November
One of the cross-cutting threads: Human-Robot Interaction
Implications for the HLC area: renewed attention to dialogue (human-human, machine-human); ASR of imperfect and affected speech; speech-to-concept understanding; concept-to-speech generation
Need corpora to support these research areas!
One Small Current Effort
SGER (Small Grant for Exploratory Research)
Creation of a Goal-Oriented, Human-Machine Spoken Corpus
ICSI (UC Berkeley), Dr. Dilek Hakkani-Tur
Building a spoken mixed-initiative dialogue system for conference services
Deploying the system for the IEEE SLT Workshop (December 2006)
Collecting and annotating the dialogue corpus
Digital Tools Summit at Michigan State University (June 2006)
Funded jointly by the Linguistics Program and (former) HLC program
Addresses a functionality gap between the tools that documentary linguists and typologists need and the ability of existing tools to annotate partially-understood linguistic data
Existing methods and tools presuppose a regularized digital corpus of a well-understood language and require a high degree of computational sophistication
Aims to develop a roadmap for creating regional and national language archives and the tools to achieve it
Brings together theoretical computational linguists and “data-driven” linguists to brainstorm the challenging issues
NSF perspective on funding LT resources
New corpora for dialogue research
New corpora for ASR research:
  mixed language (English-Spanish)
  affected speech (911 calls); senior speech
New general corpora (ANC), both text and speech
Dependency treebanks and parsers
Harmonization of existing semantic resources (WordNet and FrameNet)
Basic research on semantic annotation: ambivalent attitude to standardization
NSF perspective on funding LT resources (international resources)
Parallel corpora for new MT research on statistical methods applied to syntactic and semantic representations
Research on MT for minority languages (pending award to CMU for Inupiaq and Aymara)
Corpora for research on language identification
International collaboration on speech processing (NYU-EBIRE-CNRS) and on unified linguistic annotation
International workshop on dependency representations (2007 ACL in Prague)
Thank you
Tanya Korelsky
Robust Intelligence
Human Language and Communication
Division of Information and Intelligent Systems
Directorate for Computer and Information Science and Engineering
National Science Foundation
[email protected]
www.nsf.gov/
Digital Living 2010
People across the globe will have access to each other and information provided by pervasive devices, embedded sensors and systems because all will be connected to the Internet.
Home Computer
PDA
Telephone
Entertainment Systems
Car
Surveillance and Security (at home, work, or in public)
Building Automation
Banking and Commerce
Photography
Home Appliances
Games
Inventory/Sales tracking
Health/Medical
Communications
Thanks to David Kotz at Dartmouth
Global Environment for Networking Innovations (GENI)
Limitations of the Internet
Security mechanisms not included in the IP layer
End-to-end robustness cannot be assumed or assured
Scaling limitations
Quality of service mechanisms have not diffused widely in the public Internet
Support for new technologies difficult (e.g., wireless, mobility, sensors)
Global Environment for Networking Innovations
New networking and distributed system architectures
Build in security and robustness
Enabling pervasive computing, bridging the gap between the physical and virtual worlds by including mobile, wireless and sensor networks
Enable control and management of other critical infrastructures
Include ease of operation and usability
New classes of societal-level services and applications
Global Environment for Networking Innovations
Research Program
Supports research, design, and development of new networking and distributed systems
Builds on many years of knowledge and experience, but reexamines all networking assumptions and reinvents where needed
Design for intended capabilities; deploy and validate architectures; build new services and applications
Encourage users to participate in experimentation
Take a system-wide approach to the synthesis of new architectures
Global Environment for Networking Innovations
Facility
Shared use through slicing and virtualization (where "slice" denotes the subset of resources bound to a particular experiment)
Access to physical facilities through programmable platforms (e.g., via customized protocol stacks)
Large-scale user participation by "user opt-in" and IP tunnels
Protection and collaboration among researchers by controlled isolation and connection among slices
A broad range of investigations using new classes of platforms and networks, a variety of access circuits and technologies, and global control and management software
Interconnection of independent facilities via federated design.
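The notion of a "slice" as the subset of resources bound to one experiment, with controlled isolation and connection among slices, can be pictured with a toy model. The sketch below is only an illustration under simplifying assumptions and is not the GENI design. The class `Facility`, its methods, and the resource names are hypothetical. Each resource is treated as an already-virtualized unit that belongs to at most one slice, and slices are isolated unless explicitly connected.

```python
from dataclasses import dataclass, field


@dataclass
class Facility:
    """Toy shared facility (hypothetical): a pool of resources handed out as slices."""
    resources: set[str]
    slices: dict[str, set[str]] = field(default_factory=dict)  # experiment -> its slice
    links: set[frozenset[str]] = field(default_factory=set)    # explicitly connected slice pairs

    def free(self) -> set[str]:
        """Resources not yet bound to any experiment."""
        return self.resources - set().union(*self.slices.values())

    def create_slice(self, experiment: str, wanted: set[str]) -> set[str]:
        """Bind the wanted resources to an experiment; refuse any overlap with other slices."""
        if experiment in self.slices:
            raise ValueError(f"{experiment} already has a slice")
        if not wanted <= self.free():
            raise ValueError("some requested resources are unknown or already bound")
        self.slices[experiment] = set(wanted)
        return self.slices[experiment]

    def connect(self, a: str, b: str) -> None:
        """Allow two existing slices to exchange traffic (controlled connection)."""
        if a not in self.slices or b not in self.slices:
            raise KeyError("both experiments must have a slice")
        self.links.add(frozenset((a, b)))

    def can_communicate(self, a: str, b: str) -> bool:
        """Slices are isolated by default; only explicit links open a path."""
        return a == b or frozenset((a, b)) in self.links
```

For example, two experiments given disjoint slices cannot reach each other until `connect` is called for that pair, which mirrors the "controlled isolation and connection" bullet in a very simplified form.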
Global Environment for Networking Innovations
Outreach
CISE has supported numerous community workshops in support of GENI
CISE is supporting on-going planning efforts, including needs assessment and requirements for the GENI Facility.
CISE will hold town meetings and continue to support future workshops to broaden community participation.
CISE will work with industry, other US agencies, and international groups to broaden participation in GENI beyond NSF and the US government.