••• 1 Roberto Cencioni Kimmo Rossi Challenge 2 – Objective 2.2 Language based Interaction DG Information Society and Media Unit INFSO.E1 Language Technologies & Machine Translation [email protected] ICT 2008, Lyon, 26 Nov 08

Mar 27, 2015


Justin Newman

••• 1

Roberto Cencioni, Kimmo Rossi

Challenge 2 – Objective 2.2

Language based Interaction

DG Information Society and Media

Unit INFSO.E1 Language Technologies & Machine Translation

[email protected]

ICT 2008, Lyon, 26 Nov 08


••• 2

Outline

• Opening remarks

• FP7 ICT Call 4 – Essence

• FP7 ICT Call 4 – Ingredients

• Q&A

• CIP ICT-PSP Call 3 – Opportunities

• Q&A, close


••• 3

Here we are

• a new unit established in July 2008

– Language Technologies & Machine Translation (INFSO.E1)

– high expectations vs. low rate of EC S&T activity in the last few years

• language is everywhere

– written & spoken; documents, messages, databases, webpages, multimedia objects etc; information as well as meta-information

• but our resources are limited, so initial focus on

– multilingual technologies, services, applications

• two instruments in 2009:

– Research: FP7 ICT, call 4, Objective 2.2 – Language based Interaction

– Innovation: CIP ICT-PSP, call 3, Theme 5 – Multilingual Web

• total budget of 40 Meuro


••• 4

• Why?

– new online paradigms centred around communication, collaboration, co-creation … but significant language barriers remain

– EU comprises 27 countries & 23 official languages

– single European Information Space – one of the i2010 objectives

– EC communication on Multilingualism (Sept ‘08) calls for a broader policy framework & joint action

• Purpose: support & enhance

– interpersonal & business communication

– information access & publishing

across languages

Baseline


••• 5

A few facts

• EU official languages: 23 x 22 = 506 pairs

– EC MT (Systran core engine) has 18 pairs in operation & 10 more pairs at prototype stage

– 60+ national, regional & minority languages within the EU

• English accounts for 30% of today’s Web content

– 50% in 2000, 35% in 2004

– Arabic, Chinese, Portuguese … growing very fast

• nearly 1.5 billion internet users worldwide (2008)

– c. 320 million native EN speakers in the world

• basic requirements for the “digital translation market”:

– volume

– access

– personalisation

real quick, real cheap
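
The pair count on this slide follows from treating translation as directional: each of the 23 official languages can be the source for each of the other 22. A minimal Python sketch of that arithmetic, using only the figures quoted above (the variable names are illustrative and do not come from any EC system):

```python
# Figures taken from the slide; names are illustrative only.
official_languages = 23
pairs_in_operation = 18   # EC MT pairs in operation
pairs_at_prototype = 10   # EC MT pairs at prototype stage

# Directed pairs: every language as source, every other language as target.
directed_pairs = official_languages * (official_languages - 1)

# Share of directed pairs covered if prototype pairs are counted as well.
covered = pairs_in_operation + pairs_at_prototype
coverage_share = covered / directed_pairs

print(directed_pairs)            # 506
print(f"{coverage_share:.1%}")   # 5.5%
```

Even counting the prototype pairs, the existing service would cover only a small fraction of the 506 directed pairs, which is the gap the slide points at.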


••• 6

Can’t this be done elsewhere?

• indeed EU RTD projects often exhibit multilingual features

• yet approaches are too often naïve, short term, sectoral

• hence a dedicated focal point

– stimulating upstream research

– enhancing research capacity

– thus enabling more ambitious & impactful domain specific actions


••• 7

Research vs. Innovation: division of labour

• from

– long term foundational research (FP7)

• through

– applied research & technology development (FP7)

• to

– integration & demonstration (FP7 + PSP)

– infrastructure & resources (FP7 + PSP)

• different scale of ambition (€)

• different level of maturity (technology → service)

• different timescales & partnerships


••• 8

FP7-ICT Call

I. Workprogramme: R&D topics & outcomes


••• 9

What technology can offer today

• machine translation & translation memory

– making sense of online content

– improving productivity of human translation

– automatic translation of “acceptable” quality in specific domains / language pairs

• information search & retrieval

– find relevant information across languages

• information extraction, filtering, categorisation

– incl. summarization, routing & alert services, …

– for a variety of purposes eg business intelligence

• speech technology

– command & control, dictation systems

– call center services, conversational systems


••• 10

Trends

• new requirements, new approaches

– from Web 1.X to Web 2.0 – we are all content producers

– from static & uni-directional to dynamic, volatile, collaborative

– from service to self-service, translations are needed “on the fly”

are language technologies up to the task?

• what happens to online content

– disappearing document?

– Europa website: 6 million “documents”

– elusive distinction between content & service

how to manage effectively multilingual content

• multilingualism on the rise

– in the EU (from 4 to 23 languages) and globally

– English gains ground but mother tongues remain

online content becomes even more multilingual


••• 11

What technology might offer tomorrow

• machine translation

– MT that learns from its mistakes

– embedded in products/services, can cover any use context esp. online: chats, blogs, dynamic content ...

– broader coverage, fill in missing languages

• information search & retrieval

– truly multilingual access to information: query in any language, content automatically translated

• website content development & management

– new content is translated automatically

– changes automatically applied in all language versions

• speech technology

– real-time speech-to-speech translation (eg phone call, in a conference)


••• 12

Challenges for MT

• bring MT to the users

– understand what users need

– novel use scenarios

– communication rather than translation

– better evaluation metrics

• MT that learns & adapts

– how to exploit feedback from users

– how to use readily available “world knowledge”

• towards a paradigm shift?

– inspiration from:

machine learning, cognitive systems, psycho-linguistics, sociology, semantic web, data mining, new computing paradigms ...


••• 13

a) Core research exploring new avenues for machine translation (IP)

ground breaking, multidisciplinary, high risk – high promise research

architectures & technologies that learn and adapt flexibly & effectively to different languages, domains & tasks

catering for new forms of language & communication (eg online communities; dynamic, volatile …)

b) Problem oriented research for specific tasks & usage contexts (STR)

online translation for the masses

translation in distributed collaborative environments

managing multilingual communication & content

automatic acquisition & annotation of language resources

c) Community building & networking (NOE)

reinvigorate European machine translation (MT) community

build bridges between MT & MLT and other relevant disciplines

help develop & coordinate shared technical infrastructure, promote reusability & interoperability, foster evaluation

FP7-ICT Call 4 at a glance


••• 14

Core research. Explore new research avenues (one IP, up to 8 M)

– break new ground, foster a novel multi-disciplinary approach to machine translation

– architectures & technologies that can learn and adapt flexibly & effectively to different languages, domains & tasks

– catering for new forms of language & communication (eg online communities)

– high risk but high promise (accuracy, speed, scalability)

– language & translation models coupled with data driven, machine learning methods

automatic acquisition & representation of linguistic facts

semantics, models of world knowledge relevant for translation

approaches inspired from social networks …

Outcome a) IP


••• 15

Outcome b) STR

Problem oriented research. A clearly defined usage context (~5 STRs, c 12 M)

– online translation for the masses

wide coverage (beyond GoogleTranslate); adequate quality, suitable at least for gisting/browsing; language embedded in documents, web pages, multimedia objects …

– translation in distributed environments

support non-linear collaborative interplay between authors, translators, editors/publishers & active users; innovative integration of automatic, interactive & human translation beyond current practice; technologies as well as processes & social interaction

– managing multilingual content & communication

a superset of the above addressing the development & management of online content & services esp. their versioning & maintenance in multiple languages

– acquisition & annotation of language resources

(nearly-)automatic, high volume, high performance

mining the web as well as available repositories (eg corpora) and public information sources


••• 16

Outcome b) managing multilingual Web content

• methods, techniques, metrics … for developing & managing multilingual web content & services

– much more than translation; significant cultural elements

• think of

– one big website in many languages, or

– several interrelated websites, one country/language each

• now think of how to maintain the integrity & consistency of such resources, effectively & over a long period of time

– and how to detect & repair gaps or inconsistencies

• so, beyond the “translation” step:

– design, authoring, versioning & maintenance of (multiple, parallel, interconnected …) websites, portals or repositories

– in a distributed collaborative environment, possibly across organisational boundaries


••• 17

Outcome c) NOE

Community building & networking (1 or 2 NoEs, up to 6 M)

– reinvigorate Europe’s machine translation (MT) community

bring together key players from scientific, technical & commercial circles (esp. SMEs)

stimulate cross-border cooperation (teams, institutions, national initiatives)

assess skills, foster training & exchanges; support smaller teams & not well-served languages

identify gaps, establish roadmap encompassing technologies, resources & applications

– build bridges between MT & MLT community and other relevant disciplines

stimulate dialogue between diverse communities; identify opportunities & bottlenecks

initiate integrative research, prepare the ground for further collaboration

explore medium to long term approaches, identify possible shifts in paradigm

– develop & coordinate shared technical infrastructure, reusability & interoperability, evaluation

infrastructural support: portal services, inventories & repositories of general interest tools & raw/annotated datasets, their documentation

active promotion of reusability & open-source; harmonisation of representation & annotation schemes

foster widely recognized benchmarks ...


••• 18

What we don’t do

Not supported under Call 4:

• approaches that do not promise to deliver performance along with portability, scalability & maintainability

– yes: emphasis on automation, flexibility & cost effectiveness

• developments addressing immediate commercial concerns

– no: adding a language pair to an existing product

• proposals that do not address « language transfer »

– yes: focus on mapping a source language into one or several target languages

• issues covered by other Challenges and Objectives

– no: HMI, interaction with robots, ambient intelligence …

• topics well covered by recent & ongoing projects

– no: sign languages, dialogue systems …


••• 19

Practical info

FP7-ICT Objective 2.2 – Language based interaction

budget: 26 Meuro under Call 4

managed by: Unit E1

Email: [email protected]

EC contact: Mr Kimmo Rossi

• inquiries: available

• pre-proposals: from Dec 1st until 3 weeks before the call closing date (Apr 1st)

Language Technology Days: 14-15 January 2009, Luxembourg

ICT Proposers’ Day: 22 January 2009, Budapest


••• 20

Web sources

INFSO.E1 website (under construction):

cordis.europa.eu/fp7/ict/language-technologies/..

• FP7-ICT: ../fp7-call4_en.html

• ICT-PSP: ../cip-psp_en.html

– Events & Presentations

– Call guidance notes

– Background material & useful Links …

EC contact: Mrs Susan Fraser


••• 21

FP7-ICT Call

II. Practicalities & Success Factors


••• 22

LT Days

14-15 January, 2009

Luxembourg, JMO conference complex

EC presentations, sessions w/ext speakers, proposal clinics, self-presentations & posters

Agenda & registrations:

cordis.europa.eu/fp7/ict/language-technologies/fp7-call4_en.html


••• 23

Pre-proposals & Clinics

3 pages max, mail to: [email protected]

• describe the problem your proposal addresses, in particular

– specify the intended user profile and related tasks

– describe actual or prospective applications

– detail data sets: source(s), typology, volume

• how will the proposed project contribute to the outcomes and impacts set out in the work programme?

– what are the key innovations?

– what will be the main concrete results?

– what public outputs are foreseen?

– what impact do you expect?

• describe the consortium

– give partners' names or profiles and the intended skills mix

– indicate the intended instrument (if known)

• indicate the scale of your ambition

– what is the estimated effort (man-months)

– how long will the proposed project last?

– what amount of EU funding are you looking for?


••• 24

Overall approach

• research for a purpose, problem driven

• centred around people & tasks, data & flows

– a compelling use case is as important as the underlying research

• meaningful demonstrator(s)

– field validation & assessment

• active promotion & dissemination of results beyond purely scientific circles

– public outputs, public final showcase


••• 25

Instruments

• IP

– up to 4 years, 5-8 Meuro (EU funding)

• NoE

– up to 3 years, 3-6 Meuro

• STR

– up to 3 years, 2-3 Meuro
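
Read together with the outcome slides, the envelopes quoted for Call 4 (one IP up to 8 M, around five STRs for about 12 M, one or two NoEs up to 6 M) add up to the 26 Meuro budget of Objective 2.2, and with the 14 Meuro of ICT-PSP Theme 5 to the 40 Meuro total. A small Python sketch of that check, assuming these indicative figures are meant to be additive; the dictionary and its keys are purely illustrative:

```python
# Indicative envelopes in Meuro, as quoted on the outcome slides (assumed additive).
fp7_call4 = {"IP": 8, "STR": 12, "NoE": 6}

fp7_total = sum(fp7_call4.values())
psp_call3 = 14  # ICT-PSP Theme 5 budget in Meuro

print(fp7_total)               # 26
print(fp7_total + psp_call3)   # 40

# Average EU funding per STR if about five are selected.
print(fp7_call4["STR"] / 5)    # 2.4
```

The average of 2.4 Meuro per STR sits inside the 2-3 Meuro range given for that instrument.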


••• 26

Partnerships

• keep the consortium manageable:

IPs 7-11 partners

STRs 5-7 partners

NoEs 3-4 “core” partners

• select competent, committed & reliable partners; geography not an issue!

• industry, SME, academia … participation as dictated by project needs

• user/industrial/commercial organisations to provide a demanding problem & validation context


••• 27

Language coverage

• most of the work is expected to be language independent

– flexibility & ease of adaptation to other languages are indeed key factors

– many of the ancillary tasks & tools are language independent anyway

• project outcomes must however be validated in 3+ languages

– preferably belonging to different linguistic families

• target languages are chosen & justified by the proposers bearing in mind the following priorities (from high to low):

1. EU official languages

2. nationally recognised languages

3. regional languages

4. minority languages

• Non-EU world languages linked to global markets & exports can be considered as well

– on a proposal by proposal basis


••• 28

Target industrial sectors

• look for

– huge & growing data volumes

– competitive pressure

– high growth & innovation

– international markets

• obvious candidates

– ICT & media

– manufacturing

– process industries eg pharmaceuticals

– energy & utilities

– engineering & construction

– financial services …


••• 29

• RTD content

– narrow scope, little or no EU dimension

– lack of focus, aims too general

– lack of innovation, current state of art missing

• planning

– links missing between objectives & work plan

– milestones missing or too general

– risk factors not addressed, no contingency plans

– no monitorable indicators, no metrics

• management

– consortium not balanced, gaps in the skills mix

– lack of integration between partners

– vague management structure

– weak or narrow dissemination plans

– ill-defined exploitation prospects

Reasons for failure


••• 30

• Quality

• Impact

• Effectiveness

but also

• Relevance wrt. WP

• Credibility

Evaluators will have access to Web sources: previous projects, teams & skills, background & reference documents …

Success factors .1


••• 31

It’s a project, not a dissertation:

– problem?

– user?

– data?

– outputs (incl. public ones)?

– metrics?

– impact?

– exploitation channels?

– …

Success factors .2


••• 32

Success factors .3

• preserve your credibility: select one proposal & make it win

• ensure that the proposal brings out both innovation & exploitation potential

• full depth of participation rather than long list of organisations with limited involvement

• key individuals, expertise & achievements rather than long list of previous projects

• make the proposal compelling for a busy reader (the first 5-10 pages are key!)


••• 33

Time schedule

• call due to close 1 April, 2009

• evaluation & selection until end June

• negotiation from mid-July on

• contract awarding in December

• projects due to start Q1 2010

… highly selective & demanding process


••• 34

ICT-PSP Call

Overview (subject to forthcoming adoption of WP, call budget & schedule)


••• 35

ICT-PSP Call 3, Q1 09

ICT Policy Support Programme (PSP) within the Competitiveness & Innovation Framework Programme (CIP) (adopted in October 2006)

• geared towards innovation & ICT uptake:

– development of the Single European information space

– strengthening of the internal market for ICT products and services and ICT-based products and services

– stimulation of innovation through the wider adoption of and investment in ICT

• ensure seamless access to ICT-based services

• improve the conditions for the development of digital content, taking into account multilingualism & cultural diversity

Takes over eContentplus activities from Jan 2009


••• 36

• translation & interpretation market (exc. in-house):

– c $15 billion; €1.1 billion for EU institutions alone (2006)

– top EU-based translation company posted a revenue of $175 million in 2006

• market fragmentation

– big players < 1000 employees

– est. 300,000 full time salaried translators worldwide (37% in Europe)

• a good European base

– SDL, Star, RWS, XRX, Euroscript, Logos, Moravia, VistaTEC, Semantix …

– ESTeam, Lucy Software …

• a largely untapped potential

– 4x according to some companies

“Europe’s language is Translation”


••• 37

Business world

• new models: Most companies follow the age-old translate-edit-proofread model of translation. Collaborative, web-based technologies allow translation to become more agile, faster, and better with fewer steps (CSA Inc.)

• new markets: Language Weaver is entering the three new strategic markets – Web Content, Business Intelligence and Customer Care – to provide high-volume, high-speed, and accurate automated translation solutions at a price that would have been unfathomable just a few years ago

• new approaches: If you don't see your native language here, you can help Google create it by becoming a volunteer translator. Check out our Google in Your Language program

• and then of course:

Unfortunately for Google as a person with 7 years of translation experience myself I can tell that you will hardly ever find a translator who will agree that machine translation can be useful for anything. (a Russian translator)


••• 38

ICT-PSP Call 3, Theme 5: Multilingual Web

• 3 objectives:

– machine translation for the multilingual Web (pilot projects)

– multilingual Web content management (pilot projects)

– standards & best practices for the multilingual Web (thematic network)

• 14 Meuro in total, around 6 projects

“The duration of the pilot is expected to be 24 to 36 months within which there should be a 12-month operational phase.”


••• 39

ICT-PSP Call 3, Theme 5: Multilingual Web

• research: no, at least not ICT research …

• development/engineering:

– optimisation, customisation, integration … of existing (state of the art) methods, tools & services with a view to defining new approaches, offerings & practices

• demonstration:

– innovative combination is key; new business models, processes & services, organisational setups, usability …

– evaluation along user, technical & (socio-)economic dimensions

• problem orientation:

– useful & useable although possibly not perfect; think ROI


••• 40

Scope & defs

• MT as defined in the ICT-PSP workprogramme encompasses

1. fully automatic machine translation, whatever the technology

2. interactive computer-aided translation (eg TM)

3. a suitable combination of 1. and/or 2. with web based

– human translation, proof-reading & post-editing, incl. where relevant methods inspired from social networks

– workflow & content management systems, …

• innovative & effective combination of people, processes & technology; the end result is not science, rather

– more and/or better output

– save time

– cut cost

• emphasis on language transfer, from source language to target language(s)

– language input-output (e.g. speech-to-text) is not the focus

– cross-platform, multi-format content access/delivery is key


••• 41

Language coverage

• some of the work is expected to be language independent

– flexibility & ease of adaptation to other languages are key factors

– content authoring & management, collaboration & workflow … are language independent anyway

• project outcomes must be validated in 3+ languages

– preferably belonging to different linguistic families

• target languages are chosen & justified by the proposers bearing in mind the following priorities (from high to low):

1. EU official languages

2. nationally recognised languages

3. regional languages

4. minority languages

• Non-EU world languages linked to global markets & exports can be considered as well

– on a proposal by proposal basis


••• 42

Cont’d

• project’s language coverage driven by the need to:

– address gaps & overcome barriers e.g. cross-border communication for less-developed languages, or

– exploit opportunities e.g. address emerging markets & sizeable language communities

• impact is key, so: viability, sustainability, exploitation channels, deployment prospects …

• main findings must be pro-actively disseminated

• some form of public showcase is mandatory

• participants should include

– private or public sector content owners & aggregators

– providers of language services, technology suppliers

– (online) communities of interest where relevant

• 6-7 partners/project, up to €2.5 million funding, up to 36 months


••• 43

ICT-PSP Call 3, Feb 09

3 intertwined objectives:

5.1 machine translation for the multilingual Web (projects)

information access: MT and other multilingual solutions for information access & use, esp. cross-lingual search & retrieval

information publishing: MT to create, distribute and (re-)use more widely & effectively online content in a multilingual environment

5.3 multilingual Web content management (projects)

communication: multilingual Web content development & management; design, authoring, versioning & maintenance of multilingual Web sites, portals or repositories

5.2 standards & best practices for the multilingual Web (network)

conventions & best practices for multilingual Web content


••• 44

ICT-PSP, 5.3: multilingual Web content management

• methods, techniques, metrics … for developing & managing multilingual web content & services

– much more than translation; significant cultural elements

• think of

– one big website in many languages, or

– several interrelated websites, one country/language each

• now think of how to maintain the integrity & consistency of such resources, effectively & over a long period of time

– and how to detect & repair gaps or inconsistencies

• so, beyond the “translation” step (obj 5.1):

– design, authoring, versioning & maintenance of (multiple, parallel, interconnected …) websites, portals or repositories

– in a distributed collaborative environment, possibly across organisational boundaries

• so as to turn a multi-million endeavour into a viable proposition for a much broader range of companies & administrations


••• 45

ICT-PSP, 5.1: machine translation for the multilingual Web

5.1 can be seen as a subset & central component of obj 5.3 (its “translation box”)

• different usages:

– web at large, enterprise, public information repositories …

• different users:

– teams as well as individuals, engineers as well as analysts, sales & marketing, language professionals, … you & me

• different content rich, information bound sectors, private & public

• quality depends on task & user

– from raw translation & “gisting” up to error-free translation

• two important conditions:

– widely recognised, well argued problem; clearly identified target community

– thorough validation in a given domain / for a given task volume metrics


••• 46

ICT-PSP, 5.2: standards & best practices

Thematic network

• covers the same broad issues as 5.3

– “the web as THE vehicle for multilingual content & services”

• provides a forum for multilateral exchange of experience & consensus building

• structure & tasks to be defined by the proposers, indicative list:

– bring together a meaningful subset of the main stakeholders, possibly through their own groups & associations

– ICT & language industries, content aggregators/distributors, e-services, multinational agencies, industry & de-jure standards bodies …

– analyse current situation, identify gaps & bottlenecks; assess market failures if any, specify technical & non-technical conditions to be met and the respective actors

– establish roadmap (trends, requirements, dependencies …) for further developments in the coming years

– stimulate consensus & active involvement/coordination; take part in leading conferences, liaise with primary associations etc.

– explore means to promote best practice (conferences, portals, publications, training …) beyond current channels

– propose suitable follow-on actions


••• 47

ICT-PSP Instruments & Funding

• pilot B projects:

– min. 4 partners from 4 different countries

– 50% of eligible direct costs

– flat 30% overhead rate of personnel costs

• thematic networks:

– min. 7 partners from 7 different countries

– lump sum; for 3 years and 1+10 participants:

coordinator: 95 Keuro

other participants: 24 Keuro each
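
For the thematic network, the lump-sum figures above imply a total for the quoted 1+10 configuration. A minimal Python sketch of that sum follows; the function name is hypothetical, and applying the same per-participant amounts to other network sizes is an assumption, since the slide only quotes the 1+10 case over 3 years:

```python
# Lump-sum figures in Keuro for a 3-year thematic network, as quoted on the slide.
COORDINATOR_KEURO = 95
PARTICIPANT_KEURO = 24

def network_lump_sum(other_participants: int) -> int:
    """Total lump sum in Keuro for one coordinator plus the given number of other participants.

    Scaling linearly beyond the quoted 1+10 case is an assumption.
    """
    return COORDINATOR_KEURO + PARTICIPANT_KEURO * other_participants

print(network_lump_sum(10))  # 335
```

Under this reading, a network of one coordinator and ten other participants would receive 335 Keuro in total.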

ec.europa.eu/information_society/activities/ict_psp/participating/index_en.htm


••• 48

Practical info

ICT-PSP Theme 5 – Multilingual Web

budget: 14 Meuro under Call 3

managed by: Unit E1

Email: [email protected]

EC contact: Mr Kimmo Rossi

• inquiries: from the call publication date (~Feb)

• pre-proposals: from publication until 3 weeks before the call closing date


••• 49

Events

Language Technology Days:

14-15 Jan 2009, Luxembourg

ICT-PSP Info Day:

26 Jan 2009, Brussels (tbc)

Email: [email protected]

URL: cordis.europa.eu/fp7/ict/language-technologies/..

FP7-ICT: ../fp7-call4_en.html

ICT-PSP: ../cip-psp_en.html


••• 50

Thank you!