New Task Group: CRIS Architecture & Development
Maximilian Stempfhuber
RWTH Aachen University
[email protected]
Agenda
• A view to research information
• The role of CERIF
  – As a data model
  – In the CRIS development process
• Why a new Task Group?
• Task Group’s Mission
Information & Research Process
Research process: Work programme → Project proposal → Project → Results → Transfer → Wealth creation
Research Information:
• Research organisations
• Research programmes
• Research strategy
• State of the Art
• Proposal Management
• Project management
• (Human) Resources
• Infrastructure
• Experiments
• Publication management
• Communication
• Research data
• Software
• Prizes, patents
• Expertise
• Commercial products
Knowledge, Wealth, Excellence
CRIS Semantic & Temporal Aspects
Current Research Information System
• Project
• Org. Unit
• Funding Programme
• Person: Expertise, CV
• Results: Publications, Events, Software, Patents
CERIF: A Data Model for CRIS
Common European Research Information Format
• Project
• Person
• Organisation
• Funding Programme
• Publication
• Patent
• Product
• Event
• Equipment
• Service
• Skills
• CV
• Classification (Semantics)
CERIF: A Data Model for CRIS
Common European Research Information Format
• Entity Relationship Model
• Generators for several DBMS
• CERIF-XML as exchange format
• Code of Good Practice
• Commercial software systems
• Proprietary implementations
Same Model… … different results
Current Process
[Diagram: process flow in three columns: RESPONSIBILITY, PROCESS, CONTROLS]
PROCESS: Concept → RIS Proposal → RIS Design → Information Processing → Publishing/Distribution → Maintenance, with Accept/Reject decisions between stages and an Ongoing Process Review alongside
RESPONSIBILITY: Anyone, RTD-Promoter, Steering Group, Project Manager, Development Function, Publishing Function, Collection Function, Production Function, Marketing Function
CONTROLS:
• Concept Acceptance
• Marketing Plan
  – Market Analysis
  – Cost benefit analysis / economic model
• RIS Proposal
  – Definition of Purpose
  – Identification of Users
  – Definition of Content
• RIS Proposal Acceptance
• RIS Design Plan
  – Database Specification
  – Structure and Presentation
  – Classification and Indexing
  – Search and Navigation
• RIS Development Acceptance
• RIS Information Processing Plan
  – Data Collection Plan
  – Collection Guidelines
  – Quality Control Plan
  – Acceptance Test Plan
• Information processing acceptance
• Structured Output
• Distribution Plan
• Marketing Plan (revisited)
• Economic model (implementation)
• Maintenance Plan / Acceptance
Code of Good Practice
• Organizational view
• Covers whole process
• Waterfall-like
Missing Aspects
• (Software) Architecture
• Technology
• Reference Implementation
Looking Beyond…
… the CRIS domain
• Administrative systems at the institution
• Local information systems (OAR etc.)
• Community systems (ResearchGate etc.)
• Clusters of Excellence (IDEA League)
• Virtual Organizations (Fraunhofer, Helmholtz, Leibniz, Max-Planck)
CERIF-CRIS Connectivity
[Diagram: three CERIF-CRIS instances, each holding Projects, Persons, Org. Units, Publications, Events, Research Programmes, etc., linked via CERIF-XML]
Connected systems:
• Institutional Repository
• Research Data Repository
• Finance
• Human Resources
• Project Management
• Community CRIS
euroCRIS Strategy
• Enhance existing CRIS
• Fill gaps with new CRIS
• Connect CRIS with a common CERIF wrapper
• Create standardized, reusable services
The Gap…
… between model and implementation
[Diagram: CERIF and the Code of Good Practice on one side, the Concrete System on the other; axis from high to low labelled "Agreements, Standards, Best Practices, Re-Use"]
What’s Missing?
[Diagram: layered stack]
• Operating System
• Database Management System
• CERIF
• Data Access Layer
• Business Logic
• User Interface
Alongside the stack: CERIF-XML; Search, Harvesting, Services; Code of Good Practice
Why is it important?
CERIF : Development & Testing = 20 : 80
What can be gained…
… for euroCRIS as an organization?
• Community building
• Exchange
• Reuse
• Evolution
• Spreading ideas & Connectivity
… beyond CERIF
What can be gained…
… for euroCRIS members?
• Using building blocks
• Reducing development & testing
• Getting additional functionality
• Opening one’s system & content
… even in combination with commercial software
Requirements
• Requirements engineering
• (Functional) Software specification
• Code of Good Practice (Updated)
• Best Practice Examples / DRIS (Updated)
• Available (commercial) solutions
Database Systems
• Paradigms (Relational, Object-Oriented, XML, multi-dimensional DBMS)
• Systems (IBM DB2, Oracle, PostgreSQL; commercial vs. Open Source)
• Interfaces (ODBC, JDBC, Perl DBI, PEAR)
• Query languages (SQL, OQL, XQuery)
• Schema evolution / migration
Database Abstraction
• Separating software architecture from the (physical) database model
• Encapsulation vs. normalization
• Object-Relational-Mapping (ORM)
• Schema evolution / migration
• Convention over configuration (Coding by convention) & tool support
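As a minimal sketch of separating the software architecture from the physical database model, the Python example below hides one table behind a small repository class. The `Project` class, the `ProjectRepository` interface, and the `proj` table layout are hypothetical illustrations, not part of CERIF or of any existing CRIS; SQLite is used only because it ships with Python.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Project:
    """Logical (domain) view of a project; hypothetical, not a CERIF entity."""
    project_id: str
    title: str


class ProjectRepository:
    """Data access layer: the only place that knows the physical table layout."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        # Physical model (assumed for illustration): one table named "proj".
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS proj "
            "(id TEXT PRIMARY KEY, title TEXT NOT NULL)"
        )

    def add(self, project: Project) -> None:
        self._conn.execute(
            "INSERT INTO proj (id, title) VALUES (?, ?)",
            (project.project_id, project.title),
        )

    def get(self, project_id: str) -> Optional[Project]:
        row = self._conn.execute(
            "SELECT id, title FROM proj WHERE id = ?", (project_id,)
        ).fetchone()
        return Project(*row) if row else None


# Business logic works with Project objects and never sees SQL.
repo = ProjectRepository(sqlite3.connect(":memory:"))
repo.add(Project("p1", "Example project"))
```

If the table were later renamed or split, only `ProjectRepository` would need to change; callers of `add` and `get` would be unaffected, which is the point of the abstraction layer.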
Programming & Managing
• Re-use of modules and libraries
• Generating CRIS Open Source code base
• Share experience with colleagues
  – Scalability (e.g. middleware)
  – Reliability (e.g. components)
  – Integrated Development Environments (IDE)
  – Development process (SCRUM, V-Model, MDA, …)
Software Architecture
• Permanent evolution vs. re-use
  – Development philosophy → architecture
  – Domain modeling → architecture
  – Software frameworks → architecture
  – Tools support → architecture
  – Programming languages → architecture
• Current buzz words: SOA/REST, Cloud Computing, RIA, BPM, Portal/CMS
Functional Modules
• Self containment
• Standardized interfaces
• Standardized functionality
• Standardized input (e.g. CERIF-XML)
• Standardized output?
• CRIS plug-in architecture needed
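One way a plug-in architecture with standardized interfaces could look is sketched below in Python. Everything here is an assumption made for illustration: the `register` decorator, the `MODULES` registry, the two sample modules, and the dictionary-based record format are hypothetical and do not describe an existing CRIS or CERIF API.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
# Standardized interface (assumed): a module maps a record list to a record list.
Module = Callable[[List[Record]], List[Record]]

# Registry of available modules, keyed by name.
MODULES: Dict[str, Module] = {}


def register(name: str) -> Callable[[Module], Module]:
    """Decorator that adds a module to the registry under the given name."""
    def decorator(func: Module) -> Module:
        MODULES[name] = func
        return func
    return decorator


@register("strip_titles")
def strip_titles(records: List[Record]) -> List[Record]:
    """Input verification step: trim whitespace around titles."""
    return [{**r, "title": r["title"].strip()} for r in records]


@register("dedupe")
def dedupe(records: List[Record]) -> List[Record]:
    """Drop records whose id has already been seen, keeping the first."""
    seen = set()
    unique = []
    for r in records:
        if r["id"] not in seen:
            seen.add(r["id"])
            unique.append(r)
    return unique


def run(module_names: List[str], records: List[Record]) -> List[Record]:
    """Apply the named modules in order; each has the same signature."""
    for name in module_names:
        records = MODULES[name](records)
    return records


sample = [
    {"id": "1", "title": "  First  "},
    {"id": "1", "title": "Duplicate"},
    {"id": "2", "title": "Second"},
]
result = run(["strip_titles", "dedupe"], sample)
```

Because every module shares one signature, modules stay self-contained and can be combined or replaced without touching the host system.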
Workflow
• Business Process Modeling (BPM)
• Workflows at the UI level
• Quality assurance in CRIS
• Event/data-driven services
• Drives re-usable software modules (e.g. input verification, data acquisition) & processes
User Interface
• Common / consistent user experience
• Re-use of interaction patterns
• Sharing solutions (e.g. CSS frameworks)
• Sharing knowledge (e.g. accessibility)
• Integration of CRISs and services
Information Design
• Common ways for expressing
  – Semantic relationships
  – Temporal aspects
  – Qualities & quantities
• Software modules for visualizations
  – Network graphs
  – Timelines
  – Charts, …
• Experiences with commercial software
Statistics & Reporting
• Defining recurring information needs
• Standardizing on basic data formats
• Statistics / reporting as a (re-usable / commercial) service
• Software modules
• Layout templates (e.g. XSLT, XSL-FO)
External Access
• Defining public CRIS services
  – Functional specification
  – Interface specification
  – I/O format specification
• Services
  – Searching for entities
  – Data analysis / information extraction
Data Exchange
• Harvesting interfaces
• Entity extraction
• Replication
• Federation
• Schema mapping
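Schema mapping for data exchange can be illustrated with a small Python sketch that translates a local record into a shared field vocabulary and serializes it as XML. The field names in `FIELD_MAP` and the XML element names are invented for this example; they are not the actual CERIF-XML schema, which a real wrapper would have to follow.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical mapping from local column names to shared exchange fields.
FIELD_MAP = {"proj_no": "id", "proj_name": "title", "begin": "startDate"}


def map_record(local: Dict[str, str]) -> Dict[str, str]:
    """Keep only mapped fields and rename them to the shared vocabulary."""
    return {
        target: local[source]
        for source, target in FIELD_MAP.items()
        if source in local
    }


def to_xml(record: Dict[str, str], tag: str = "project") -> str:
    """Serialize a mapped record as a flat XML element (illustrative layout)."""
    elem = ET.Element(tag)
    for key, value in record.items():
        ET.SubElement(elem, key).text = value
    return ET.tostring(elem, encoding="unicode")


local_row = {"proj_no": "42", "proj_name": "Example", "internal_note": "not exported"}
common = map_record(local_row)
xml_text = to_xml(common)
```

Unmapped local fields such as `internal_note` are dropped, so each system exposes only what the shared schema defines.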
TG Roadmap
• Establishing TG Mission
• Recruiting TG Members
• Initial Survey: Where are we now? Where are we going?
  – Technologies used (DBMS, languages etc.)
  – Methodologies used (SOA, SCRUM, outsourcing etc.)
  – Gap analysis: Topics for support & exchange, common modeling of CRIS architectures, abstraction layers, module specifications etc.