Page 1: ICIC 2013 Conference Proceedings Uwe Rosemann TIB

Uwe Rosemann

ICIC 2013 Vienna

Textual and non-textual objects: Seamless access for scientists

Page 2

German National Library of Science and Technology (TIB)

• Specialized Library for Architecture, Chemistry, Computer Science, Mathematics, Physics, Engineering Technology

• Financed by the Federal Government and all Federal States

• Member of the Leibniz Association

• Global supplier of scientific and technical information

Page 3

Global Network

TechLib

Page 4

Customers

[Pie chart: customers by region (Germany, Europe, USA, World); segment values shown: 71%, 14%, 10%, 5%; the assignment of values to regions is not preserved]

Page 5

Main Services

• Provision of scientific content

  • full texts, document delivery, interlibrary loan

• Scientific retrieval

  • portal GetInfo

• Long-term preservation

• DOI service for research data

• Research and development

Page 6

Changes in the scientific process

Jim Gray, eScience Group, Microsoft Research

Page 7

A gap

• A widening gap in the scientific record between research published in a text document and the data that underlies it

• As a result, datasets are

  • difficult to discover

  • difficult to access

• Scientific information gets lost

Page 8

Requirements - Politics

Knowledge is power.

Europe must manage the digital assets its researchers generate.

Page 9

Final report of the High Level Expert Group on Scientific Data:

"Riding the wave" – How Europe can gain from the rising tide of scientific data

Page 10

Strategy – Move beyond text

Simulation

Scientific Films

3D Objects

Text

Research Data

Software

Page 11

Move beyond text – Consequences for TIB

• Research communities produce many types of scientific and technical information

• Each has its own unique characteristics and life cycle

• TIB must become capable of accepting and managing new media formats

Page 12

Competence Center for Non-textual Materials I

• Develop a clear strategy for the use and integration of non-textual materials at the TIB

• Systematically collect non-textual materials from research and teaching

• Define, integrate and establish technical infrastructure

• Define and establish workflows for indexing, cataloguing, digital preservation, DOI names, licensing

Page 13

Competence Center for Non-textual Materials II

• Develop innovative media-specific portals enabled by, e.g., automated video analysis with scene, speech, text and image recognition

• Link non-textual materials to other research information such as full texts and research data via the specialist portal GetInfo

• Engage in communities, provide support and advice to media providers

TIB will establish its own research capacity

Page 14

How have we been preparing?

• Infrastructure for research data

• Visual search tools for AV-media

• 3D-Objects

• chemOCR

Page 15

Collaboration – Research Data

• In 2005, the TIB became a non-commercial DOI registration agency for research data

• In 2010, the TIB became a co-founder of the international DataCite consortium to establish easier access to scientific research data on the Internet

Mission

• Citability of research data

• High visibility of the data

• Easy re-use and verification of the data sets

• Increasing quality of published papers
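Citability in practice means that a data set can be referenced like a paper and resolved through its DOI name. The following is a minimal sketch, assuming the citation pattern "Creator (Year): Title. Publisher. Identifier" and the public resolver at doi.org. The function names and the sample record are hypothetical and are not taken from the talk.

```python
def doi_url(doi: str) -> str:
    """Turn a DOI name into a resolvable link via the public doi.org resolver."""
    return "https://doi.org/" + doi.strip()


def cite_dataset(creator: str, year: int, title: str, publisher: str, doi: str) -> str:
    """Format a data citation as: Creator (Year): Title. Publisher. Identifier."""
    return f"{creator} ({year}): {title}. {publisher}. {doi_url(doi)}"


# Hypothetical record; 10.1234/example-dataset is a placeholder, not a registered DOI.
print(cite_dataset("Doe, J.", 2013, "Example temperature series",
                   "Example Data Centre", "10.1234/example-dataset"))
```

Because the identifier is a persistent name rather than a server address, a citation of this form can stay valid even if the data centre moves the files.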

Page 16

DataCite Members

Page 17

Example: EHEC bacterium

Page 18

Example: EHEC bacterium

Page 19

DOI Services

• Contracts with 60 data centres

  • Research institutes

  • Universities

  • Libraries

  • Publishers

• 776,454 DOI registrations

• 22,533 up to September 2013

Page 20

Research data – Further developments

• KomFor

  • Centre of Expertise for Research Data from the "Earth and Environment" project

• RADAR

  • Research Data Repositorium

• Visual Analysis

  • VisInfo Methods

Page 21

Numerical data

Time [h]   T [°C]
1          12
2          13
3          12
4          12
5          13
6          35
7          17
8          11
9          10
10         12
11         13
12         13
13         12
14         12
15         12
16         11
17         11
18         10
19         10
20         11
21         11
22         10
23         12
24         12
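A column of numbers like the one on this slide hides its most interesting feature, which becomes obvious once the values are drawn. The following is a minimal sketch that renders the 24 hourly temperatures from the slide as a text bar chart. The helper names are hypothetical; this is only an illustration and does not describe how the TIB tools work.

```python
# Hourly temperatures in °C for hours 1..24, copied from the slide.
temperatures = [12, 13, 12, 12, 13, 35, 17, 11, 10, 12, 13, 13,
                12, 12, 12, 11, 11, 10, 10, 11, 11, 10, 12, 12]


def text_chart(values):
    """Draw one bar of '#' characters per hour, followed by the value."""
    lines = []
    for hour, temp in enumerate(values, start=1):
        lines.append(f"{hour:2d} h | {'#' * temp} {temp}")
    return "\n".join(lines)


def warmest_hour(values):
    """Return the 1-based hour with the highest temperature."""
    return max(range(len(values)), key=values.__getitem__) + 1


print(text_chart(temperatures))
print("warmest hour:", warmest_hour(temperatures))
```

In the chart, the spike at hour 6 stands out immediately, whereas in the raw table it is easy to overlook.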

Page 22

Visual access to research data

Page 23

How have we been preparing?

• Infrastructure for research data

• Visual search tools for AV-media

• 3D-Objects

• chemOCR

Page 24

TIB's portal for audiovisual media

Project: Development of a portal for audiovisual media

Aim: Improve access to AV media

Time: July 2011 – December 2013

Partner: Hasso-Plattner-Institut für Softwaresystemtechnik GmbH

Page 25

TIB's portal for audiovisual media

How do I find what I'm looking for in videos?

Today: Manual annotation of the whole video

Metadata

• Title

• Author

• Description

• Publisher

• Publication year

• Rightsholder

• …

Page 26

TIB's portal for audiovisual media

Future: Manual annotation plus content-based information

1. Speech

2. Visual features, e.g. Indoor, Experiment, Technology

3. Textual information, e.g. Leibniz University Hannover

4. Structural information: Scenes, Shots, Segments

Source: Scorupka, Sascha, Experiment der Woche, 2011
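The step from whole-video metadata to content-based information can be pictured as a data structure: each video keeps its manual metadata and gains a list of time-coded segments carrying speech, on-screen text and visual concepts. The following is a minimal sketch under that assumption. The class names, field names and sample values are hypothetical and do not describe the portal's actual data model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Segment:
    """One automatically detected segment of a video (structural information)."""
    start_s: float
    end_s: float
    speech: str = ""                  # transcript from speech recognition
    on_screen_text: str = ""          # text recognised in the image
    visual_concepts: List[str] = field(default_factory=list)


@dataclass
class Video:
    """Manually entered metadata plus content-based segments."""
    title: str
    author: str
    segments: List[Segment] = field(default_factory=list)


def find_in_video(video: Video, term: str) -> List[Tuple[float, float]]:
    """Return (start, end) times of all segments whose content mentions the term."""
    needle = term.lower()
    hits = []
    for seg in video.segments:
        haystack = " ".join([seg.speech, seg.on_screen_text, *seg.visual_concepts])
        if needle in haystack.lower():
            hits.append((seg.start_s, seg.end_s))
    return hits


# Hypothetical sample data for illustration only.
video = Video(
    title="Sample lecture",
    author="Jane Doe",
    segments=[
        Segment(0.0, 30.0, speech="welcome to the lecture", visual_concepts=["Indoor"]),
        Segment(30.0, 75.0, speech="now we start the measurement",
                on_screen_text="Leibniz University Hannover",
                visual_concepts=["Indoor", "Experiment"]),
    ],
)
print(find_in_video(video, "experiment"))
```

With segment-level information, a search can return the position inside the video where a term occurs, not just the video as a whole.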

Page 27

TIB's portal for audiovisual media

Media analysis process

Upload

Page 28

TIB's portal for audiovisual media

Scene recognition

Hard cut

Kopf, S.: Computergestützte Inhaltsanalyse von digitalen Videoarchiven. Mannheim, 2006

Automatic cut detection

→ luminance / contrast

→ colour distribution / colour histogram

→ edges
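A hard cut shows up as an abrupt change in the brightness or colour distribution between two consecutive frames. The following is a minimal sketch of histogram-based cut detection on grayscale frames, assuming frames are given as nested lists of pixel values from 0 to 255. The bin count, the threshold and the function names are hypothetical choices for illustration, not the method used in the portal.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # grayscale image, pixel values 0..255


def luminance_histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Return a normalised histogram of pixel luminance values."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def detect_hard_cuts(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Return indices i where a hard cut is assumed between frame i-1 and frame i."""
    cuts = []
    previous = luminance_histogram(frames[0])
    for index in range(1, len(frames)):
        current = luminance_histogram(frames[index])
        # Half the L1 distance between two normalised histograms lies in [0, 1].
        distance = 0.5 * sum(abs(a - b) for a, b in zip(previous, current))
        if distance > threshold:
            cuts.append(index)
        previous = current
    return cuts


# Two tiny synthetic frames: a dark shot followed by a bright shot.
dark = [[10, 12], [11, 13]]
bright = [[240, 250], [245, 235]]
print(detect_hard_cuts([dark, dark, bright, bright]))
```

The same comparison could be run per colour channel or on edge maps, which corresponds to the other two cues listed on the slide.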

Page 29

TIB's portal for audiovisual media

Automatic speech recognition

this work is copy right ed nine teen thirty six

Harald Sack, Hasso-Plattner-Institute for IT-Systems Engineering

Quality of results depends on

• quality of the speaker

• dialects

• background noises

• voice overlaps

Page 30

TIB's portal for audiovisual media

Intelligent Character Recognition (ICR)

• Character/Logo Detection

• Character Filtering

• Character Recognition

Page 31

TIB's portal for audiovisual media

Automated analysis: Image recognition

Method of analysis: Image recognition

Interview, experiment, animation, lecture

Extracted data is converted into text

Page 32

TIB's portal for audiovisual media

Visual Concepts

Graphical: Animation

Graphical: Drawing

Graphical: Diagram

Real: Outdoor

Real: Indoor

Real: Lecture / Conference

Real: Interview

Real: Buildings ...

Keyframes → Machine learning using visual features → Annotation

Page 33

TIB's portal for audiovisual media

Page 34

How have we been preparing?

• Infrastructure for research data

• Visual search tools for AV-media

• 3D Objects

• chemOCR

Page 35

3D Objects – an excursion to Architecture

Page 36

Visual search tools

Content-based indexing

Visual search

Page 37

Content-based indexing

Segmentation with form primitives

Extraction of room connectivity graphs
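A room connectivity graph records which rooms of a building model are directly connected, so a floor plan can be indexed and compared by its layout and not by its geometry. The following is a minimal sketch, assuming the connections have already been extracted as pairs of room names. The room names and the degree-based signature are hypothetical simplifications: equal signatures are only a necessary condition for two layouts to match, not a full graph comparison.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def room_graph(connections: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from pairs of connected rooms."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for a, b in connections:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def degree_signature(graph: Dict[str, Set[str]]) -> List[int]:
    """Sorted number of neighbours per room; a coarse, name-independent index key."""
    return sorted(len(neighbours) for neighbours in graph.values())


# Hypothetical floor plan extracted from a 3D model.
floor_plan = room_graph([
    ("hall", "kitchen"),
    ("hall", "living room"),
    ("hall", "bath"),
    ("living room", "kitchen"),
])

# Hypothetical query sketch with unnamed rooms.
sketch = room_graph([("A", "B"), ("A", "C"), ("A", "D"), ("C", "B")])

print(degree_signature(floor_plan))
print(degree_signature(floor_plan) == degree_signature(sketch))
```

A coarse key like this could narrow the candidate set before a more expensive graph matching step decides which models really fit a query.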

Page 38

Visual search

3D sketch

Attributed graph

Result visualization

Page 39

Further developments

Page 40

How have we been preparing?

• Infrastructure for research data

• Visual search tools for AV-media

• 3D Objects

• chemOCR

Page 41

Information retrieval in Chemistry

Search for chemical structures – how?

Chemists are used to drawing

Page 42

Textual and non-textual chemical information

Table with reaction scheme

2a-i: Derivatives from the reaction

Chemical structure

Reaction scheme

Chemical names

Linked entities from the table

Page 43

Non-textual data processing – chemOCR

Image data → chemical structure data

CLiDE, chemOCR

Page 44

Information retrieval in chemistry

Text AND formulas

Page 45

Further subjects

• Open Science Lab

• Ontology

Page 46

Conclusion

Dissemination of scientific and technical information has been a foundational mission.

The methods have completely changed, but the mission remains the same.

Page 47

Conclusion

Ultimate goal: Interlinking and search across all types of digital assets.

Page 48

GetInfo – Portal for Science and Technology

• 58 million metadata records in the internal index

• 390 million metadata records in external sources

• 900,000 PDF full texts

• Data, AV media, 3D objects

Page 49

Development of media-specific portals

Provision

Probado 3D

Portal for audiovisual media

Page 50

Questions?

