Toke Eskildsen [email protected] ELAG 2014 – Scanning woes and war stories - 1/28 Scanning woes and war stories ELAG 2014 Toke Eskildsen IT nerd (boss says “System Architect”)
Scanning Woes and War Stories

Jan 29, 2018

Toke Eskildsen
Transcript
Page 1: Scanning Woes and War Stories

Toke Eskildsen [email protected] ELAG 2014 – Scanning woes and war stories - 1/28

Scanning woes and

war stories

ELAG 2014

Toke Eskildsen IT nerd (boss says “System Architect”)

Page 2: Scanning Woes and War Stories

State and University Library

Denmark

“Everything online in 2020”

- Vision

Page 3: Scanning Woes and War Stories

We would like plentiful, raw, visible, solid pixels

Page 4: Scanning Woes and War Stories

Zoom

Not like this! Like this!

Page 5: Scanning Woes and War Stories

Histogram

Reference

Page 6: Scanning Woes and War Stories

Adjust Color Levels

Page 7: Scanning Woes and War Stories

That's a nice scan!

Page 8: Scanning Woes and War Stories

A shame it was sharpened

Haloes around text indicate sharpening

Reference

Page 9: Scanning Woes and War Stories

But this one seems fine?

Page 10: Scanning Woes and War Stories

Sharpened and JPEG compressed

Square areas and localized noise indicate JPEG compression

Reference
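
The "square areas" can also be looked for numerically. Baseline JPEG works on 8x8 pixel blocks, so brightness steps that fall on block borders tend to be larger than steps inside a block. The sketch below is only an illustration of that idea, not a tool from the talk: the function name `blockiness`, the row-list image layout and the assumption that the block grid starts at pixel 0 (no cropping after compression) are all made up for the example.

```python
def blockiness(image, block=8):
    """Ratio of the mean horizontal brightness step across assumed block
    borders to the mean step inside blocks. `image` is a list of rows of
    grayscale values. Values well above 1 hint at visible block edges.
    Assumption: the block grid is aligned with column 0."""
    boundary, inside = [], []
    for row in image:
        for x in range(1, len(row)):
            step = abs(row[x] - row[x - 1])
            # Steps between column x-1 and x cross a border when x % block == 0
            (boundary if x % block == 0 else inside).append(step)
    if not boundary or not inside:
        return 0.0
    mean_boundary = sum(boundary) / len(boundary)
    mean_inside = sum(inside) / len(inside)
    if mean_inside == 0:
        return float("inf") if mean_boundary > 0 else 1.0
    return mean_boundary / mean_inside


# Synthetic rows: a smooth ramp, and a "blocky" signal with jumps every 8 pixels
smooth = [[x for x in range(32)] for _ in range(4)]
blocky = [[(x // 8) * 10 + (x % 2) for x in range(32)] for _ in range(4)]
```

Only horizontal steps are measured here; a fuller check would repeat the same comparison for vertical steps and try every grid offset from 0 to 7.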

Page 11: Scanning Woes and War Stories

Lossless! We Promise!

Lossless workflow

Page 12: Scanning Woes and War Stories

A chain is only as strong...

Lossless! We Promise!

JPEG

Page 13: Scanning Woes and War Stories

This one? Please?

Page 14: Scanning Woes and War Stories

JPEG 2000 compression

JPEG 2000 lossy compression signs are best learned from multiple examples

Reference

Page 15: Scanning Woes and War Stories

Burnout

Sharp spikes at either end of the histogram indicate burnout

Reference
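
A rough way to check for such spikes without looking at the plot: count the pixels in the darkest and brightest histogram bins and compare them with their neighbouring bins. This is a minimal sketch under stated assumptions (8-bit grayscale values in a flat list; the names `histogram` and `end_spikes`, the 8-bin neighbourhood and the factor 4 are arbitrary choices for illustration, not values from the talk).

```python
def histogram(pixels, levels=256):
    """Count how many pixels have each grayscale level."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    return counts


def end_spikes(counts, neighbours=8, ratio=4.0):
    """Return (dark_spike, bright_spike): True when the first or last bin
    holds more than `ratio` times the average of its nearest bins."""
    def is_spike(end_count, near):
        baseline = sum(near) / len(near)
        # max(..., 1.0) avoids flagging a handful of pixels next to empty bins
        return end_count > ratio * max(baseline, 1.0)

    dark = is_spike(counts[0], counts[1:1 + neighbours])
    bright = is_spike(counts[-1], counts[-1 - neighbours:-1])
    return dark, bright


# Synthetic data: an even spread of levels, plus a patch of blown-out white
even_pixels = list(range(256)) * 4
burned_pixels = even_pixels + [255] * 100
```

Here `end_spikes(histogram(burned_pixels))` flags the bright end only, since the extra pixels all sit at level 255.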

Page 16: Scanning Woes and War Stories

Burnout - visualization

Page 17: Scanning Woes and War Stories

But we need the dark to read!

Page 18: Scanning Woes and War Stories

No you don't!

Visualisation of ALTO-OCR files: https://github.com/tokee/quack

Page 19: Scanning Woes and War Stories

¦ “. i N ¦ M i; sk s, 011 el en al vei dens -t oi - te kom ei ner milen t'i bi -i i. 1 1 id iiiip .il -ukkersvg, I i ; \\ . : i .i i mod! , 1 1 km ikui i ene i !i;i • 1. t.v U.ilisk kali! Dit ei l u 't eksi isk.a bet /.> . i : 1 1 'I I l'b.! : man 'lit le.l b der ledes at Lundbecks tulbpeie forsk mi I I i 1 ' kt ' '1 F.va Sti 11 iess m i m ! o! i i . ! i , v! I ; vende l.epemiddel • ¦ ! a a 1. 1 >!:' a ! t \\ p. ¦ 2 tliahvtcs ; n 'i i te '. oi st, ¦ Klm.sKe i ol sop I n slik kel s vpe pat len! er Di amerikanske 1 1 1 1 1 1 1 i , F! >A il pi ve";

ABBYY FineReader 10.5

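Some software upgrades matter!

Novo Nordisk, som er en af verdens største koncerner inden for behandling af sukkersyge, bliver nu mødt af konkurrence fra en ny dansk kant. Det er biotekselskabet Zea - land Pharmaceuticals, der ledes af Lundbecks tidligere forsk - ningsdirektør Eva Steiness, som fører et nyt lovende lægemiddel til behandling af type 2-diabetes frem til de første kliniske forsøg på sukkersyge-patienter. De amerikanske sundheds - myndigheder, FDA, har givet

(English translation of the OCR sample: Novo Nordisk, one of the world's largest companies in the treatment of diabetes, now faces competition from a new Danish quarter. It is the biotech company Zealand Pharmaceuticals, headed by Lundbeck's former research director Eva Steiness, that is taking a promising new drug for the treatment of type 2 diabetes to the first clinical trials on diabetes patients. The US health authorities, the FDA, have given)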

ABBYY FineReader 11

Page 20: Scanning Woes and War Stories

Post-processing

Holes in the histogram indicate leveling / exposure / contrast adjustment
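
Why the holes appear: stretching a narrow range of levels over the full 0-255 range leaves output levels that no input level maps to. The sketch below reproduces this on synthetic data. The names `stretch_levels` and `histogram_holes` are hypothetical, and the linear stretch is only one simple form of levels adjustment, chosen for illustration.

```python
def stretch_levels(pixels, low, high):
    """Simple linear levels adjustment: map [low, high] onto [0, 255]."""
    span = high - low
    return [min(255, max(0, round((p - low) * 255 / span))) for p in pixels]


def histogram_holes(pixels, levels=256):
    """Return the unused levels that lie between the darkest and the
    brightest level actually present in the image."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    used = [level for level, count in enumerate(counts) if count > 0]
    if not used:
        return []
    return [level for level in range(used[0], used[-1] + 1)
            if counts[level] == 0]


# Synthetic scan using only levels 64..191, then stretched to full range
original = list(range(64, 192))
stretched = stretch_levels(original, 64, 191)
```

The untouched data has no holes, while the stretched data uses 128 distinct levels spread over 256 bins, leaving the comb-like gaps shown on the slide.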

Page 21: Scanning Woes and War Stories

Beware: Post-processing + JPEG

Post-processing indicators become less distinct when the image is JPEG compressed

Reference

Page 22: Scanning Woes and War Stories

Notice the lines?

300 DPI @ 20x enlargement

300 DPI @ 19x enlargement

Every other line

Page 23: Scanning Woes and War Stories

Time for scanner calibration

G E N E R A L C A M E R A S E T T I N G S:

Camera Model No.: P2-20-08K40
Camera Serial No.: 11041074
Camera Network ID: 0
Network Message Mode: disabled

Firmware Design Rev.: 03-081-20017-01 Aug 29 2007
DSP Design Rev.: 03-056-20013-00

SETTINGS FOR UNCALIBRATED MODE:

Analog Gain (dB): +0.0 +0.0
Analog Offset: 634 630

SETTINGS FOR CALIBRATED MODE:

Analog Gain (dB): -0.4 -0.5
Analog Offset: 624 630
Digital Offset: 0 0
Calibration Status: FPN [uncalibrated] PRNU [calibrated]

SETTINGS COMMON TO CALIBRATED AND UNCALIBRATED MODES:

System Gain: 0 0
Background Subtract: 0 0

Pretrigger: 0
Number Of Line Samples: 32
Video Mode: calibrated
Data Mode: 0
Exposure Mode: 4

SYNC Frequency: external (9398.09) Hz
Exposure Time: external

End-Of-Line Sequence: on
Upper Threshold: 240
Lower Threshold: 15
Region Of Interest: 0001 - 8192

OK>

Systematic alternating lines indicate that the scanner should be calibrated
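
A quick numeric check for the every-other-line pattern: compare the average brightness of even-numbered lines with that of odd-numbered lines. On a large scan, page content should average out and leave a difference near zero, so a stable offset hints at a calibration problem. This is a sketch with assumed names (`alternating_line_score`) and an assumed layout (a list of rows of grayscale values, lines running along rows); transpose the image first if the pattern runs along columns.

```python
def alternating_line_score(image):
    """Absolute difference between the mean brightness of even-numbered
    rows and the mean brightness of odd-numbered rows."""
    means = [sum(row) / len(row) for row in image]
    even = means[0::2]
    odd = means[1::2]
    if not even or not odd:
        return 0.0
    return abs(sum(even) / len(even) - sum(odd) / len(odd))


# Synthetic data: a flat gray area, and the same area where every other
# row reads 4 levels brighter
flat = [[100] * 16 for _ in range(10)]
striped = [[100 if y % 2 == 0 else 104] * 16 for y in range(10)]
```

What counts as "too large" depends on the material and the scanner, so any threshold would have to be found by comparing against known-good scans.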

Page 24: Scanning Woes and War Stories

Last one is tricky

Page 25: Scanning Woes and War Stories

Zoom 9000

Page 26: Scanning Woes and War Stories

New tool – Grid lines

Page 27: Scanning Woes and War Stories

Upscaling

I'm working on a small tool for detecting scaling. Very alpha: https://github.com/tokee/telltale
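
To give a feel for the problem, here is a toy check for the simplest case only: nearest-neighbour upscaling by a whole-number factor, which leaves runs of identical adjacent rows. This is not how telltale works (its method is not described here), and interpolated upscaling would need a subtler, periodicity-based test. The names `upscale_nearest` and `duplicate_row_fraction` are made up for this sketch.

```python
def upscale_nearest(image, factor):
    """Nearest-neighbour upscaling by an integer factor (for the demo)."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def duplicate_row_fraction(image):
    """Fraction of adjacent row pairs that are pixel-for-pixel identical.
    Real scans contain noise, so identical neighbours should be rare."""
    if len(image) < 2:
        return 0.0
    same = sum(1 for a, b in zip(image, image[1:]) if a == b)
    return same / (len(image) - 1)


# Synthetic 8x8 image in which no two neighbouring rows are equal
original = [[(x * 7 + y * 13) % 256 for x in range(8)] for y in range(8)]
scaled = upscale_nearest(original, 2)
```

The same test can be run on columns by transposing the image with `list(zip(*image))`.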

Page 28: Scanning Woes and War Stories

Are your scans just fine?

Toke Eskildsen, Statsbiblioteket

http://en.statsbiblioteket.dk/newsdigi

http://[email protected]

@TokeEskildsen