Moving beyond the box: automating the digitisation of insect collections

Jul 13, 2015

Vincent Smith
Transcript
Page 1: Moving beyond the box: automating the digitisation of insect collections

Moving beyond the box: automating the digitisation of insect collections

Page 2: Moving beyond the box: automating the digitisation of insect collections

Blagoderov et al (2012) No specimen left behind: industrial scale digitization of natural history collections. ZooKeys 209: 131–146, doi: 10.3897/zookeys.209.3178

Page 3: Moving beyond the box: automating the digitisation of insect collections

Drawer level imaging is (mostly) a solved problem

1. Place drawer

2. Scan

3. Stitch

Page 4: Moving beyond the box: automating the digitisation of insect collections

Drawer level imaging is (mostly) a solved problem

Result: High resolution composite image

• Fast (5 mins per drawer)

• High resolution (circa 500 MB per image)

• No specimen handling
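To put the two figures on this slide in context, here is a back-of-the-envelope sketch of what 5 minutes and roughly 500 MB per drawer could mean over a working day. The function name and the 7-hour imaging day are assumptions made for illustration only; the slides give no daily figures.

```python
def drawer_throughput(minutes_per_drawer=5, mb_per_image=500, hours_per_day=7):
    """Estimate drawers imaged and storage used by one scanner in one day.

    minutes_per_drawer and mb_per_image come from the slide;
    hours_per_day is an assumed value, not a reported one.
    """
    drawers = int(hours_per_day * 60 // minutes_per_drawer)
    storage_gb = drawers * mb_per_image / 1000  # decimal gigabytes
    return drawers, storage_gb


drawers, storage_gb = drawer_throughput()
print(f"{drawers} drawers/day, about {storage_gb:.0f} GB/day")
```

Under those assumptions a single scanner would produce 84 drawer images and about 42 GB of data per day, which is why storage planning matters as much as scan speed.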

Page 5: Moving beyond the box: automating the digitisation of insect collections

But, two key problems remain…

• Synchronisation: keeping the physical & digital copies in sync

• Label data: capturing data from multiple pinned labels

Page 6: Moving beyond the box: automating the digitisation of insect collections

Approaches to the synchronisation problem

• Don’t worry about it, re-image as required

• Lock down the drawers

• Crop out each specimen image

– Automate the cropping process

– Link each specimen to its digital image

– Make it easy to collect label data

The only practical solution, but a new rate-limiting step

Page 7: Moving beyond the box: automating the digitisation of insect collections

Initial supporting software

Annotation software, NHM working with SmartDrive:

• No automated cropping

• Manually link images & specimens

• Poor UX/UI

• Closed source and proprietary

• Not cross-platform (Windows only)

A good first step to understanding the problems

Page 8: Moving beyond the box: automating the digitisation of insect collections

Automating specimen segmentation

Starting image

Auto-segment

Mark errors

Correct

Work with Pieter Holtzhausen and Stéfan van der Walt (Stellenbosch University)

Software: Inselect, written in Python

Page 9: Moving beyond the box: automating the digitisation of insect collections

Segmentation methods

Original

Primary segmentation (contrast based)

Secondary (seed growing)
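The idea behind a contrast-based primary segmentation can be pictured with a minimal sketch: threshold the image so dark specimens separate from the light drawer background, then group connected dark pixels and report a bounding box for each group. This is not Inselect's actual algorithm. The function name, the fixed threshold and the toy image are all assumptions, and a real tool would more likely rely on OpenCV or scikit-image than on pure Python.

```python
from collections import deque


def segment_by_contrast(image, threshold):
    """Return bounding boxes (top, left, bottom, right) of dark regions.

    image: list of rows of grey values (0 = black, 255 = white).
    A pixel counts as foreground when it is darker than `threshold`.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] >= threshold:
                continue
            # Flood the 4-connected dark component that starts at (r, c).
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] < threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy 5x8 "drawer": two dark specimens on a light background.
W, D = 230, 40
drawer = [
    [W, W, W, W, W, W, W, W],
    [W, D, D, W, W, W, D, W],
    [W, D, D, W, W, W, D, W],
    [W, W, W, W, W, W, D, W],
    [W, W, W, W, W, W, W, W],
]
print(segment_by_contrast(drawer, threshold=128))
```

On the toy drawer this yields one box per specimen. Touching specimens would come out as a single box, which is the kind of error a secondary pass is meant to correct.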

Page 10: Moving beyond the box: automating the digitisation of insect collections

Inselect

http://naturalhistorymuseum.github.io/inselect/

• Currently pre-release (alpha)

• Automatically detects specimens

• Creates bounding boxes for cropping and exporting images

• Rapid annotation interface

• Persistent settings & keyboard shortcuts

• Data export in JSON format

• Open source & modular

• Python based (OpenCV, scikit-image libraries)

• Windows, OSX & Linux

Automated recognition, cropping and annotation of specimens
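As an illustration of how bounding boxes, cropping and JSON export fit together, here is a small sketch. The JSON layout, the field names and both function names are hypothetical and are not Inselect's real export format; the image is a plain list of rows so that the example runs without any imaging library.

```python
import json


def crop(image, box):
    """Return the sub-image inside box = (top, left, bottom, right), inclusive."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def export_boxes(boxes, image_name):
    """Serialise bounding boxes to a JSON string (hypothetical schema)."""
    records = [
        {"id": i, "top": t, "left": l, "bottom": b, "right": r,
         "fields": {}}  # empty placeholder for label data entered later
        for i, (t, l, b, r) in enumerate(boxes, start=1)
    ]
    return json.dumps({"image": image_name, "items": records}, indent=2)


# A tiny stand-in for a drawer scan and two specimen boxes.
image = [
    [9, 9, 9, 9],
    [9, 1, 2, 9],
    [9, 3, 4, 9],
]
boxes = [(1, 1, 2, 2), (0, 3, 0, 3)]

print(crop(image, boxes[0]))            # pixels of the first specimen
print(export_boxes(boxes, "drawer.png"))
```

Keeping the box coordinates alongside the source image name is one simple way to link each cropped specimen back to the drawer image it came from.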

Page 11: Moving beyond the box: automating the digitisation of insect collections

Whole Drawer image

Inselect: segmentation of specimen images

Auto-segmented images in sidebar

Page 12: Moving beyond the box: automating the digitisation of insect collections

Easy to spot & correct errors

Secondary re-segmentation (seed growing)
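One way to picture a seed-growing re-segmentation is shown below: when two touching specimens have been merged into one region, a seed is placed on each and every seed grows outwards until the regions meet. This is a simplified sketch and not the method Inselect implements. The function name, the seed positions and the toy image are assumptions, and seeds are assumed to sit on foreground pixels.

```python
from collections import deque


def grow_from_seeds(image, seeds, threshold):
    """Split foreground (pixels darker than `threshold`) between seeds.

    Seeds grow outwards one pixel at a time; each foreground pixel takes
    the label of the seed that reaches it first.
    Returns a grid of labels (0 = background, 1..n = seed number).
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    queue = deque()
    for index, (r, c) in enumerate(seeds, start=1):
        labels[r][c] = index
        queue.append((r, c))
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < rows and 0 <= nx < cols
                    and labels[ny][nx] == 0
                    and image[ny][nx] < threshold):
                labels[ny][nx] = labels[y][x]
                queue.append((ny, nx))
    return labels


# Two specimens joined by a thin bridge along the top row.
W, D = 230, 40
touching = [
    [W, W, W, W, W, W, W],
    [W, D, D, D, D, D, W],
    [W, D, D, W, D, D, W],
    [W, W, W, W, W, W, W],
]
for row in grow_from_seeds(touching, seeds=[(2, 1), (2, 5)], threshold=128):
    print(row)
```

Each seed claims its own half of the merged region, so the single oversized box can be replaced by one box per specimen.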

Page 13: Moving beyond the box: automating the digitisation of insect collections

Easy to spot & correct errors

Secondary re-segmentation (seed growing)

Page 14: Moving beyond the box: automating the digitisation of insect collections

Works on slides & pinned insects

Also testing mineral & fossil samples

Page 15: Moving beyond the box: automating the digitisation of insect collections

Planned UX / UI Enhancements

Unit tray recognition

Multi-specimen annotation

Plug in controlled vocabulary services

Page 16: Moving beyond the box: automating the digitisation of insect collections

1D & 2D barcode recognition

• Recognition and reading of 1D & 2D matrix barcodes from images

• Different physical requirements (smallest 6x6mm matrix, readable via handheld scanners)

• Testing open source & commercial libraries at different scan resolutions

Initial results

• Commercial solutions outperform open source

• Max. read success 94%

• Idiosyncratic results (different results on different OS)

• Testing continues…
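Scan resolution matters for barcode reading because it sets how many pixels cover each cell (module) of a small matrix code. The arithmetic can be sketched as follows. The 6 mm symbol size comes from the slide, while the 12x12 module count and the dpi values are assumptions chosen for illustration; the slides do not state the symbol layout or the resolutions tested.

```python
MM_PER_INCH = 25.4


def pixels_per_module(symbol_mm, modules, dpi):
    """Pixels available per cell of a square matrix barcode in a scan.

    symbol_mm: printed side length of the barcode in millimetres
    modules:   number of cells along one side (assumed, not from the slides)
    dpi:       scan resolution in dots per inch
    """
    return symbol_mm / MM_PER_INCH * dpi / modules


for dpi in (300, 600, 1200):
    print(dpi, "dpi:", round(pixels_per_module(6, 12, dpi), 1), "px per module")
```

With these assumed values a 6 mm code gets roughly 6 pixels per module at 300 dpi and about four times that at 1200 dpi. The fewer pixels each module receives, the harder decoding generally becomes, which is one reason to test libraries at several resolutions.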

Page 17: Moving beyond the box: automating the digitisation of insect collections

The Holy Grail: label imaging & text recognition

Chauliodes pectinicornis

Views: Dorsal, Caudal, Frontal

Reconstructed labels

Page 18: Moving beyond the box: automating the digitisation of insect collections

The Holy Grail: label imaging & text recognition

Agulla astuta

Views: Dorsal, Caudal, Lateral

Reconstructed labels

Page 19: Moving beyond the box: automating the digitisation of insect collections

Approaches to label imaging pinned specimens

Could be incorporated as part of the barcode dispensing process

1. Barcode dispenser & scanner (two-sided barcode labels)

2. Freshly pinned barcode label

3. Other collection labels

4. Multiple label imaging cameras

5. Assemble labels from composite images

Page 20: Moving beyond the box: automating the digitisation of insect collections

Acknowledgements

Segmentation algorithm & app. development: Stéfan van der Walt and Pieter Holtzhausen

Application development: Alice Heaton

Barcode recognition & testing: Lawrence Hudson

Analysis & testing: Laurence Livermore, Vladimir Blagoderov and Ben Price

Initial specification and funding: Vince Smith