caDSR Freestyle Search June 11, 2009

Clyde Wright

caDSR Freestyle Search

June 11, 2009

caDSR Freestyle Search

• Overview
• Architecture
• Implementation
• Dependencies
• Futures

caDSR Freestyle Search - Overview

• Provides a “Google”-like search across caDSR
• Case insensitive
• Results limited to only the highest-ranked matches; does *not* normally return all matches
• Match weight is a result of term sequence, intervening terms, number of occurrences, Workflow Status, Registration Status, Administered Item type, etc.
• Results sorted descending by weight, i.e. the heaviest match appears at the top of the list
• Does not require the user to know caDSR structure or objects/attributes to perform searches
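
The ranking behaviour described above (sort descending by weight, then keep only the heaviest matches) can be sketched in Java. This is a minimal illustration, not the Freestyle API: the class `RankedResults`, the type `Match` and the `limit` parameter are all hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration only; not part of the Freestyle API. */
final class RankedResults {

    /** A match and its already-computed weight (both names are assumptions). */
    static final class Match {
        final String name;
        final int weight;

        Match(String name, int weight) {
            this.name = name;
            this.weight = weight;
        }
    }

    /** Returns at most {@code limit} matches, heaviest first; the input list is not modified. */
    static List<Match> top(List<Match> matches, int limit) {
        List<Match> sorted = new ArrayList<Match>(matches);
        Collections.sort(sorted, new Comparator<Match>() {
            public int compare(Match a, Match b) {
                // Reverse the operands to sort descending by weight.
                return Integer.compare(b.weight, a.weight);
            }
        });
        // Clamp the limit so a negative or oversized value cannot throw.
        int end = Math.min(Math.max(limit, 0), sorted.size());
        return new ArrayList<Match>(sorted.subList(0, end));
    }
}
```

Calling `RankedResults.top(matches, 2)` on matches weighted 10, 30 and 20 would return the 30 and 20 entries, in that order.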

caDSR Freestyle Search - Overview

• Stakeholders:
    – Form Designers
    – Modelers
    – Developers
    – Analysts
    – Clinicians
    – Statisticians
    – Researchers
    – Curators
    – caBIG
    – NCI

caDSR Freestyle Search - Architecture

• Technologies
    – Java 1.5
    – JavaScript
    – HTML 4
    – JDBC
    – Struts
    – EVS 4.2

caDSR Freestyle Search - Architecture

• View: Struts / JSP / HTML
• Controller: JBoss
• Application: Java 1.5
• Model: Class, Interface
• Database: JDBC, PL/SQL, ANSI SQL
• Persist: Oracle 10g

caDSR Freestyle Search - Architecture

• Auto-deploy
    – Deployable via Anthill
    – ant -DPROP.FILE=… build-all deploy
• SCM
    – CVS
    – .cvsignore for all transient files
    – One file, no duplicates, e.g. template.web.xml vs. web.xml
• All files placed in deployment-artifacts
• Production deployment artifacts
    – Accessible via links in email from Anthill
    – Files hosted on GForge for distribution
    – URL references to GForge hosting for Wiki, Download, etc.

caDSR Freestyle Search - Architecture

• JBoss/freestyle.war
    – Web Browser UI
    – Passes input to JAR and formats result in HTML, pure UI layer
• GForge/freestylesearch.jar
    – API interface for searches, options, etc.
• Bin/autorun.sh
    – Deploys to /local/content/freestyle/bin/.
    – Automated job to update search indices
    – Scheduled and launched by CRON every morning at 3:00 am and every hour between 8:00 am and 5:00 pm

caDSR Freestyle Search - Architecture

Tool Options Table

• tool name: FREESTYLE, SENTINEL, …
• property: URL, EMAIL, …
• value: http://freestyle.., [email protected], …
• …: …

• Tool options table hosts configuration values beyond 3rd party requirements, e.g. XML
• Dynamic
    – Values are read as needed, so the user sees changes in real time
    – Values cached when a new session is created, so the user must close the window to pick up changes
    – Values never cached with the application, since that would require a restart of JBoss
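
The first two read policies above can be sketched as follows. This is only an illustration under stated assumptions: `OptionSource` is a hypothetical stand-in for a query against the tool options table, and `SessionOptions` caches a value on its first read within a session, which approximates "cached when a new session is created". None of these names come from the Freestyle code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of two read policies for tool option values. */
final class ToolOptionPolicies {

    /** Stand-in for the tool options table; a real source would query the database. */
    interface OptionSource {
        String lookup(String toolName, String property);
    }

    /** Read as needed: every call goes to the source, so changes show immediately. */
    static String readLive(OptionSource source, String toolName, String property) {
        return source.lookup(toolName, property);
    }

    /** Per-session view: a value is fixed after its first read until a new session starts. */
    static final class SessionOptions {
        private final OptionSource source;
        private final String toolName;
        private final Map<String, String> cache = new HashMap<String, String>();

        SessionOptions(OptionSource source, String toolName) {
            this.source = source;
            this.toolName = toolName;
        }

        String get(String property) {
            if (!cache.containsKey(property)) {
                cache.put(property, source.lookup(toolName, property));
            }
            return cache.get(property);
        }
    }
}
```

Creating a new `SessionOptions` corresponds to the user closing the window and starting over: only then does a changed value become visible through the cached path.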

caDSR Freestyle Search - Architecture

• SQL script updates/sets tool option values
    – Updates limited to FREESTYLE tool name
• SQL may check database schema during deployment
    – E.g. when a new column is added to a table/view, a SELECT using the column name will throw an error if the database is not updated before deploying the tool
• SQL may *never* alter schema
• SQL may perform data migration
• Must be coordinated and negotiated with caDSR database deployment scripts
• Index updates write current timestamp on completion
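
The schema check above is described as a SQL script. The same idea can be expressed through JDBC, as in this minimal sketch: it probes for a column with a SELECT that returns no rows and lets the driver raise an error if the column is missing. The class and method names are hypothetical, and the table and column arguments are assumed to be trusted constants supplied by the deployment, never user input.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical sketch of a read-only schema check; it never alters the schema. */
final class SchemaCheck {

    /** Builds a probe that names the column but can return no rows. */
    static String probeSql(String table, String column) {
        // Identifiers are concatenated, so callers must pass trusted constants only.
        return "SELECT " + column + " FROM " + table + " WHERE 1 = 0";
    }

    /** Throws SQLException if the table or column does not exist. */
    static void requireColumn(Connection conn, String table, String column)
            throws SQLException {
        Statement stmt = conn.createStatement();
        try {
            stmt.executeQuery(probeSql(table, column)).close();
        } finally {
            stmt.close();
        }
    }
}
```

Failing at deployment time with a clear SQL error is usually preferable to discovering the missing column later, on the first search that touches it.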

caDSR Freestyle Search - Implementation

• Project Structure
    – Conf: Configuration files, e.g. XML, which require value substitution during build and deployment
    – Db-sql: Scripts to correct errors in index tables
    – Doc: Patterned after the phases of the development lifecycle, with the addition of “Administration” for all documentation specific to NCI policies and processes and not directly pertinent to the product features
        • Administration
        • Construction
        • Elaboration
        • Inception
        • Transition
    – Lib: JAR files needed for building the project *but* not included in the deployment. E.g. ojdbc14.jar is deployed on JBoss and not packaged in the project WAR, but must be present to compile and build the WAR. This allows the build machine to be separate from the deployment target machine.

caDSR Freestyle Search - Implementation

• Project Structure
    – Scripts: Console scripts to update index tables
    – Src: Java source, more details follow
    – WebRoot: The deployed freestyle.war content
        • Css
        • Html
        • Images
        • Js
        • Jsp
        • Meta-inf
        • Web-inf
            – Lib
            – Tld

caDSR Freestyle Search - Implementation

• Packages
    – gov.nih.nci.cadsr.freestylesearch.test: Automated tests
    – gov.nih.nci.cadsr.freestylesearch.tool: Main business logic
    – gov.nih.nci.cadsr.freestylesearch.ui: Web Browser UI using Struts
    – gov.nih.nci.cadsr.freestylesearch.utl: Utility features, e.g. Search, results object types, etc.
• Search entry: utl/Search.java
• Index table
    – Update entry point: utl/Seed.java
    – Configuration: conf/template.seed.xml

caDSR Freestyle Search - Implementation

• Logging
    – freestylesearch_log.txt: JBoss messages from gov.nih.nci.cadsr.freestylesearch.*
    – Server.log: JBoss messages from 3rd party packages, e.g. Struts
    – Seed_log.txt: Messages from the update to the index tables

caDSR Freestyle Search - Dependencies

• caDSR API
    – The search results are returned in Freestyle-defined class objects or in AdministeredItem-derived class objects, depending on the search method used.
• Oracle 10g
    – The weight algorithm relies on calculations performed in SQL. This is necessary to avoid sending large amounts of data to the web server for weight calculations.

caDSR Freestyle Search - Futures

• Upgrade the caDSR API as needed
• Research use of Lucene
• Add “sounds like” matching
• Add singular/plural matching
• Add wildcard support
• Add Concept matching
• Add selection of indirect Admin Item type, e.g. return all DE where DEC is …
• Improve performance (possibly define database indexes on index table columns)
• Add weight calculation customizations, e.g. matches in long_name should be 2x all other columns
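
The last item, per-column weight customization, amounts to simple arithmetic, sketched here in Java purely for illustration. The slides state that the actual weight calculation is performed in SQL, so this is not how the tool would implement it. `ColumnWeights` and the column name `definition` are hypothetical; only `long_name` and the 2x factor come from the slide.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of per-column weight multipliers. */
final class ColumnWeights {
    private final Map<String, Integer> multipliers = new HashMap<String, Integer>();
    private final int defaultMultiplier;

    ColumnWeights(int defaultMultiplier) {
        this.defaultMultiplier = defaultMultiplier;
    }

    /** Overrides the multiplier for one column; returns this for chaining. */
    ColumnWeights set(String column, int multiplier) {
        multipliers.put(column, Integer.valueOf(multiplier));
        return this;
    }

    /** Sums the occurrence counts per column, each scaled by that column's multiplier. */
    int score(Map<String, Integer> occurrencesByColumn) {
        int total = 0;
        for (Map.Entry<String, Integer> e : occurrencesByColumn.entrySet()) {
            Integer m = multipliers.get(e.getKey());
            total += e.getValue() * (m != null ? m : defaultMultiplier);
        }
        return total;
    }
}
```

With `new ColumnWeights(1).set("long_name", 2)`, three matches in long_name and four in another column score 3 × 2 + 4 × 1 = 10.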