© 2015 IBM Corporation
Session 3723 What’s behind a high quality web API? Ensure your APIs are more than just a pretty façade Kim Clark ([email protected]) Brian Petrini ([email protected])
Agenda
• Introduction
• The public face of an API
• What’s the substance beneath a good web API
• Summary
Evolving exposure of business function
Service Exposure (enterprise)
Low Level APIs (platform/package)
Application Integration (application)
Service/API Exposure (external known consumers)
External API Exposure (public)
Future?
Differentiating between web APIs, SOA, and integration http://www.slideshare.net/kimjclark/inter-connect2015-3723whatsbehindahighqualitywebapiv110 Related article on developerWorks http://www.ibm.com/developerworks/websphere/library/techarticles/1503_clark/1305_clark.html
Is there a market for your data? Example: Apps relating to “london transport”
Enterprise Boundary
DMZ
Differences between the internal and external service consumer
Service Exposure (external)
Service Exposure (internal)
Operational Systems (Applications & Data)
Service Exposure (enterprise)
Service/API Exposure (external)
Consumers (internal)
There may be only a handful of well understood internal consumer applications
There may be 10s of external consumer applications
…and there could be hundreds of “experimenters”
Consumer base for APIs
• Internal
  • Private
    – An evolution of SOA
    – Hard to enforce/fund decoupling with heavyweight SOA. Owners of systems often known.
    – APIs enable more light touch decoupling. Lowers barrier to entry.
    – Borrows technology created through funded external API initiatives
• External
  • Partners/channels
    – Growing area, but arguably one of the most important/powerful
    – Channel specific becoming more common. “Intent based design”.
    – Close collaboration between API provider and partners
  • Public
    – Highly visible examples, but a subset of the market
    – One size fits all APIs. Fully decoupled.
    – Lives or dies on the strength of its ease of use and its community
Firewall
Firewall
API Gateway
API exposure in an ideal world
Consider:
• Availability
• Performance
• Transactionality
• Error Handling
• Granularity
• Security models
• …and more
API API API
New System
Existing System
Existing System
Existing systems hardly ever provide interface characteristics suitable for direct consumption as an external API
…or maybe not
Resolving differences in interface characteristics
Interface characteristics (shown for both requester and provider): Integrity, Security, Reliability, Error handling, Data, Technical interface, Interaction type, Performance
Integration is about handling the differences between requester and provider
Integration Patterns
Capturing integration complexity using interface characteristics http://www.ibm.com/developerworks/websphere/techjournal/1112_clark/1112_clark.html
Agenda
• Introduction
• The public face of an API
• What’s the substance beneath a good web API
• Practical implementation – products, standards, features
Most web APIs are at this level
…today
The basics of REST in web APIs
Richardson Maturity Model
Level 0 – “The swamp of POX”
Plain Old XML payloads, not making use of HTTP’s strengths. Includes SOAP/HTTP
Level 1 – Resources as URIs
www.example.com/customers?surname=clark
www.example.com/customers/12345
Level 2 – HTTP Verbs
POST www.example.com/customers (to create)
GET www.example.com/customers/12345 (to read)
PUT www.example.com/customers/12345 (to update)
DELETE www.example.com/customers/12345 (to delete)
Expectation is correct use of HTTP headers and error codes too
Level 3 – Hypermedia Controls
Links in the response to provide behavioral information to the caller.
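To make Levels 2 and 3 concrete, here is a minimal sketch in Python of a handler for the /customers resource. It is not taken from the session: the in-memory store, the handle and represent functions, and the "orders" link are hypothetical, and the "_links" layout only loosely follows the HAL style listed in the literature slide.

```python
import itertools
from http import HTTPStatus

# Hypothetical in-memory store standing in for a real back end.
customers = {}
_ids = itertools.count(12345)


def represent(cid):
    """Level 3: add links that tell the caller what it can do next."""
    body = dict(customers[cid])
    body["_links"] = {
        "self": {"href": f"/customers/{cid}"},
        "orders": {"href": f"/customers/{cid}/orders"},  # illustrative link only
    }
    return body


def handle(method, path, body=None):
    """Level 2: map HTTP verbs to actions and answer with proper status codes."""
    parts = [p for p in path.split("/") if p]
    if parts[:1] != ["customers"] or len(parts) > 2:
        return HTTPStatus.NOT_FOUND, {}, None

    if len(parts) == 1:  # the collection: /customers
        if method == "POST":
            cid = str(next(_ids))
            customers[cid] = dict(body or {})
            return HTTPStatus.CREATED, {"Location": f"/customers/{cid}"}, represent(cid)
        if method == "GET":
            return HTTPStatus.OK, {}, [represent(cid) for cid in customers]
        return HTTPStatus.METHOD_NOT_ALLOWED, {"Allow": "GET, POST"}, None

    cid = parts[1]  # a single resource: /customers/{id}
    if cid not in customers:
        return HTTPStatus.NOT_FOUND, {}, None
    if method == "GET":
        return HTTPStatus.OK, {}, represent(cid)
    if method == "PUT":
        customers[cid] = dict(body or {})
        return HTTPStatus.OK, {}, represent(cid)
    if method == "DELETE":
        del customers[cid]
        return HTTPStatus.NO_CONTENT, {}, None
    return HTTPStatus.METHOD_NOT_ALLOWED, {"Allow": "GET, PUT, DELETE"}, None
```

A caller that only knows the collection URI can POST to it, follow the returned Location header, and then follow the links in the response rather than constructing URIs itself.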
A few examples of API related literature!
10 reasons developers hate your API http://www.slideshare.net/jmusser/ten-reasons-developershateyourapi
Martin Fowler: Richardson Maturity Model http://martinfowler.com/articles/richardsonMaturityModel.html
Martin Fowler: Blogs relating to API Design http://martinfowler.com/tags/API%20design.html
Gov.uk: APIs - Using and creating Application Programming Interfaces https://www.gov.uk/service-manual/making-software/apis.html
White House Web API Standards https://github.com/WhiteHouse/api-standards
Designing HTTP interfaces and restful web services http://munich2012.drupal.org/program/sessions/designing-http-interfaces-and-restful-web-services.html
Stop Designing Fragile Web APIs http://mathieu.fenniak.net/stop-designing-fragile-web-apis
The Web API Checklist — 43 Things To Think About When Designing, Testing, and Releasing your API http://mathieu.fenniak.net/the-api-checklist
Ain’t Nobody Got Time For That: API Versioning https://mathieu.fenniak.net/aint-nobody-got-time-for-that-api-versioning
IBM IMPACT 2014: API Design Best Practices https://www-950.ibm.com/events/wwe/impact/impact2014cms.nsf/download/k727d15d563a863a8145f0dfbce7/$FILE/Impact2014_1496.pdf
API Craft - community of API practitioners https://groups.google.com/forum/#!forum/api-craft
API Academy: API Design http://www.apiacademy.co/lessons/api-design/api-design-basics
The RESTful CookBook http://restcookbook.com
The Amiable API http://theamiableapi.com
Toptal: 5 Golden Rules for Great Web API Design http://www.toptal.com/api-developers/5-golden-rules-for-designing-a-great-web-api
Github API standards https://github.com/18f/api-standards
Best practices for a pragmatic restful API http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api
Codeplanet restful API design principles http://codeplanet.io/principles-good-restful-api-design
ZDNet: 5 ways to make your APIs more attractive to developers http://www.zdnet.com/five-ways-to-make-your-apis-more-attractive-to-developers-7000034557/
Tech Crunch rules for API management http://techcrunch.com/2012/11/11/5-rules-for-api-management
Effective service and API management http://www.mulesoft.com/lp/ebook/api/effective-service-api-management
Crafting Interfaces that Developers Love https://pages.apigee.com/web-api-design-website-h-ebook-registration.html
HAL Specification http://stateless.co/hal_specification.html
A lot has been written about web API exposure. This is a random selection.
What goes on beneath the surface to satisfy the web APIs requirements? Well that’s just magic…isn’t it?
Good API practices and their effects on implementation
Important practices around documentation, marketing, and support relate primarily to how the API is exposed. But other issues also have a bearing on the quality of the API:
• Robustness
• Consistency
• Security models
• Performance requirements
• Granularity
• Versioning
• Dependencies
• Concurrent access
• and many more…
Excellent summary of common concerns: 10 Reasons Developers Hate Your API (and what to do about it) John Musser: Founder of ProgrammableWeb http://www.slideshare.net/jmusser/ten-reasons-developershateyourapi
Agenda
• Introduction
• The public face of an API
• What’s the substance beneath a good web API
  1. Architectural end to end view
  2. Performance
  3. Data consistency
  4. Granularity
  5. Security
• Summary
API implementation in the real world
[Figure: five topologies, each starting API Client, then API Gateway, then back end]
• The challenge: existing interfaces on the Existing Systems cannot satisfy the API requirement
• The world of infinite time and money: Refactored Systems behind the API Gateway
• “Integration is a black box”: an Integration Hub between the API Gateway and the Existing Systems
• A compromise for “stale reads”: a Replicated Data store
• The reality! A mix of Existing Systems, Refactored System, Integration Hub and Replicated Data store, with a Cache at several layers
Service Exposure
Enterprise Boundary
DMZ
Additional requirements for external exposure - Introducing “API management”
Service Exposure – extended for external
Traffic Management
Security
Virtualisation
Visibility
Service Exposure (external)
Service Exposure (internal)
Operational Systems
Service Exposure (enterprise)
Service/API Exposure (external)
Partner Management
Accounting
Self administration
“Internet” auth
Threat management
Firewall
Firewall
API Gateway
From where are we exposing the web API?
A. Re-exposing an existing enterprise service
B. Exposing a new integration mediated through the hub
C. Re-exposing an interface already offered by a provider system.
[Figure: three exposure paths, with API Exposure sitting in the DMZ]
A: Provider, Adapter, Mediation (Integration Hub), Service Exposure (Service Gateway), API Exposure
B: Provider, Adapter, Mediation (Integration Hub), API Exposure
C: Provider, API Exposure
Using APIs internally
• Is the external network hop a latency issue? A security issue? How do we get at the API more directly?
• Do we need to bring the API gateway inside? How would we do that if the API gateway is cloud based?
• Do I need a different SLA, or a security model that we wouldn’t provide externally?
• Do I want more agile change on the API than we offer external consumers?
• Does my data need to be isolated strongly from external usage?
Service/API Gateway
Internal SOAP Exposure
Secure Gateway DMZ
Firewall
Firewall
Internal API Exposure
External API Exposure
Internal Consumer
External Consumer
If my external API is so good, why wouldn’t I want to use it internally too?
Internal API requirements are often very different from external ones. In simple circumstances they can share a gateway using separate domains, but more complex requirements will often lead to internal and external gateways.
Performance: Types of cache and grid
Application Side-cache
In-line cache
Data store
Client
Application Basic external
Data store
Client
In memory Data Grid
Compute Grid
• In-line cache looks like a data store to the application.
• Side-cache requires application logic to check the cache before going to the data store.
• In memory cache provides fast native access, but is limited by the memory of the application.
• Basic external cache separates the cache from local memory, enabling larger data volumes and more flexibility in the cache topology at the cost of a network hop.
• Data grid partitions the data between grid containers to handle very large data volumes.
• Compute grid is more than a cache. It enables compute functions to be pushed out and performed close to the associated data, providing massive parallelisation efficiencies.
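The difference between a side-cache and an in-line cache is easiest to see in code. The following is a minimal Python sketch, not something from the session: DataStore, side_cache_read and InlineCache are hypothetical names, and a plain dict stands in for the cache.

```python
class DataStore:
    """Hypothetical slow data store that counts how often it is read."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows[key]


def side_cache_read(key, cache, store):
    """Side-cache: the application itself checks the cache, then the store."""
    if key in cache:
        return cache[key]
    value = store.get(key)
    cache[key] = value  # populate the cache for the next caller
    return value


class InlineCache:
    """In-line cache: exposes the same get() as the store, hiding the caching."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self._store.get(key)
        return self._cache[key]
```

With the side-cache the caching logic lives in application code; with the in-line cache the application simply swaps the store for the cache object. Neither sketch handles invalidation, which is where most of the difficulty lies.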
WebSphere eXtreme Scale v8.6 Key Concepts and Usage Scenarios http://www.redbooks.ibm.com/abstracts/sg247683.html?Open
Performance: Caching
In which layer should you cache?
• Device/browser
  – Pros: Reduces load on all other layers. App can potentially work offline. Makes app extremely responsive.
  – Cons: Can’t share cache across users. Cache invalidation can be very challenging. Do you own the device app or have any control over its design?
• CDN Server
  – Pros: Reduces load on API Gateway and all layers below. Closest geographical point-of-presence. Uses existing internet capability (via HTTP headers).
  – Cons: Read cache only. Should you terminate HTTPS at the CDN? Is asynchronous cache purge sufficient? What cache visibility do you have? Will you get re-use across regions? How will you test its effectiveness?
• API Gateway
  – Pros: Reduces load on layers within enterprise. API specific caching independent of application. Cache consistent with API granularity.
  – Cons: Must terminate HTTPS for full benefit. Read cache primarily. How is cache invalidation performed?
• Integration Hub / Mobile App Server
  – Pros: Reduces load on layers from application down. Enables state free scalability for reference data. Writable cache options (with caution). Compositions can benefit from fine grained caching.
  – Cons: Writable cache patterns can interfere with application design. Cache invalidation may require application knowledge.
• Application Server
  – Pros: Reduces load on database. Writable cache options with deep locking possibilities. Cache with understanding of the application. Application native data model can be used. Data relationships within cache are acceptable. Easiest point for accurate cache invalidation. Further scale with grid compute. Preload closer to data store data model.
  – Cons: Adds complexity to application build. Data model often different to API, so translation at other layers. Can you change the application, or is it fixed? What’s the application code change cycle?
• Data store
  – Pros: No amount of caching at other levels is a substitute for a well designed, organised and tuned database. Modern databases (e.g. NoSQL, IMDB) need attention too.
  – Cons: No reduction in load on application or layers above. Database is the furthest distance from the client. Do you have access to adjust the database? Can you be sure you won’t destabilise the application?
What breaks a simplistic caching strategy?
• Granularity
  – Pagination
  – Resource expansion, redaction
• Parameterisation
  – Search/query
  – Sorting
• Security
  – Encryption
  – Data Privacy
• Performance
  – Cache layer too high for re-use
  – Cache layer too low for latency
• Data consistency
  – Locking strategies
  – Purge/invalidation inconsistencies
Notice that the themes in this presentation are interrelated. Fixing one often breaks another.
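As an illustration of why parameterisation, pagination and security complicate caching, here is a hedged Python sketch of a cache key that takes them into account. The cache_key function and the user_scope parameter are assumptions made for this example, not part of any product mentioned in the session.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(method, url, user_scope=None):
    """Build a cache key from method, path, normalised query and user scope.

    Sorting the query parameters lets logically identical requests share an
    entry; including the (hypothetical) user scope stops one user's private
    response being served to another.
    """
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    raw = "|".join([method.upper(), parts.path, query, user_scope or "public"])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Every extra dimension in the key (page, sort order, expansion, user) splits the cache into more, smaller entries, which lowers the hit rate and makes purging harder.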
API Gateway
Data consistency: The intermittent internet - a significant complication for web APIs
You are not guaranteed to receive a response from an HTTP request over the internet. If a transient error is received, and the request was performing a state change (create, update, delete) to the data, then the state is unknown (“in-doubt”).
Caller’s responsibility
• Establish whether the request succeeded
• Re-try if necessary
• Notify the user
API’s responsibility
• Ensure that status can be tested easily
• Enable re-tries
• Reduce the likelihood of a lost response
[Figure: communication links from the Client through the API Gateway and Middleware]
• HTTP via internet (unreliable)
• HTTP over intranet or dedicated link (more reliable)
• JDBC or similar over intranet (reliable and transactional)
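One common way for an API to “enable re-tries” and make status easy to test is to accept a client-generated request key and de-duplicate on it. The slides do not prescribe this technique; the Python sketch below is an assumption-laden illustration, and OrderApi, create and status are hypothetical names.

```python
import uuid


class OrderApi:
    """Hypothetical API back end that de-duplicates creates by request key."""

    def __init__(self):
        self.orders = {}  # order id -> order data
        self._seen = {}   # client-supplied request key -> order id

    def create(self, request_key, order):
        """Create the order once; a retry with the same key returns the same id."""
        if request_key in self._seen:
            return self._seen[request_key], False  # already created earlier
        order_id = str(uuid.uuid4())
        self.orders[order_id] = dict(order)
        self._seen[request_key] = order_id
        return order_id, True

    def status(self, request_key):
        """Let a caller test whether an in-doubt request actually completed."""
        return self._seen.get(request_key)
```

If the response to the first create is lost, the client can either call status with its key or simply resend the create without risking a duplicate order.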
Weren’t communication errors an issue for SOA too? Yes, but…
What might be meant by saying an interaction is “transactional”?
Internally transactional
• Any internal actions are tied together in one transaction.
• Either all or none of the internal actions complete.
• If a communication error occurs during the request, the requester doesn’t know whether the transaction completed or not.
• Typical of raw HTTP based interactions such as “RESTful” web APIs. Especially relevant when communicating over the internet.
Called transactionally
• Is “internally transactional” as noted above
• Transactional boundary extends to the requester.
• The requester always knows if the transaction completed.
• Typical of transactional protocols such as JDBC.
• Rarely available over the internet. More typically scoped to “within enterprise” communication.
Participate in a distributed transaction
• Is “internally transactional” as noted above
• Can be “called transactionally” as noted above
• The transaction can be combined with other transactions on separate systems into a distributed transaction.
• Typically implemented using “two phase commit”, requiring “XA” capability in client and server, and a transaction co-ordinator.
• Ideally scoped no broader than “within application” due to recoverability complexity and locking risks.
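A small sketch of what “internally transactional” means in practice, using Python’s built-in sqlite3 module. The table layout and the place_order and new_db functions are invented for illustration; the point is only that all the internal writes commit together or not at all.

```python
import sqlite3


def new_db():
    """Create a throwaway in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE order_items ("
        "order_id INTEGER NOT NULL, sku TEXT NOT NULL, "
        "qty INTEGER NOT NULL CHECK (qty > 0))"
    )
    return conn


def place_order(conn, customer_id, items):
    """Insert an order and its items as one unit of work."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        cur = conn.execute("INSERT INTO orders (customer_id) VALUES (?)", (customer_id,))
        order_id = cur.lastrowid
        for sku, qty in items:
            conn.execute(
                "INSERT INTO order_items (order_id, sku, qty) VALUES (?, ?, ?)",
                (order_id, sku, qty),
            )
    return order_id
```

Note that this only covers the internal actions. If place_order sat behind a web API and the HTTP response were lost, the requester would still not know whether the commit happened.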
Client side error handling for “Create”
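[Flowchart: only the decision points and outcomes survive extraction]
• Start: send the Create, then check the response type: success (✔), permanent error (✗), or transient error.
• On a transient error: do duplicates matter? If no, retry the Create.
• If duplicates matter: is the Create idempotent? If yes, retry the Create.
• If not idempotent: is a Read available? If no, the outcome remains in doubt (?).
• If a Read is available: is there an assured Create completion time (no, yes, or deferred)? Then retry the Read. If the resource exists, the Create succeeded (✔). If it does not exist, send the Create again.
• Continued transient errors on a retry leave the outcome in doubt (?). A permanent error on a retry ends the flow (✗).
Assumptions:
• Synchronous transport (e.g. HTTP)
• Unreliable medium (e.g. internet)
• Transient error = request successfully sent, but reply not received
Question: Can we trust the client to absorb user retries?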
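The decision points in this flowchart can be sketched as client-side Python. This is one possible reading of the chart, not a definitive implementation: the exception classes, the create and read callables and the max_retries parameter are all hypothetical, and the sketch assumes that a Create which did succeed is visible to a Read by the time the Read is issued.

```python
class TransientError(Exception):
    """Request was sent but no reply was received."""


class PermanentError(Exception):
    """The API rejected the request; retrying will not help."""


def create_with_recovery(create, read, duplicates_matter, idempotent, max_retries=3):
    """Return 'created', 'failed' or 'in-doubt'.

    create: callable that sends the Create request.
    read:   callable returning True if the resource exists, or None if the
            API offers no way to read it back.
    """
    try:
        create()
        return "created"
    except PermanentError:
        return "failed"
    except TransientError:
        pass  # outcome unknown: fall through to the recovery logic

    for _ in range(max_retries):
        try:
            if not duplicates_matter or idempotent:
                create()  # a repeat is harmless, so just retry
                return "created"
            if read is None:
                return "in-doubt"  # cannot verify and cannot safely retry
            if read():
                return "created"  # the original Create did succeed
            create()  # it did not, so it is safe to send it again
            return "created"
        except PermanentError:
            return "failed"
        except TransientError:
            continue
    return "in-doubt"  # continued transient errors
```

An “in-doubt” result is the case that has to be surfaced to the user or handed to an offline process.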
Data Consistency: Consistency vs. data loss/duplication
• Is data consistency equally important across “updates” “inserts” and “deletes” in this business scenario?
• What is the cost of data loss? The business value of the transaction? The damage to reputation? The internal costs of tidying up corrupt data?
• What is the probability of the data loss? If loss only occurs on server outage and you have 99.9% availability…
• Will you discover the loss through another route anyway? For example, the customer will ring up to ask why the goods haven’t arrived. Is that ok, if the probability is sufficiently low?
Designing for perfect error handling and transactionality to ensure zero data loss can be expensive and complex. What is the return on that investment? Could we live with less than perfect transactional boundaries in some cases?
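The cost and probability questions above lend themselves to a back-of-envelope calculation. The Python sketch below uses invented function names and leaves the business figures as inputs; the one fixed fact is that 99.9% availability corresponds to roughly 8.76 hours of downtime in a 365 day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8760


def annual_downtime_hours(availability):
    """Hours per year a service is unavailable at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


def expected_annual_loss_cost(transactions_per_year, loss_probability, cost_per_loss):
    """Expected yearly cost of lost transactions.

    All three inputs are assumptions to be supplied by the business:
    volume, chance that any one transaction is lost, and the cost of a
    single loss (value, reputation, clean-up effort).
    """
    return transactions_per_year * loss_probability * cost_per_loss
```

Comparing the result with the cost of building stronger transactional guarantees gives a rough answer to the return on investment question.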
Granularity examples
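[Figure: ways to expose an Order with 20 Order Items]
• Fine grained: isolated resources (Order and Order Item as separate resources)
• Coarse grained: Order with its Order Items (x 20 items)
• Resource expansion: Order optionally expanded to include its Order Items
• Pagination: Order Items returned in pages (x 10 items, x 10 items)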
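Resource expansion and pagination can be sketched with an order of 20 items. This Python example is illustrative only: the URIs, the expand_items, offset and limit parameters, and the "next" link are assumptions rather than a convention taken from the session.

```python
# Hypothetical data: one order with 20 order items.
ORDER = {"id": "o1", "customer": "12345"}
ITEMS = [{"id": f"i{n}", "order": "o1"} for n in range(1, 21)]


def get_order(expand_items=False):
    """Resource expansion: embed the items only when the caller asks for them."""
    body = dict(ORDER)
    if expand_items:
        body["items"] = list(ITEMS)  # coarse grained: one call, larger payload
    else:
        body["items_href"] = "/orders/o1/items"  # fine grained: follow the link
    return body


def get_items(offset=0, limit=10):
    """Pagination: return one page of items plus a link to the next page."""
    page = ITEMS[offset:offset + limit]
    body = {"items": page, "total": len(ITEMS)}
    next_offset = offset + limit
    if next_offset < len(ITEMS):
        body["next"] = f"/orders/o1/items?offset={next_offset}&limit={limit}"
    return body
```

The same data can therefore be served as one coarse call, as an order plus two pages of ten items, or as isolated item resources, each with a different cost in round trips and payload size.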
Granularity: Where to do composition: Single Datastore
[Figure: the stack Device or browser app, API Gateway, Application Server, Data store, repeated once per aggregation point]
• Aggregation point: Consumer app
  – Strengths: UI responsiveness. Good for cached data.
  – Weaknesses: Client complexity. Sequential latency. Can’t reuse composition. Inefficient on joins.
  – Acceptable for: Reads. Isolated writes (e.g. read, read, write).
• Aggregation point: Gateway
  – Strengths: API simplification. Reusable at API.
  – Weaknesses: Poorer performance than from lower layers. Distributes application logic.
  – Acceptable for: Reads. Isolated writes (e.g. read, read, write).
• Aggregation point: Internal application
  – Strengths: API simplification. Transactional updates. Reusable at all levels.
  – Weaknesses: Requires application code change.
  – Acceptable for: Reads. Combined writes.
• Aggregation point: Within datastore
  – Strengths: API simplification. Indexed joins. Transactional updates. Reusable at all levels.
  – Weaknesses: Requires datastore code change.
  – Acceptable for: Reads. Combined writes.
Granularity: Where to do composition: Multiple Datastores
[Figure: Device or browser app, API Gateway and two App Serv./Data stacks, repeated once per aggregation point]
• Aggregation point: Consumer app
  – Strengths: UI responsiveness. Rapid innovation.
  – Weaknesses: Sequential latency. Authentication.
  – Acceptable for: Reads. Isolated writes (e.g. read, read, write).
• Aggregation point: Gateway
  – Strengths: API simplification. Reusable at API.
  – Weaknesses: Distributes application logic.
  – Acceptable for: Reads. Isolated writes (e.g. read, read, write).
• Aggregation point: Integration Hub (sync)
  – Strengths: API simplification. Re-usable at all levels.
  – Weaknesses: Distributes application logic.
  – Acceptable for: Reads. Isolated writes (e.g. read, read, write). Combined writes (but holding intermediate state is controversial).
• Aggregation point: Integration Hub (async)
  – Strengths: API simplification. Swifter response.
  – Weaknesses: Delayed synchronisation. Offline error handling.
  – Acceptable for: Asynchronous chained writes.
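“Sequential latency” is the cost of calling several back ends one after another. Where the calls are independent, an aggregation point can issue them concurrently. The Python sketch below simulates this with asyncio; call_backend and its delays are stand-ins for real HTTP calls, not measurements.

```python
import asyncio
import time


async def call_backend(name, delay):
    """Stand-in for a call to one back end; the delay simulates latency."""
    await asyncio.sleep(delay)
    return {name: "ok"}


async def aggregate_sequential():
    """Total latency is roughly the sum of the individual calls."""
    customer = await call_backend("customer", 0.05)
    orders = await call_backend("orders", 0.05)
    return {**customer, **orders}


async def aggregate_parallel():
    """Total latency is roughly that of the slowest call."""
    customer, orders = await asyncio.gather(
        call_backend("customer", 0.05),
        call_backend("orders", 0.05),
    )
    return {**customer, **orders}


def timed(coro_fn):
    """Run a coroutine function and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start
```

This only helps when the second call does not need data from the first. A chain such as “look up the customer, then fetch that customer’s orders” stays sequential wherever the aggregation is done, which is an argument for doing it close to the data.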
Just a selection of the different patterns of orchestration
[Figure: orchestration patterns arranged from stateless* engine to stateful* engine]
Pattern labels: Aggregation, Composition, Isolated transaction, Stateful integration, Exceptions only, Integrated workflow, Process
Descriptions:
• Real time retrieval of data from multiple systems
• Real time updates to multiple systems that can be combined into a single update
• Straight through processing across systems that can’t be combined transactionally
• People based exception handling
• Processes integrating people and systems
* This refers to persistence of orchestration state
He knows where you were last summer…
http://vartree.blogspot.co.uk/2014/04/i-know-where-you-were-last-summer.html
Apparently innocent data can quickly become personal!
Identity crisis
API Gateway
Off-line Process
Browser
How does the web page app get past the same origin policy such that it can call the API? JSONP? CORS?
Who’s holding the phone?
How can we limit the access of the third party once the action is complete? Should we revoke, or just audit?
What identity should we use if manual intervention is required following an offline error?
Who’s actually sitting in front of the browser?
How do we avoid storing user/pwd on the device yet still gain access to APIs? OAuth 1.0/2.0, OpenID/OpenID Connect?
How do we enable third parties to perform actions on behalf of someone else?
How do we provide access when the user is no longer present?
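For the same origin policy question, CORS is the option with the most direct server-side footprint: the API answers with headers that tell the browser which origins may call it. The Python sketch below builds those headers. The header names are the standard CORS ones; the allowlist, the cors_headers function and the chosen methods and max age are assumptions for illustration.

```python
# Hypothetical allowlist of web origins permitted to call the API.
ALLOWED_ORIGINS = {"https://app.example.com"}


def cors_headers(origin, method="GET"):
    """Return the CORS response headers for a request from the given origin.

    An empty dict means the browser will block the calling page from
    reading the response.
    """
    if origin not in ALLOWED_ORIGINS:
        return {}
    headers = {
        "Access-Control-Allow-Origin": origin,
        "Vary": "Origin",  # caches must not reuse this response for other origins
    }
    if method == "OPTIONS":  # preflight request sent before non-simple calls
        headers["Access-Control-Allow-Methods"] = "GET, POST, PUT, DELETE"
        headers["Access-Control-Allow-Headers"] = "Authorization, Content-Type"
        headers["Access-Control-Max-Age"] = "600"
    return headers
```

CORS controls which web pages may call the API from a browser. It does not identify the user, so it complements rather than replaces the authentication options listed above.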
Security: Authentication, identity, and authorisation In front and behind the API Gateway
• Can the enterprise systems reach external identity providers?
• How will an identity provided by for example an OpenID provider be reconciled with users in an internal registry such as LDAP?
• Where will users’ roles and groups be stored, and how will they be managed?
[Figure: security technologies in use in front of and behind the API Gateway and the Operational System]
LDAP, RACF, SAML, WS-Security, JAAS, SPNEGO, Kerberos, OpenID, OpenID 2.0, OpenID Connect, OAuth 1.0, OAuth 2.0, HTTP Basic Auth, SCIM
General points relating to API security
• HTTPS is recommended, perhaps mandatory for public facing APIs
  – Depending where you terminate the HTTPS, you may lose some of the benefits of your caching
• Request throttling isn’t just about API accounting.
  – Important tool relating to DoS attacks and general API misuse.
• Payload validation at the gateway is critical
  – Payload threats can leak through into back end systems.
• Regional issues may exist relating to the data exposed
  – What are the legal implications of the availability of data and functions in other countries? Does it depend on who’s accessing it?
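Request throttling is often implemented with a token bucket per consumer. The session does not name an algorithm, so the Python sketch below is simply one common choice; TokenBucket and its rate, capacity and clock parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._tokens = float(capacity)
        self._clock = clock  # injectable so the behaviour can be tested
        self._last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to the time elapsed, never above capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

A gateway would typically keep one bucket per API key and answer with HTTP 429 (Too Many Requests) when allow() returns False.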
API Implementation Related Products
• IBM API Management (on-prem/SaaS)
• DataPower
• Bluemix
• Cache (eXtreme Scale, Bluemix cache, DataPower cache)
• IBM Integration Bus
• WAS Liberty profile
• IBM Business Process Management
Extended reference architecture showing some relevant IBM products
Service Exposure (internal)
Operational Systems (Applications & Data)
Integration
Consumers (internal)
Business Process Orchestration
IBM Business Process Manager
IBM Integration Bus
In-house
Service Exposure (external)
Consumers (external)
IBM API Management
IBM DataPower
(IBM DataPower/API Mgmt?)
WebSphere Service Registry & Repository
Summary
Consumer facing issues are critical to an API’s uptake, but an API’s longevity relies on the underlying implementation being robust, performant, and secure.
Things we didn’t have time to talk about!
• Versioning
• Availability
• Dependencies
• Locking
• Pagination
• Search/query
• More…
Thank You
Notices and Disclaimers Copyright © 2015 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.
Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided.
Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.
Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.
References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business.
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation.
It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law.
Notices and Disclaimers (cont’d)
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right.
• IBM, the IBM logo, ibm.com, Bluemix, Blueworks Live, CICS, Clearcase, DOORS®, Enterprise Document Management System™, Global Business Services ®, Global Technology Services ®, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, SoDA, SPSS, StoredIQ, Tivoli®, Trusteer®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.