Content Management Platforms: the next generation of Enterprise Content Management
The Evolution of ECM: Platform Oriented, Flexible, Architected for the Cloud and Designed for Technologists
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
This document is intended for technology leaders and influencers who are involved in the
selection of solutions for managing enterprise content. The document provides readers a detailed
understanding of how organizational needs for managing enterprise content are evolving, and
why it is critical to implement processes and tools that are sufficiently flexible to support these
rapid changes – now.
After reading this white paper, you should understand:
• What is meant by the terms “enterprise content” and “enterprise content management”.
• How enterprise content and organizational needs for managing that content are evolving.
• Why it is important to have tools based on a strong technology platform that support
information and processes of increasingly diverse types, complexity and size.
• Technologies and standards for supporting enterprise content management (ECM) in a process-
centric manner.
• How to create the business case for adopting a platform for building content-driven solutions.
Target Audience
The ideal reader is in a software, solution or enterprise architecture role and makes or influences
decisions about development frameworks, enterprise content management systems or other
content-centric platforms. The reader should have a general understanding of enterprise content
management concepts, but is not expected to have a detailed understanding of any specific enterprise content management process, application, framework or platform.
that help make access, delivery and management of information more controlled, efficient and
less costly. However, this list of features is evolving as ECM evolves – constantly adding new
requirements and growing more demanding, with a greater emphasis on integration and long
term flexibility.
Just as Enterprise Resource Planning (ERP) increases operational efficiency and competitiveness by standardizing processes like financial management, ECM allows organizations
to gain control over their content to accomplish organizational objectives.
What Is Enterprise Content?
The definition above provides a description of what it means to manage content, but what is the
“enterprise content” that is being managed? Enterprise content has evolved. It is no longer just
digitized versions of scanned documents or a narrowly-defined set of records. Enterprise content may include any type of content that an organization captures and uses in its daily processes,
from structured content in relational databases, XML documents or enterprise applications such
as customer relationship management (CRM), supply chain management (SCM) or enterprise
resource planning (ERP) tools, to unstructured content such as text, emails, word processing and
spreadsheets.
Enterprise content is not limited to these items, however. Enterprise content may also include
multimedia such as images, video, voice mail, streaming media and newer forms of information
like geo-data that previously did not exist or occurred infrequently. Social media may also be
expanding its impact on enterprise content. However, at this point, the medium is used more
frequently for communication, collaboration and Web Engagement than for ECM use cases. In
short, enterprise content can be any piece of data, document, enterprise application content or
multimedia asset that is associated with an organizational business process, or any content that an
organization deems valuable enough to store and manage.
Enterprise content is at the heart of information systems – an important part of the processes
and models of the business. Enterprise content is no longer a static entity that exists beside
business logic; content co-exists with business logic. It is critical that platforms support content
types and metadata that are capable of accurately representing the complex relationships and
transactions that occur every day in the business to enable improvements in organizational
process.
reuse across the enterprise and lack support for applying critical business rules, lifecycle
management or workflow to support content-centric business processes. Without the ability to
classify the content or to represent its relationships to other content in a uniform manner, content
becomes almost impossible to manage as the volume grows.
File sharing applications have another important limitation. As enterprise content becomes more diverse and includes multimedia assets such as video and audio, there is an increasing need to
support rich content and activities such as streaming – a feature the majority of file sharing tools
do not support. Older document management platforms also lack support for many of the newer
content types.
Finally, organizations with legal and regulatory constraints should be very careful about exposing
content using document-sharing solutions, since they do not provide high levels of security,
traceability or control. Even if legal requirements are not in place, exposing content perceived as
private can be a large blemish on the face of an organization.
The Evolution of ECM
Like all business processes and the technology that supports them, ECM changes frequently,
introducing new models and concepts to meet new challenges. Traditionally, ECM technologies have
consisted of a number of independent solutions:
Figure 3: Traditional ECM functions
increasing diversity of content managed by applications, from core business content to content
more related to social collaboration.
Figure 5: How well is content managed (Miles, 2011)
Smart Content
In addition to content growing more diverse, in some cases it is becoming smarter. What does this
mean? Traditionally, content managed by ECM solutions consisted largely of files and scanned images, perhaps interpreted with simplistic optical character recognition (OCR) and very limited
metadata. Today, however, modern ECM platforms must be more sophisticated. They must
support interpretation of the actual contents of the document and assign meaning to the data it
contains. This “smart content” allows organizations to go beyond just storing a binary file with
limited metadata for viewing and instead associate relationships, complex metadata and business
rules to automate business processes. For example, an organization scans an invoice. ECM
technology can extract the invoice number and use it to relate the invoice to information in other
• Customer Segmentation: As technology improves, organizations are better able to develop
unique customer profiles to engage them more specifically. For example, they can consolidate
information from CRM systems with unstructured emails and social media interactions to
create a more complete understanding of customer needs across all channels.
• Automated Decision Making: Access to content can provide all of the information necessary to automate decisions previously made by humans, reducing operational costs. Even if companies
don’t automate decisions completely, content can be used to facilitate decision making. In
claims processing, for example, instead of going to multiple systems, enterprise content
management could allow staff to see all relevant documents from a single interface, improving
the efficiency of the process.
• Identifying New Business Models, Products and Services: Aggregated content and its analysis
present innumerable new business opportunities, from real-time price comparison services to
preventative care solutions in the health care sector.
Although the additional data is valuable, organizations are still struggling with their attempts to manage, store and derive value from the content. In many cases, as content sizes grow, so does
the complexity and cost associated with managing the content.
Legacy techniques of managing information are no longer sufficient with growing volumes of
content. Users simply can’t browse categories, or in some cases search, the massive amounts of
content stored in the enterprise – it’s too overwhelming. Growing content sizes make adoption of
new content processes and technologies essential. From cloud-based storage to semantic
technologies that increase the degree to which machines can assist users, content growth is
transforming the entire ECM space.
Enabling Content-Driven Processes and Applications
Some organizations have made significant advances in leveraging ECM tools. However, in most
cases, use of the technology is still not optimal. Why? Many still consider ECM a stand-alone tool
instead of a middleware component that provides services capable of supporting an extremely
diverse set of business functions from invoicing to case management.
For example, a human resources department may need a technology solution to assist in executing the new hire process. This is a content-centric process. A resume, a potential employee
profile and interview feedback are all potential content types. The scenario may be even more
complex. There may be a requirement to automatically provision an accepted candidate in the
user directory, or integrate with a web content management system or other business processes.
No “boxed” ECM application is likely to meet these requirements perfectly. This is a content-
driven application. Instead of custom developing a solution using a lower level framework (e.g.
Hibernate for persistence) the organization could benefit from the capabilities of a content
platform for managing the process.
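As a rough illustration of why the new hire process is content-centric, the sketch below models a candidate file as typed content items with metadata and a simple lifecycle. The type names, metadata fields and lifecycle states are hypothetical and are not taken from any particular ECM platform; a real content platform would provide these capabilities through configuration rather than hand-written classes.

```python
from dataclasses import dataclass, field

# Hypothetical lifecycle for a candidate file: state -> states reachable from it.
TRANSITIONS = {
    "received": {"interviewing", "rejected"},
    "interviewing": {"accepted", "rejected"},
    "accepted": set(),
    "rejected": set(),
}


@dataclass
class ContentItem:
    """A piece of content: a type name plus descriptive metadata."""
    content_type: str  # e.g. "resume" or "interview_feedback"
    metadata: dict


@dataclass
class CandidateFile:
    """Groups the content items of one candidate and tracks the lifecycle state."""
    candidate: str
    state: str = "received"
    items: list = field(default_factory=list)

    def attach(self, item: ContentItem) -> None:
        self.items.append(item)

    def move_to(self, new_state: str) -> None:
        # Refuse transitions that the lifecycle does not allow.
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state


hire = CandidateFile(candidate="Jane Doe")
hire.attach(ContentItem("resume", {"format": "pdf"}))
hire.attach(ContentItem("interview_feedback", {"interviewer": "A. Smith", "score": 4}))
hire.move_to("interviewing")
hire.move_to("accepted")  # a platform could hook user provisioning onto this transition
```

The point of the sketch is that the documents, their metadata and the process state belong together, which is what a content platform manages on the application's behalf.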
Figure 8: Architecture Pattern for Content Driven Applications
Although electing to adopt a content platform versus a traditional ECM application has a
number of benefits, some architects may question how to best integrate the components into
their technology portfolio. The diagram above illustrates one possible architectural model for
organizing content-driven applications using content platform components.
In this model, the ECM platform exposes services such as workflow, authentication, access control and auditing via one or more interfaces that can be used to build custom content
applications or integrate with existing packaged and enterprise applications. In addition, the
content platform also provides direct access to stored content via standards-based interfaces
(repository services). The content platform can be used to build content-enabled applications that
leverage either the content directly or the set of high-level services provided by the content
platform.
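The layering described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the API of any real product: the class names `Repository` and `ContentPlatform`, the access control structure and the audit log format are all hypothetical. It shows an application either reaching stored content directly through repository services or going through higher-level platform services that add access control and auditing.

```python
class Repository:
    """Repository services: direct storage and retrieval of content."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, content):
        self._docs[doc_id] = content

    def get(self, doc_id):
        return self._docs[doc_id]


class ContentPlatform:
    """Higher-level services (access control, auditing) layered on a repository."""

    def __init__(self, repository, acl):
        self.repository = repository
        self._acl = acl        # hypothetical structure: doc_id -> set of allowed users
        self.audit_log = []    # every read attempt is recorded, allowed or not

    def read(self, user, doc_id):
        allowed = user in self._acl.get(doc_id, set())
        self.audit_log.append((user, "read", doc_id, allowed))
        if not allowed:
            raise PermissionError(f"{user} may not read {doc_id}")
        return self.repository.get(doc_id)


repo = Repository()
repo.put("invoice-1", "Invoice 1 body")
platform = ContentPlatform(repo, acl={"invoice-1": {"alice"}})

# A content-enabled application using the high-level service.
body = platform.read("alice", "invoice-1")
```

A custom application would normally call the platform services; direct repository access remains available for integrations that only need the stored content.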
In this model, a user interface layer offers frameworks to expose its services to different user
interface technologies. These frameworks can be leveraged to create custom interfaces from the
platform that adapt to the organizational context, making user adoption smoother and reducing
the user learning curve.
A platform should be architected in a modular and flexible manner. It should expose an entire
framework for use by developers, and not simply a content repository. Many ECM tools tend to
be architected around the content repository only, providing no additional layers. This is a more
traditional approach, inherited from the client-server era.
A platform should also provide a comprehensive set of services, from low-level directory services
to high level user interface services. This is the design of a platform that supports integration. It is
designed to be extended and assembled by developers making solutions from the platform.
Whether this model or an alternate ECM design strategy is implemented, a componentized, platform-centric approach is the key to delivering truly flexible ECM. ECM evaluations and
adoption discussions should move away from specific applications and functional
implementations that assume requirements, and toward content frameworks that can be
easily customized and extended at the repository, platform and user interface level to work in the
manner that is most appropriate to meet organizational needs.
Providing Modularity and Extensibility
No vendor can anticipate every use case that must be supported for managing enterprise content.
New content types, standards and business models are constantly being developed; therefore, it is
important to select an ECM platform that is architected for interoperability, customization and
extension – not something all vendors support. Lack of extensibility can have a direct impact on
business capabilities. Over half of businesses indicate that lack of ECM tool support for their
• A range of different web user interface frameworks for different interaction use cases (e.g. JSF,
GWT, basic web templating, etc.) to ensure all kinds of web applications, including mobile and
tablet based user interfaces, can be served in the most appropriate manner.
• RIA (Rich Internet Application) frameworks supporting Flex and related middleware such as
Adobe LiveCycle or GraniteDS, to ease the development of these interfaces.
• “Silent Applications” for scripting and batch processing purposes.
• Desktop applications
• Thick mobile applications, supporting major technology such as Android and iOS, and
providing dedicated SDKs to wrap the platform services and APIs.
Figure 10: Multi-Channel Content Delivery
Running Anywhere, Including the Cloud
Traditional enterprise software was designed to run on premise, and still today, a large share of
ECM solutions are running on premise, managed by internal IT operations. In this scenario, it is
important to leverage standard middleware. For example, in terms of Java technology, it should
be possible to install the entire solution on a standard Java application server such as the lean Apache Tomcat servlet engine, JBoss application server or other standard infrastructure without
requesting additional specialized components to meet the requirements of the ECM or its
supporting components. The same considerations are important regarding the underlying
database. “Release where you want and how you want” should be the motto. This enables
companies to achieve better ROI by allowing them to share and leverage their existing
investments in a single uniform stack.
Traditional solutions and platforms often fail in this area. Many require specific setups, specific
systems and in the end, almost dedicated maintenance and operational processes that result in
additional costs for organizations.
Beyond traditional on-premise environments, cloud computing has brought a large range of
opportunities and promises. Cloud computing has democratized technology for many organizations. Instead of hiring specialized staff and making large infrastructure and software
investments, organizations can obtain the same capabilities for a low start-up fee and a monthly
or usage-based subscription fee.
Cloud-based platforms allow architects to design sophisticated, reliable, and highly available
enterprise content solutions without concern for:
• Installation dependencies
• Computing storage capacities
• Upgrade paths
• Software configurations
• Hardware investments
• Future scalability
that could constrain architecture and design decisions. Cloud computing is usually segmented
into the following layers:
• Infrastructure-as-a-Service (IaaS): IaaS is the lowest level of abstraction in the cloud
technology stack. IaaS provides operating system support, storage and processing. Vendors in this sector include Windows Azure and Amazon EC2.
• Platform-as-a-Service (PaaS): PaaS is essentially the middleware of the cloud. It is more
abstract than the IaaS layer and provides components, an environment and frameworks for
building higher-level applications. Vendors in this space include Heroku and CloudBees.
• Software-as-a-Service (SaaS): SaaS is usually the highest level of the cloud stack and
includes complete application solutions designed to be leveraged by end users such as web mail
and Google Apps.
Forrester Research illustrates the cloud taxonomy as:
Although SaaS is currently the most widespread use of cloud computing, PaaS adoption is growing;
it is estimated that the market will grow to $11.91 billion in the next decade, driven by
organizations seeking to reduce technology costs while simultaneously improving services.
ECM technology, like almost every other technology, has been impacted by cloud computing.
However, it should not be assumed that ECM solutions can seamlessly transition to the cloud.
Many ECM tools have a number of characteristics that make utilization in a cloud difficult, if not impossible. For example, many ECM vendors have built their solutions through acquisition or
independent product development cycles that don’t share a common architecture, and have not
standardized (and may never standardize) environmental requirements, resulting in a tool with a dizzying
number of external dependencies that cannot be supported in most standardized cloud
environments.
The first thoughts of ECM in the cloud may point to SaaS. While this is a valid option, the
reality is that an ECM platform must enable use of all layers of the cloud stack:
• IaaS: Allow organizations that are already taking advantage of hosted infrastructure to
continue to use it without requiring specialized environment configurations that make cloud hosting impractical.
• PaaS: Deliver on the promise of the approach, which is to provide frameworks
and components that can be customized while abstracting away the complexity of the lower
infrastructure level – a very good match for the modern ECM platform, as described earlier.
• SaaS: A significant portion of users of the ECM platform will deliver their applications in this
manner, asking for all the technical requirements that it implies:
• Elastic resource allocation
• Multi-tenancy
• Security and privacy
• Monitoring of large scale implementations.
Beyond allowing all three approaches to cloud computing, a good technical platform should
make it easy to move from one mode to another, and also to switch between deployment options
without significant effort, high cost or lengthy time to market.
When considering a new ECM technology, it is important to consider more than just a “supports
the cloud” check-box on an RFP. Cloud support is not a simple YES/NO question; cloud
requirements and capabilities vary and should be examined in detail. It is not sufficient to rely on
the shiny “Cloud based” marketing collateral.
Modern Development: Agile and Soon in the Cloud
The new requirements of ECM mean that “out of the box” solutions are no longer sufficient.
Content-driven applications have a level of uniqueness that requires most organizations to set up
a development team to configure, develop, maintain and deploy the solution. While the
complexity and cost of this might be a concern, using modern software methodologies, tooling and a well-designed software framework can dramatically minimize the delivery effort.
A set of best practices can ensure higher quality solutions with a lower cost of implementation
than traditional software delivery approaches:
• Adopt Agile and iterative development practices. Embrace the “release early, release
often” approach. These practices have been shown to reduce delivery risk compared to
traditional predictive project management practices.
• Implement continuous integration. Building at every change ensures that changes in the
code base that have a negative impact are identified early, before they can cause additional
issues.
• Implement automatic testing. This reduces the time, effort and cost of testing and ensures
that a full regression suite is always available to confirm the validity of software changes.
• Use modern tools and techniques for source control (e.g. Git) that provide developers
with more efficiency than traditional tools that lock entire files while a single developer makes
Development-as-a-service is still in its earliest stages; considerable evolution must occur before the full
development cycle is supported in the cloud. However, there is already real value in adopting this
new approach toward development, when you can combine and integrate a cloud-based
development environment with an on-premise development infrastructure, such as the
continuous integration chain and the deployment process.
Standards Matter, but Don't Be Blind
Why Standards?
Non-technical users really don’t care about which standards exist and which are emerging. They
care that platforms play well together; they want interoperability. They want solutions that can
communicate with each other without excessive effort and cost. They want to be able to move
content between platforms if a new tool is selected. It could be said that standards in the ECM space are more about avoiding content lock-in than vendor lock-in.
Standards provide a set of guidelines and mechanisms for interacting with a technology.
Adoption of standards has a number of benefits, with the most frequently cited being
interoperability. No organization wants to be tied to a single vendor or product option for
implementing a technology solution – no matter how well the vendor’s solution functions or the
vendor provides service. Standards adoption has a number of additional benefits such as:
• Increased development consistency, simplicity and predictability
• Improved code reuse
• Reduced cost, time and effort to transition between vendors and solutions
• Reduced focus on commodity and infrastructure
• Ability to create composite interfaces that are tailored to the needs of specific job roles –
mashability
• Improved application portability
• Faster time to market, because it is easier to purchase off-the-shelf components and
applications that can integrate and provide features for the solution
Organizations should understand which standards provide the benefits that are most important
for their needs when adopting an ECM solution.
Existing and Emerging Standards
They say the good thing about standards is that there are so many to choose from. This may be
humorous, but seasoned technologists know that, unfortunately, the quip has some truth – the
world of enterprise content management is no different. There is no single standard that is more
important than all others. There is no universal definition of what is most valuable; it always varies by the unique technical and business needs of the organization.
Not every ECM vendor and product will support every standard. However, it is important to
determine the standards that are most important for future business and technical strategy and
ensure they are supported by the potential ECM platform. For example, an organization
concerned with the publishing industry might have a strong interest in adopting the NewsML
standard, whereas an organization with more generic and horizontal coverage might have more
interest in supporting the Content Management Interoperability Services (CMIS) standard.
Standards impact a number of areas in the ECM market and it is important that these be
understood.
INTEROPERABILITY
As noted above, interoperability is one of the primary drivers for standards adoption.
Interoperability takes many forms. In ECM, interoperability is primarily targeted at providing a
standardized way for content-based applications to share their content assets.
The main standards related to interoperability for ECM solutions include Content Management
Interoperability Services (CMIS) and Java Content Repository (JCR). CMIS has grown slightly
more popular than JCR due to the technology agnostic approach taken by the standard.
CMIS is one of the most recent standards in the content management space; it was specifically
designed to support interoperability among ECM solutions. Officially adopted in May 2010, managed by OASIS, and supported by a large number of vendors, the standard defines a
vendor-agnostic domain model, a protocol abstraction and a set of bindings that allow content to be
shared and accessed across multiple ECM tools. Key services provided by CMIS include:
• Repository Services: Enable information discovery of the content repository and the object
types defined for the repository
• Navigation Services: Supports navigating the folder hierarchy in a CMIS repository
Metadata augments content stored by ECM solutions with additional details such as taxonomy,
relationships, security attributes, usage characteristics, auditing information, and any number of
additional attributes. How important is metadata to an ECM solution? It is critical. Without
metadata, it becomes almost impossible to manage, maintain control and find content in an ECM tool. There are a number of standards that impact metadata creation and management
within ECM solutions such as XML, Dublin Core and semantic technology related standards
(e.g. RDF). Support for some of these standards, like Dublin Core, is important, but not sufficient
for solving all ECM metadata needs. Keep in mind that many standards that address taxonomies
and semantic technologies are still maturing, so adopting a platform with the flexibility to support
the standard in the future will be key.
The most important standard, although it is a much lower level standard than many of the others
discussed in this white paper, is without a doubt XML. The Extensible Markup Language (XML)
is a standard managed by the World Wide Web Consortium (W3C). The human and machine-readable text-based markup language, similar to HTML, is now familiar to most technologists.
Unlike HTML, XML does not have a single defined set of tags and attributes; it allows adopters
to define their own elements or utilize a vocabulary defined by another party. XML is a core
technology for defining structured content and data, and of course, metadata; it is the foundation
for a number of other standards like Dublin Core and XMP.
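To make the relationship between XML and metadata vocabularies concrete, the sketch below reads a small metadata record that uses Dublin Core element names inside XML. The Dublin Core namespace and the `title`, `creator`, `date` and `format` elements are part of the Dublin Core element set; the wrapping `<metadata>` element, the sample values and the helper function name are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record: the <metadata> wrapper and the values are illustrative.
RECORD = f"""<metadata xmlns:dc="{DC_NS}">
  <dc:title>Invoice 2011-042</dc:title>
  <dc:creator>Accounts Payable</dc:creator>
  <dc:date>2011-06-10</dc:date>
  <dc:format>application/pdf</dc:format>
</metadata>"""


def dublin_core_fields(xml_text):
    """Return the Dublin Core elements of a record as a plain dictionary."""
    root = ET.fromstring(xml_text)
    fields = {}
    for element in root:
        # ElementTree expands prefixed names to "{namespace}localname".
        if element.tag.startswith("{" + DC_NS + "}"):
            local_name = element.tag.split("}", 1)[1]
            fields[local_name] = element.text
    return fields


fields = dublin_core_fields(RECORD)
```

Because the vocabulary is identified by its namespace rather than by the prefix, any XML-aware tool can pick out the Dublin Core elements regardless of what other, custom elements surround them.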
XML has been such a core technology that almost all vendors will promise support. However,
as with cloud computing, architects must examine what “support” means. Not all vendors fully
support XML equally for integration and transformation, storage and publishing. Architects
should explore in detail the XML capabilities of an ECM platform when it comes to managing,
storing and processing XML-based data.
Another domain that is mentioned more frequently related to metadata is semantic technology.
Semantic technology allows association of meaning or context to digital content – not just
meaning for people – but for computers as well. If computers can learn the meaning behind
content, they can learn what users are interested in and provide assistance with common tasks,
such as search or augmenting data with existing details based on known relationships. Without
semantic technology, content is typically just links between structured and unstructured resources.
Semantic technology provides context to these resources and their relationships so that machines
can recognize entities such as people, places, events, organizations, etc. within the content.
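The idea of machine-readable relationships can be sketched with subject-predicate-object statements in the style of RDF. The example below is a plain in-memory illustration and does not use an RDF library; the identifiers, predicate names and helper functions are hypothetical.

```python
# Each statement is a (subject, predicate, object) triple, in the style of RDF.
triples = {
    ("doc:invoice-42", "mentions", "org:Acme"),
    ("doc:contract-7", "mentions", "org:Acme"),
    ("doc:contract-7", "mentions", "person:Jane_Doe"),
    ("org:Acme", "type", "Organization"),
    ("person:Jane_Doe", "type", "Person"),
}


def entities_of_type(type_name):
    """All entities that the statements declare to be of the given type."""
    return sorted(s for s, p, o in triples if p == "type" and o == type_name)


def related_documents(doc_id):
    """Documents mentioning at least one entity that doc_id also mentions."""
    entities = {o for s, p, o in triples if s == doc_id and p == "mentions"}
    return sorted(
        {s for s, p, o in triples if p == "mentions" and o in entities and s != doc_id}
    )


people = entities_of_type("Person")
related = related_documents("doc:invoice-42")
```

Here the invoice and the contract are related only because both mention the same organization, which is the kind of connection a machine can use to suggest related content or enrich search results.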
Support for semantic technologies is limited in the majority of ECM platforms, although some
forward thinking vendors are beginning to incorporate the technology. If semantic technology
lives up to its promises, the enhancements it provides for metadata, categorization and content
enrichment will substantially improve ECM technology. This can be seen in research and open
source projects like the Interactive Knowledge Stack (IKS) project. IKS is a European Union-funded
research project involving vendors like Nuxeo and Adobe, focused on building an open
and flexible technology platform for semantically enhanced content management. IKS has
OpenSocial is comprised of two high-level concepts: gadgets and APIs. Gadgets are small,
pluggable, HTML/JavaScript based components with a basic lifecycle that run in containers
responsible for providing the gadget with the rendering environment and JavaScript APIs. The
core OpenSocial APIs provide capabilities for managing people, activities and data and are
exposed via JavaScript and REST. OpenSocial gadgets can also be used to provide a simple and
light integration solution between applications, and they can access any piece of information in the enterprise that is exposed via REST.
In addition to the existing capabilities of OpenSocial, there are efforts underway to provide
tighter integration between OpenSocial and CMIS; the changes are targeted for version 2.0 of
the OpenSocial specification.
There are a number of additional standards not directly related to content that are important for
ECM development, such as OAuth, REST and LDAP. Each of these technologies can play an
important role in solution delivery.
OAuth is an open protocol standard for delegated authorization. It provides a standard way for developers to offer their services via an API without forcing their users to expose their credentials.
From a user perspective, the standard allows a user (resource owner) to grant access to a
protected resource from one application (service provider) to another application (service
consumer). Because access is delegated, a user’s resources can be
shared across multiple sites without sharing credentials. In addition to providing a standard way
to grant access between applications, OAuth also provides a mechanism to restrict the scope and
lifetime of a service consumer’s access. This is a much more secure strategy than sharing
credentials and granting unlimited access to a third party. It is also convenient for users, who are
freed from creating more login credentials. Prior to OAuth, there were a number of other
proprietary internet authentication protocols. Unlike many of these earlier protocols, OAuth supports use by non-web based applications.
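The benefit of restricting scope and lifetime can be shown with a small sketch. It does not implement the OAuth protocol flow (request signing, token exchange and so on); it only illustrates how a service provider might check a token that was granted to a service consumer. The class name, scope strings and timestamps are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessToken:
    """What a service consumer holds instead of the user's credentials."""
    consumer: str
    scopes: frozenset   # operations the resource owner agreed to
    expires_at: float   # seconds since the epoch


def is_allowed(token, required_scope, now):
    """Allow a request only if the token is unexpired and covers the scope."""
    return now < token.expires_at and required_scope in token.scopes


issued_at = 1_000_000.0  # fixed timestamp so the example is deterministic
token = AccessToken(
    consumer="reporting-app",
    scopes=frozenset({"documents:read"}),
    expires_at=issued_at + 3600,  # valid for one hour
)

can_read = is_allowed(token, "documents:read", now=issued_at + 60)
can_write = is_allowed(token, "documents:write", now=issued_at + 60)
can_read_later = is_allowed(token, "documents:read", now=issued_at + 7200)
```

The consumer can read documents for an hour and can never write them, whereas a shared password would have granted unlimited access until the user changed it.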
Given that enterprise content is core to many business processes, it is important that a well-
designed platform provide a standard way to control access to its services. Instead of reinventing
the wheel, vendors like Nuxeo are integrating OAuth in their platforms to control which services
and data are shared between applications.
Lightweight Directory Access Protocol (LDAP) is another protocol standard architects should
consider. The LDAP protocol allows applications to access information stored in an LDAP server.
LDAP servers can store any type of information, but they are most frequently used to store
contact information, security credentials and group information. The majority of organizations that support secured access to resources or email store user information in an LDAP directory.
LDAP servers are so common that ECM platforms should support integration, at least at a read level,
with LDAP servers so that user information does not have to be replicated in multiple locations.
Representational State Transfer (REST) is an architectural style based on how the web works, not
a standard for application integration. RESTful interactions involve two components - clients and
servers. Clients make stateless requests to servers; servers receive requests, process them and
• Association for Information and Image Management (AIIM), 2011. What Is ECM? Retrieved
06 10, 2011, from AIIM Website: http://www.aiim.org/What-is-ECM-Enterprise-Content-
Management.
• Miles, 2011. State of the ECM Industry 2011. AIIM.
• Manyika, et al., 2011. Big Data: The Next Frontier for Innovation, Competition and
Productivity. McKinsey Global Institute.
• Chute, Manfrediz, Minton, Reinsel, Schlichting, & Toncheva, 2008. An Updated Forecast of
Worldwide Information Growth Through 2011. IDC.
• Ried & Kisker, 2011. Sizing the Cloud. Forrester Research.
• Fermigier, Delprat, Grisel, Guillaume, 2010. Lessons learned developing the Nuxeo EP open source, component-based, ECM platform. Proceedings of the 2010 ICSSEA Conference.
Additional Resources
For additional information on items listed in the white paper, you can review the resources below.
• OAuth: http://www.oauth.net/
• Open Social: http://www.opensocial.org/
• W3C SWEO Linking Open Data: http://www.w3.org/wiki/SweoIG/TaskForces/