Macromol. Theory Simul. 2011, 20, 286–298. © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim. wileyonlinelibrary.com. DOI: 10.1002/mats.201000086
Full Paper
Adapting Visual-Analytical Tools for the Exploration of Structural and Dynamical Features of Polymer Conformations[a]
Sidharth Thakur, Melissa A. Pasquinelli*
Conformational analysis of macromolecular structures reveals interesting higher-order spatial arrangements. Analyzing these features as a function of time provides insights into the dynamical behavior of these systems and the identification of relevant subdomains. We present some visual-analytic methods that we devised to explore the spatial-temporal properties from molecular dynamics simulation data. These methods automatically detect common features and connect them to properties of interest. These methods yield physical insights that are not easily obtainable with existing methods for particle simulation data, as illustrated for polyacetylene interacting with a carbon nanotube.
Introduction
In simulations of polymer-based nanomaterials, researchers often observe interesting structural features and higher-order arrangements of polymers. Some of the emergent structures include coils and loops in individual polymer chains[1–3] and alignments and entanglements in complexes.[4,5] These structures can occur within one chain or among chains of the same or different chemical composition. Exploring the types of these substructures and their spatio-temporal evolution can give researchers useful insight into important functional and physical properties of the underlying system.
M. A. Pasquinelli
Fiber and Polymer Science/TECS, North Carolina State University, Raleigh, North Carolina 27695, USA
E-mail: [email protected]
S. Thakur
Renaissance Computing Institute, Chapel Hill, North Carolina 27517, USA
[a] Supporting information for this article is available at Wiley Online Library or from the author.
The spatio-temporal behavior of polymeric systems can be explored using standard tools such as animations, interactive visualizations, and computational analytical methods. Some of these tools are commonly available in standard molecular visualization applications such as Visual Molecular Dynamics (VMD).[6] However, the standard tools are sometimes not sufficient to conduct a detailed or customized analysis of some of the complicated structures observed in polymeric systems. For example, tools such as animations and interactive explorations are generally limited to providing information about basic spatial-temporal dynamics of a molecular simulation. Another challenge is that many existing sophisticated analytical methods are designed for specific domains; e.g., computational methods developed to analyze spatial structures of proteins are sometimes not directly applicable to polymeric systems. In addition, no existing tools can easily extract and quantify features that could relate to bulk phenomena, such as entanglements and persistent substructures that may suggest crystal nucleation or unique characteristics in interfacial regions. Therefore, there is a strong motivation to develop analytical and exploratory methods that are customized to focus on the features of polymeric systems. The customized tools can be coupled with visualization techniques to facilitate additional exploratory tasks that cannot be supported by traditional tools.
The goal of this work is to develop visual-analytical tools that can be used to compare substructures of polymer conformations and to explore local attributes of atomic trajectories obtained from molecular dynamics (MD) simulations. We have developed a method for exploring polymer conformations that combines a number of existing computational techniques. For example, a curve-matching technique was adapted to compare and extract polymer substructures. A dimensionality reduction technique called multidimensional scaling (MDS) was used to cluster and visualize common substructures of polymer conformations. Finally, a visualization system was built that integrates the computational methods and interactive visualizations so that similarities of polymer conformations and their substructures can be explored. Although the focus is on atomistic MD simulation data, these tools are general enough to be adapted to any particle-based simulation data that provides spatio-temporal information. One of the challenges that we aim to address using the visual-analytical tools is to relate the spatio-temporal behavior of important substructures to the macroscopic properties of polymers and polymer-based nanocomposites, and this work is an initial step toward that goal.
A fundamental approach in this work is based on explorations of similarity relationships among polymer conformations. Similarity refers to proximity or a measure of nearness in some space,[7] and similarity computations involve standard metrics such as Euclidean distances ($d \in \mathbb{R}^N$). Similarity analysis allows us to address some of the challenges associated with understanding structure–property relationships of polymeric systems, and serves as a basis for the development of more complex visual-analytic tools. For example, similarity analyses and visualizations of local attributes of polymer conformations and atomic trajectories are employed to explore spatial and temporal dynamics of the polymer molecular system. Another technique is to identify and compare persistent and salient[b] features of polymer conformations. These methods can be employed to study systems involving single or multichain polymers; to compare sets of related polymers, such as across a class like polyolefins or polyamides; or to investigate the evolution of distinct features in interfacial regions, whether the interface is with another polymer system or with a nanoparticle.
[b] In the context of molecular structure analysis, salient molecular structures are defined as those that differ significantly from others in a neighborhood; in previous work,[8] saliency was defined using differences of global structures of helices in protein complexes over distinct time steps. Persistent structures are those that are stable over some time interval during a molecular dynamics simulation.
As a test case for this paper, we have chosen to use two simulations of a single chain of polyacetylene (PAC), one as it interacts with a single-walled carbon nanotube (CNT) and the other as an isolated system. We chose PAC because of its simple structure and also because the rigidity along the backbone due to the conjugated π system provides some conformational limitations. In addition, recent studies by Tallury and Pasquinelli[9] investigated the interactions of a single chain of PAC with a single-walled CNT, and reported that PAC forms a distinct folding-like structure along the CNT surface, where intrachain interactions and crossings are maximized. These observations for PAC (which can also be observed in Figure 4 in this paper) were in contrast to other rigid backbone polymers in the study that contained aromatic groups in the backbone, which formed more helical structures to maximize the π–π interactions with the CNT surface. These chain folds for PAC were observed to lead to the polymer “coating” the CNT surface rather than fully wrapping around its diameter. Therefore, PAC serves as an interesting system to use as a case study for testing the visual-analytic tools discussed in this report.
Generation of Data
The datasets used in this work were generated from atomistic MD simulations; specific details can be found in previous work,[9,10] but will be briefly summarized here. The polymer chain was generated with the graphical user interface of DL_POLY version 2.19.[11] The polymers were all built in head-to-tail configuration, and the molecular structures were first equilibrated with an MD simulation for 100 steps using a time step of 1 fs. A (10,10) zigzag single-walled CNT was used with a diameter of 7.7 Å and a length of 125.0 Å, which was built with the DL_POLY[11] graphical user interface with the built-in Bucky-tube module. To simplify the computational analysis of the simulations, the atoms in the CNT were frozen spatially. Initial configurations of the CNT and polymer chain were created by aligning the CNT along the Z-axis, and then the relaxed polymer chain was placed such that the perpendicular distance from the CNT was around 40 Å, implying that the polymer chain was well outside of the cutoff radius of interaction at the initial stage.
Molecular dynamics (MD) simulations were performed with DL_POLY version 2.19.[11] We used a constant number of molecules, constant volume, and constant temperature (NVT) ensemble at 300 K with cubic periodic boundary conditions and the DREIDING[12] force field. We used a time step of 1 fs, and all cutoff radii were set to 10.0 Å. After a system equilibration of 0.5 ns, the dynamics of the polymer-CNT system were recorded for 3.0 ns. The data analysis tools described in this report were applied to this 3 ns production run.
Exploration of Conformational Similarities
Polymer molecules often form interesting three-dimensional spatial structures such as coils and regular bends or loops. Knowledge gained from analysis of the polymer structures can help in understanding how the physical and chemical properties of polymer molecules induce the formation of the interesting spatial arrangements, such as the nucleation of polymer crystallization. In this section, we discuss a visual-analytical approach used to explore similarity relationships among polymer conformations and to expose interesting properties of the MD of polymers. This approach involves visualization of similarity analysis and the development of atom-time-value (ATV) plots.
Similarity Analysis and Visualization
To determine similarity relationships among a set of polymer conformations, global comparisons were performed in earlier work[13,14] that considered the entire three-dimensional structure of conformations in spatial and temporal regimes. This goal was accomplished by adapting a computational technique proposed by Best and Hege[15] that employs a two-step computational procedure. First, feature vectors that describe molecular conformations are generated using numeric measures or indexes that are based on geometric and statistical properties of the conformations. A feature vector for a polymer conformation at the ith time step is an m-tuple array:

$$d_i = \left\{ d_i^1, d_i^2, \cdots, d_i^t \right\}, \quad t \in [0, m] \qquad (1)$$

where $d_i^t$ is a scalar value computed using a numeric metric and the size m of the feature vector is dependent on the selected metric. Table 1 lists standard metrics that were explored in the previous work.[13,14]
In the second step of the computational procedure, similarity scores for all unique pairs of conformations are obtained based on the root mean square deviation (RMSD) error between corresponding feature vectors. The following equation represents the computation of the similarity score between two distinct conformations at time steps (i, j):

$$D_{ij} = 1 - \sqrt{\frac{1}{m} \sum_{t=1}^{m} \left( d_i^t - d_j^t \right)^2}, \qquad (2)$$

where $d_i^t, d_j^t$ are corresponding elements of the feature vectors for the conformations at the time steps i and j; $D_{i=j} = 1$; and m is the size of the vectors. In their earlier work,[15] Best and Hege applied this method to analyze protein conformations and employed a two-dimensional graph-based representation to visualize clusters of similar conformations.
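As an illustration only, and not the authors' implementation, Equation (2) can be sketched in Python with NumPy. The sketch assumes that the feature vectors have already been computed, scaled to [0, 1] so that the scores also fall in [0, 1], and stacked row-wise in one array; the function name `similarity_matrix` and the array layout are hypothetical.

```python
import numpy as np

def similarity_matrix(features: np.ndarray) -> np.ndarray:
    """Pairwise similarity scores D_ij = 1 - RMSD(d_i, d_j), as in Eq. (2).

    features: array of shape (n_frames, m); row i is the feature vector d_i.
    Assumes (hypothetically) that the features are pre-scaled to [0, 1].
    """
    # Broadcast to shape (n_frames, n_frames, m): element-wise d_i^t - d_j^t.
    diff = features[:, None, :] - features[None, :, :]
    # Root mean square over the m elements of each vector pair.
    rmsd = np.sqrt(np.mean(diff ** 2, axis=-1))
    return 1.0 - rmsd

# Three toy conformations described by two-element feature vectors.
features = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
D = similarity_matrix(features)
```

The resulting array is symmetric with ones on the diagonal, which matches the condition $D_{i=j} = 1$ stated above. Broadcasting keeps the sketch short but stores an array of size n_frames × n_frames × m, so a long trajectory may need a blockwise loop instead.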
Correlation matrices were adapted in our earlier work[13,14] to display similarity relationships among whole conformations and among trajectories of backbone atoms. A correlation matrix is a standard technique to compactly display pairwise relationships among a large set of entities. In the matrix, the entities are arranged along rows and columns and the similarity scores are represented by encoding the matrix cells with a color-coding scheme. An example of such a correlation matrix is given on the right-hand side of Figure 1. The matrix representation provides a more detailed overview of the changes in local and global conformational similarity relationships than the 2D graphs used by Best and Hege.[15] Figure 1 depicts how a correlation matrix is constructed to display pairwise similarity relationships among a set of conformations through the use of feature vectors derived from MD trajectory data. To support interactive exploration of the similarity relationships, the correlation matrix technique was also supplemented with standard visualization tools such as linked matrix-molecular visualization displays and matrix filtering operations, which are described in more detail in our previous work.[13,14]
These visualizations of correlation matrices are useful for displaying global structural changes in entire polymer conformations and for displaying relationships among trajectories of backbone atoms. However, it is challenging to apply the matrix-based approach to display changes in local attributes associated with substructures of polymer conformations, such as those that may be relevant to interfacial properties or crystal nucleation. To address this limitation, we next describe an approach that extends the matrix visualization approach to the visualization of local properties of atomic trajectories.
Visualization of Local Properties
Visualization of local properties simultaneously in spatial and temporal domains is obtained by plotting some scalar property in a matrix-based visualization relative to two dimensions, namely (i) sequential order of atomic indexes (arranged vertically in the plot), and (ii) time steps (arranged horizontally). We refer to these plots as Atom-Time-Value (ATV) plots. Figure 2 illustrates an example of an ATV plot and depicts its relation to trajectories of backbone atoms and time. Normalized scalar values are represented using a bi-variate color-coding scheme and the values are typically aggregated into a few (five or six) intervals; the bi-variate color coding scheme helps to expose emergent patterns corresponding to similarities among backbone atoms.
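The data layout behind an ATV plot can be sketched as follows; this is a minimal illustration under stated assumptions, not the authors' code. It assumes the scalar property is available as an array indexed by (time step, atom), uses min-max normalization (one possible choice), and bins into six equal intervals; the name `atv_matrix` is hypothetical.

```python
import numpy as np

def atv_matrix(values: np.ndarray, n_bins: int = 6) -> np.ndarray:
    """Arrange a per-atom, per-frame scalar property as an ATV matrix.

    values: array of shape (n_frames, n_atoms).
    Returns integer bin indices of shape (n_atoms, n_frames), so that atoms
    run vertically (rows) and time steps run horizontally (columns).
    """
    vmin, vmax = values.min(), values.max()
    span = vmax - vmin
    # Min-max normalization to [0, 1]; an assumption made for this sketch.
    if span > 0:
        normalized = (values - vmin) / span
    else:
        normalized = np.zeros_like(values, dtype=float)
    # Equal-width bins; the maximum value is clamped into the last bin.
    bins = np.minimum((normalized * n_bins).astype(int), n_bins - 1)
    return bins.T

# Two frames of a two-atom toy system.
values = np.array([[0.0, 5.0], [10.0, 2.5]])
atv = atv_matrix(values)
```

Each bin index can then be mapped to one color of a discrete color scheme by any plotting library.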
Table 1. Numeric metrics used for comparing polymer conformations. More details of these metrics can be found in ref. [13,14].
Metric | Description
Interatomic distances | Based on interatomic distances and defined as $d = \{ d_{ij}(x) = \lvert \vec{x}_i - \vec{x}_j \rvert \},\ i, j \in 1, \cdots, N$ (3), where $\vec{x}_1, \vec{x}_2, \cdots, \vec{x}_{N-1}$ are the three-dimensional positions of N atoms or centroids of groups of atoms on the polymer backbone. Utilizing interatomic distances as a metric[37] is useful because it is invariant under rotation and translation.
Rotational moment of inertia | Based on the rotational moment of inertia of a polymer that is computed relative to a region in space. This metric can provide information relative to another molecular species. For example, this quantity relative to the longitudinal axis of a carbon nanotube (CNT) can provide information on behavior such as polymer wrapping and thus help quantify the interfacial interactions.[9,10] The corresponding feature vector is computed as $d_{RMI} = \{ d(i) = m(i) R_{ref}(i)^2 \},\ i \in 1, \cdots, N$ (4), where $m(i)$ is the atomic mass of the ith atom and $R_{ref}(i)$ is the perpendicular distance of the atom relative to the region of reference, such as the longitudinal axis of a CNT.
Radius of gyration | Based on the average squared distance from the center of gravity. This metric can be useful in exploring the clustering behavior of a polymer. Instead of averaging the distances as is typically done to obtain a scalar quantity, the squared distance ($R_g(i)^2$) of each atom in the polymer from the center of gravity of the polymer is stored in a feature vector: $d_{RoG} = \{ d(i) = R_g(i)^2 \},\ i \in 1, \cdots, N$ (5).
Bond vectors | Based on vectors along bonds. This metric can be used to compare local orientations in a polymer backbone. This feature vector is computed as $d_{BV} = \{ v(i) = \frac{\vec{x}_{i+1} - \vec{x}_i}{\lvert \vec{x}_{i+1} - \vec{x}_i \rvert} \},\ i \in 1, \cdots, N$ (6), where $\vec{x}_i$ is the three-dimensional position of the ith backbone atom and $v(i)$ is a unit vector directed along the bond between adjacent atoms. Since each element of the feature vector is a three-dimensional vector, two feature vectors can be compared using dot products of the bond vectors at corresponding indices; the dot product values are summed to obtain a single scalar value that represents the global similarity of the conformation pair.
Bond-orientational order | Bond-orientational order[1] represents the global alignment of a polymer chain with a given fixed axis. The measure is defined as $d = \{ d(i) = \frac{1}{N-3} \sum_{i=2}^{n-1} \frac{3 \cos^2 \psi_i - 1}{2} \},\ i \in 1, \cdots, N$ (7), where $\psi_i$ is the angle between average vectors between pairs of every other backbone atom (called sub-bond vectors) and the z axis.
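Four of the feature vectors in Table 1 can be sketched with NumPy as shown here. This is a hedged illustration of Equations (3) to (6), not the code used in ref. [13,14]; the function names, the (N, 3) coordinate layout, and the mass-weighted center used for the gyration terms are assumptions.

```python
import numpy as np

def interatomic_distances(x: np.ndarray) -> np.ndarray:
    """Eq. (3): all pairwise distances |x_i - x_j| for positions x of shape (N, 3)."""
    return np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)

def moment_of_inertia_terms(x, masses, axis_point, axis_dir) -> np.ndarray:
    """Eq. (4): m(i) * R_ref(i)^2, with R_ref measured from a reference line."""
    u = axis_dir / np.linalg.norm(axis_dir)
    rel = x - axis_point
    # Remove the component along the axis to get the perpendicular offset.
    perp = rel - np.outer(rel @ u, u)
    return masses * np.sum(perp ** 2, axis=1)

def squared_gyration_terms(x, masses) -> np.ndarray:
    """Eq. (5): squared distance of each atom from the center of gravity."""
    center = np.average(x, axis=0, weights=masses)
    return np.sum((x - center) ** 2, axis=1)

def bond_vectors(x: np.ndarray) -> np.ndarray:
    """Eq. (6): unit vectors along consecutive backbone bonds, shape (N - 1, 3)."""
    bonds = np.diff(x, axis=0)
    return bonds / np.linalg.norm(bonds, axis=1, keepdims=True)

# Toy three-atom backbone with unit masses and the Z axis as reference line.
x = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 2.0, 0.0]])
masses = np.ones(3)
z_origin, z_dir = np.zeros(3), np.array([0.0, 0.0, 1.0])
rmi = moment_of_inertia_terms(x, masses, z_origin, z_dir)
```

Note that a chain of N backbone atoms yields N − 1 bond vectors, so the bond-vector feature is one element shorter than the per-atom features in this sketch.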
Several types of local quantities or “measures” can be used to generate the ATV plots. For example, the distances of backbone atoms from some origin can be mapped in both spatial and temporal scales, such as from the center of mass of a CNT in a polymer-based nanocomposite. Other measures include instantaneous velocities of backbone atoms and relative displacements of backbone atoms with respect to an initial set of locations (pseudo persistence length). Depending on the type of measure that is selected, either a single plot or related plots can be generated.
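The three kinds of measures named above can be sketched as follows, assuming the backbone trajectory is stored as an array of shape (n_frames, n_atoms, 3) with a constant time between frames. The function names are hypothetical, and the velocity is estimated by finite differences of positions, which is an assumption; a simulation package may instead write velocities directly.

```python
import numpy as np

def distance_from_origin(traj: np.ndarray, origin: np.ndarray) -> np.ndarray:
    """Distance of every atom from a fixed origin, shape (n_frames, n_atoms)."""
    return np.linalg.norm(traj - origin, axis=-1)

def displacement_from_start(traj: np.ndarray) -> np.ndarray:
    """Displacement of every atom relative to its position in the first frame."""
    return np.linalg.norm(traj - traj[0], axis=-1)

def finite_difference_speed(traj: np.ndarray, dt: float) -> np.ndarray:
    """Speed estimated from positions by finite differences along the time axis."""
    velocity = np.gradient(traj, dt, axis=0)
    return np.linalg.norm(velocity, axis=-1)

# One atom moving along x at a constant 2 length units per time unit.
traj = np.array([[[0.0, 0.0, 0.0]], [[2.0, 0.0, 0.0]], [[4.0, 0.0, 0.0]]])
speed = finite_difference_speed(traj, dt=1.0)
```

Each function returns one scalar per atom per frame, which is the shape an ATV plot needs before normalization and color coding.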
Figure 1. Illustration of the construction of similarity matrix visualization. A snapshot from two different time steps of the MD trajectory for a polymer–CNT interaction is given on the left. From these time steps, feature vectors are defined and compared with the RMSD error to obtain a scalar similarity score. This value for each time step is then mapped onto a correlation matrix, depicted on the right. A colorized version of this figure is also available as Supporting Information.
Figure 2. Illustration of an ATV (right) and its relation to trajectories of backbone atoms of a polymer chain (left). The ATV plot conveys, in a compact visualization, an overview of changes in some local scalar quantity (e.g., instantaneous velocities) for each atom (vertical dimension) at each time step (horizontal dimension). A colorized version of this figure is also available as Supporting Information.
Atom-time-value (ATV) plots have been used to investigate the conformational states of a single chain of PAC with and without a CNT. Snapshots from the MD simulations are given in Figure 4 along with projections of the snapshot in the xy, yz, and xz planes. The ATV plots in Figure 3 enable the visualization of some of the local and global spatial changes in the polymer conformations during these MD simulations. For the simulations with a CNT, where the CNT is aligned along the Z axis of the global frame, Figure 3(a) displays an ATV plot of the perpendicular distances of backbone atoms of PAC from the three coordinate axes of a global reference frame, which is at the center of the CNT. A comparable plot for PAC without a CNT is given in Figure 3(b), where the frame is approximately at the center of mass of the polymer molecule. Note that due to the differences in the global frame of reference for the two scenarios in Figure 3, differences are expected in the general patterns in the ATV plots. These ATV plots in Figure 3 provide details about the local structure in the polymer conformations. Note that the regular patterns such as the horizontal bands in the plots represent persistent loops in the corresponding directions. For the dynamics of an isolated PAC molecule in Figure 3(b), persistent bands are observed, particularly at later steps of the simulation. No distinctive difference is observed relative to the three frames of reference (the X, Y, and Z axes).
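The quantity plotted in Figure 3, the perpendicular distance of each backbone atom from each coordinate axis, can be sketched as follows. This is an illustrative sketch that assumes coordinates are already expressed in the reference frame of interest and stored as (n_frames, n_atoms, 3); the function names are hypothetical. The joint normalization by the largest distance over all three axes follows the description in the Figure 3 caption.

```python
import numpy as np

def axis_distances(traj: np.ndarray) -> np.ndarray:
    """Perpendicular distances from the X, Y, and Z axes.

    traj: array of shape (n_frames, n_atoms, 3) in the chosen reference frame.
    Returns an array of shape (3, n_atoms, n_frames): one ATV layer per axis.
    """
    x, y, z = traj[..., 0], traj[..., 1], traj[..., 2]
    layers = np.stack([
        np.hypot(y, z),  # distance from the X axis
        np.hypot(x, z),  # distance from the Y axis
        np.hypot(x, y),  # distance from the Z axis
    ])
    # Put atoms on rows and time steps on columns within each layer.
    return layers.transpose(0, 2, 1)

def normalize_jointly(layers: np.ndarray) -> np.ndarray:
    """Scale all three layers by the single largest distance among them."""
    largest = layers.max()
    return layers / largest if largest > 0 else np.zeros_like(layers)

# One frame of a single atom located at (3, 4, 0).
traj = np.array([[[3.0, 4.0, 0.0]]])
scaled = normalize_jointly(axis_distances(traj))
```

Because the three layers share one normalization constant, their colors are directly comparable across the X, Y, and Z panels.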
A different behavior is observed in the presence of the CNT in Figure 3(a). Since the frame of reference is defined as the center of the CNT, the features about the Z axis, which extends along the longitudinal axis of the CNT, quantify how far away the atoms are from the CNT surface, whereas the X and Y directions correspond to how much the polymer molecule extends along the longitudinal axis of the CNT. An abrupt shift in the system patterns is observed in all three frames of reference at around 1 ns, which corresponds to the time step where the polymer finds the CNT surface and quickly adsorbs onto it. The lack of features in the Z direction after 1 ns indicates that the atoms reached an optimal minimum distance from the CNT surface, and that this adsorption is persistent with time. The gray bands in the Z direction after the 1 ns adsorption phenomenon are likely due to regions where chain loops are slightly lifted from the CNT surface and from geometric hindrances due to intrachain crossings, both of which can be observed in the MD snapshot in Figure 4(a). In some instances, these gray bands disappear with time, suggesting that the system was able to overcome the barrier to find a lower energetic minimum in which its interaction with the CNT surface had increased.
Figure 3. Atom-time-value (ATV) plots for polymer PAC showing perpendicular distances of backbone atoms from axes of reference coordinate frames for the entire 3 ns of the MD trajectory (broken into 661 steps). The figure contains plots for two different scenarios: (a) interaction of PAC with a CNT, where the CNT lies along the Z axis; and (b) PAC in isolation. The perpendicular distances in each case are normalized based on the maximum distance in the set of distances from the three axes. The distance values are binned into six equal intervals and are represented by a univariate color scheme. A colorized version of this figure is also available as Supporting Information.
For the X and Y directions in Figure 3(a), persistent bands are also observed after the 1 ns adsorption time step. Note that if the molecule preferred to form a tight random coil rather than an extended structure, there would be less color variation in the distances relative to the X and Y axes than is observed in Figure 3(a). Therefore, the range of color bands indicates that the polymer formed an extended structure, as observed in Figure 4(a). As discussed by Tallury and Pasquinelli,[9] the polymer molecule, as an entity, tends to traverse the longitudinal axis of the CNT as a function of simulation time. Thus, the deviations (in other words, the “wavy appearance”) observed in these persistent bands in Figure 3(a) as a function of time capture the longitudinal motion of the molecule relative to the CNT surface, but the persistence of the bands suggests that the global three-dimensional structure of the polymer stays relatively intact with this longitudinal motion. Note that with this frame of reference, it is not possible to quantify the motion of the molecule in the xy plane, which is along the circumference of the CNT [refer to the inset in Figure 4(a)]. However, this motion could be quantified with the use of a spherical coordinate system.
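One way such circumferential motion might be quantified is through the azimuthal angle about the CNT axis, an angular coordinate that spherical and cylindrical systems share. The sketch below is an assumption about how this could be done, not a method from the paper; it assumes the CNT axis coincides with the Z axis through the origin, and the name `azimuthal_angles` is hypothetical.

```python
import numpy as np

def azimuthal_angles(traj: np.ndarray) -> np.ndarray:
    """Azimuthal angle (radians) of each atom about the Z axis over time.

    traj: array of shape (n_frames, n_atoms, 3), with the CNT axis along Z.
    The angles are unwrapped along the time axis so that an atom circling
    the tube accumulates angle instead of jumping between -pi and +pi.
    """
    phi = np.arctan2(traj[..., 1], traj[..., 0])
    return np.unwrap(phi, axis=0)

# One atom stepping a quarter turn at a time around the Z axis.
traj = np.array([
    [[1.0, 0.0, 0.0]],
    [[0.0, 1.0, 0.0]],
    [[-1.0, 0.0, 0.0]],
    [[0.0, -1.0, 0.0]],
])
phi = azimuthal_angles(traj)
```

Transposed to (n_atoms, n_frames), the unwrapped angles could be displayed as a fourth ATV layer alongside the three axis-distance layers. Unwrapping is only reliable when the angle changes by less than half a turn between saved frames.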
The ATV plots convey in a single, compact visualization the spatial-temporal distributions in local scalar quantities. To improve the usefulness of these plots, some standard visualization features and enhancements have been employed, such as the following:
(i) Linking and brushing: in computer visualization, linking and brushing are standard interactive data exploration techniques that allow an observer to examine one or more data points in multiple, coordinated data visualizations (linking) and to interactively select a set of data points for comparison (brushing). Linking and brushing can be used to explore relationships among atomic trajectories based on the exploration of patterns of similarities in the ATV plots. Figure 5 illustrates an example of linking and brushing; in the example, a set of atoms and time steps are selected interactively on the ATV plot and corresponding atom trajectories are visualized using distinct colors in a molecular display window. This simple visualization approach can allow exploration of interesting patterns on the ATV plot and corresponding spatial-temporal dynamics of polymer backbone atoms. The atomic subset may also be colored by time step, which can be useful for exploring the evolution of the structure as a function of time.
(ii) Plot of moving averages: alternatively, moving averages of the geometric properties of the backbone atoms can be plotted over a small interval to display the most important spatio-temporal changes; this operation also reduces the noise due to small local perturbations in the conformational structures.
Figure 4. Molecular dynamics (MD) snapshots of low energy states of PAC (a) with a single CNT, where the CNT is drawn using a stylized representation; and (b) without a CNT. Projections of the conformations in three two-dimensional orthogonal planes are shown on the right in each figure. Each snapshot is taken at an interval of 5 ps from the MD trajectory. The two polymers are drawn at differing zoom levels due to the large size of the CNT. A movie of the corresponding full trajectory of (a) is given in ref.[9] and on the Pasquinelli lab website (http://www.te.ncsu.edu/mpasquinelli). A colorized version of this figure is also available as Supporting Information.
Figure 5. An example exhibiting the linking and brushing operations between an ATV plot (left) and a molecular display (right) of an MD simulation trajectory of PAC interacting with a CNT. A region on the ATV plot, indicated by the black rectangle, was sketched interactively to select a set of ten atoms and around five hundred fifty frames of the MD simulation. Trajectories corresponding to the selected atom-time sets are shown as the green curves in the display on the right. The conformation shown using thick lines corresponds to a time step during the onset of wrapping of the polymer around the CNT. A colorized version of this figure is also available as Supporting Information.
These matrix-based approaches have limitations. For example, a matrix visualization can reveal mostly global and some local information about structural changes in polymers during MD simulations. Another constraint is that many of the numeric measures based on global geometric properties are sensitive to the length of the polymer chains and are therefore not suitable for identifying macroscopic effects that are dependent on molecular weight or for comparing conformations among different polymer types. Therefore, the exploration of substructures is supplemented with local structural comparisons to find common substructures among a set of conformations. Local structural comparisons can be used to identify salient and persistent substructures of conformations of a single polymer during a particle-based simulation or among a family of related polymers (e.g., nylons with the same overall chemistry but varying sizes of aliphatic portions in the monomers).
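As a rough illustration of the moving-average enhancement listed in item (ii) of this section, the smoothing can be applied along the time axis of an ATV matrix. The sketch assumes a simple unweighted window and an (n_atoms, n_frames) layout; the window length and the function name `moving_average` are hypothetical choices.

```python
import numpy as np

def moving_average(atv: np.ndarray, window: int) -> np.ndarray:
    """Unweighted moving average along the time axis of an ATV matrix.

    atv: array of shape (n_atoms, n_frames).
    Returns an array of shape (n_atoms, n_frames - window + 1), because only
    windows that fit entirely inside the trajectory are kept.
    """
    kernel = np.ones(window) / window
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="valid"), 1, atv
    )

# One atom whose scalar property grows linearly over four frames.
atv = np.array([[0.0, 3.0, 6.0, 9.0]])
smoothed = moving_average(atv, window=2)
```

A longer window suppresses more of the small local perturbations but also blurs abrupt events, such as the adsorption step near 1 ns, so the window length is a trade-off to be tuned per dataset.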
Development of Substructure Analysis and Visualization Tools
Polymer conformations often contain a variety of local and global structural features and three-dimensional arrangements. Identification and analysis of these structures reveal relevant information about properties of the polymer systems. For example, identification of persistent substructures in conformations corresponding to low energy or relaxed states in a family of related polymers (e.g., nylons) enables comparison of the changes in spatial properties that result from changing the chemistry of the polymer, and thus could enable experimentalists to better tune the desired properties of the material. Structural analysis also provides insights into the spatio-temporal dynamics of the interfacial regime of nanocomposites.
In this section, we describe our method, which consists of the following main steps: (i) construct a proxy curve that represents the backbone of a polymer conformation; (ii) build feature vectors that uniquely describe curve segments on the proxy curve; (iii) compare the feature vectors to generate similarity relationships among a set of curve segments; and (iv) connect this substructure analysis to visualization tools. Since some exploratory and analytical methods have been developed by others to assist in the exploration of structures of macromolecules, such as those developed for chain-like macromolecules like proteins, the applicable work will first be summarized in context, and then we will give specifics on our approach.
Patro et al.[8] proposed a general framework to detect
frames that correspond to salient conformations of proteins
among a set of frames from MD simulations. The approach
involves construction of an affinity matrix that contains
relative orientations of amino acid units on a protein
backbone. The authors applied this approach to extract
salient conformations of α-helices and to identify major
structural changes in protein complexes that form ion
channels at boundaries of biological cells. A number of other
researchers have used geometric information of macro-
molecules for similarity comparisons. A standard approach
is to model backbones of chain-like molecules such as
proteins using parametric spline curves. For example, Kim
and Singh[16] have described an approach for comparing
protein structures that is based on extraction and analysis
of the geometry of protein backbones. Similarly, Can and
Wang[17] have developed a method in which portions of
curves representing molecular backbones are aligned based
on local shape and curvature information. Other research-
ers have developed techniques that utilize a variety of local
and global geometric information as metrics to compute
‘‘distances’’ among matched segments or whole conforma-
tions of chain-like molecules. Some applications of this
metric-based approach include clustering protein confor-
mations using interatomic distances;[15] clustering a set of
proteins based on sequence alignment and geometric curve
fitting;[18] a comparison based on angles between bonds
and angles between tangents;[19–21] a comparison using
attributes of protein secondary structures;[22,23] and a
combination of shape, structural alignment, superposition,
and biochemical attributes of residues.[21,24,25]
Construction of Proxy Curves
A standard approach to describe the three-dimensional
geometry of chain-like molecular structures such as
polymer conformations is to represent the backbone of
the molecular structure using a proxy parametric curve
such as Uniform B-Splines.[26] The rationale for adapting
this approach is that a parametric spline curve allows
sampling at regular intervals, which is useful for a curve-
matching technique described later in this section.
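
As a minimal sketch of this representation (not the implementation used in this work), the following Python code evaluates a uniform cubic B-spline from its standard basis matrix, using backbone atom coordinates as control points. The cubic degree, the function name uniform_cubic_bspline, and the parameter samples_per_span are illustrative assumptions; the text above does not specify the spline degree.

```python
import numpy as np

# Standard basis matrix of the uniform cubic B-spline (rows multiply t^3, t^2, t, 1).
_BASIS = np.array([[-1.0,  3.0, -3.0, 1.0],
                   [ 3.0, -6.0,  3.0, 0.0],
                   [-3.0,  0.0,  3.0, 0.0],
                   [ 1.0,  4.0,  1.0, 0.0]]) / 6.0


def uniform_cubic_bspline(control_points, samples_per_span=10):
    """Sample a uniform cubic B-spline defined by the given control points.

    Each consecutive group of four control points defines one span; every
    span is sampled at `samples_per_span` equally spaced parameter values.
    """
    points = np.asarray(control_points, dtype=float)
    t = np.linspace(0.0, 1.0, samples_per_span, endpoint=False)
    powers = np.column_stack([t ** 3, t ** 2, t, np.ones_like(t)])
    spans = [powers @ _BASIS @ points[i:i + 4] for i in range(len(points) - 3)]
    return np.vstack(spans)


# Hypothetical backbone: six atoms placed along a gentle arc.
backbone = [(i, 0.2 * i * i, 0.0) for i in range(6)]
curve = uniform_cubic_bspline(backbone, samples_per_span=10)
```

Note that a uniform B-spline approximates its control points rather than passing through them, and that the samples are equally spaced in the parameter, not along the curve.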
To approximate a molecular backbone using a B-Spline
proxy curve, control points need to be specified on the
backbone. There are two choices for selecting control points:
(i) coordinates of backbone atoms, and (ii) a simplified
model of the geometry of the backbone. Both approaches for
selecting control points result in proxy curves that are
approximations of the geometry of the original backbone;
however, the latter approach based on simplification allows
control over the amount of detail of the backbone that is
retained in the proxy curve. The ability to vary the level of
detail is sometimes useful to eliminate local "noise" in molecular
geometry (for example, zig-zag directions in a small
neighborhood); the simplification thus acts as a smoothing
operator prior to the subsequent curve-matching step.
We adapt a curve simplification that is based on standard
geometric techniques for simplifying a three-dimensional
curve.[27,28] The level of simplification is controlled by a
scalar parameter, the error, that is based on the differences
in length of the original curve and the simplified
representation. Previously, Agarwal et al.[29] used such
an approach to compare the three-dimensional spatial
structures of proteins. Figure 6 gives examples of spline
curves generated for a conformation of the PAC polymer
from a single time step of an MD simulation. Note that
although simplification generates a smoother spline with
some loss of information, major features of the conformation
are still preserved, such as loop regimes.

Figure 6. Illustration of the parametric spline curves constructed to represent the polymer backbone curve. (a) The original backbone of the PAC polymer from an MD simulation, (b) standard spline constructed based on backbone atoms, (c) a simplified model of the polymer chain, and (d) a spline derived from the simplified model of the polymer in (a). A colorized version of this figure is also available as Supporting Information.
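
To illustrate the kind of simplification discussed above, the following Python sketch implements the classic Douglas-Peucker algorithm[27] for a three-dimensional polyline. This is a stand-in rather than the procedure used in this work: it uses a point-to-segment distance tolerance, whereas the error parameter described above is based on differences in curve length. The names simplify_backbone and tolerance are hypothetical.

```python
import numpy as np


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b in 3D."""
    ab = b - a
    denom = float(np.dot(ab, ab))
    if denom == 0.0:
        # Degenerate segment: a and b coincide.
        return float(np.linalg.norm(p - a))
    t = np.clip(np.dot(p - a, ab) / denom, 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))


def simplify_backbone(points, tolerance):
    """Douglas-Peucker simplification; returns indices of the retained points."""
    points = np.asarray(points, dtype=float)
    keep = np.zeros(len(points), dtype=bool)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]
    while stack:
        lo, hi = stack.pop()
        if hi - lo < 2:
            continue  # no interior points left to test
        dists = [point_segment_distance(points[i], points[lo], points[hi])
                 for i in range(lo + 1, hi)]
        k = int(np.argmax(dists))
        if dists[k] > tolerance:
            # Keep the farthest interior point and recurse on both halves.
            mid = lo + 1 + k
            keep[mid] = True
            stack.append((lo, mid))
            stack.append((mid, hi))
    return np.flatnonzero(keep)


# Hypothetical zig-zag backbone: small oscillations around a straight line.
zigzag = [(float(i), 0.1 * (-1) ** i, 0.0) for i in range(11)]
kept = simplify_backbone(zigzag, tolerance=0.5)
```

With a tolerance larger than the zig-zag amplitude, only the two end points are retained, which mirrors the removal of local "noise" described above. The retained points could then serve as control points for the proxy spline.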
Another consideration is that the approach we use to
build invariant descriptions of curve segments (discussed
later in this section) requires a curve to be sampled at equal
intervals of arc length. In the standard construction of a B-
Spline, the parametric space is sampled at regular intervals
and basis functions are interpolated to generate a
continuous curve. However, sampling regularly in the
parametric space does not guarantee equal lengths
between points on the generated proxy curve. Therefore,
a technique called arc-length parameterization has been
adapted to re-sample the spline curve to obtain points on
the curve that have some fixed, equal distance between
them. A numerical reparameterization method is
employed that is available in a geometry processing
toolkit called GeometricTools.[26] Typical values of arc
length used in this work were in the range $l \in [0.01, 0.03]$,
where higher values of arc length result in sparser
sampling of a proxy spline.
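
The idea of arc-length re-sampling can be sketched as follows. This is a simple polyline-based approximation, not the numerical reparameterization provided by GeometricTools: the curve is assumed to be available as a dense sequence of points (for example, a finely sampled spline), the cumulative chord length is used as an estimate of arc length, and new points are obtained by linear interpolation. The function name resample_by_arc_length and the parameter spacing are hypothetical.

```python
import numpy as np


def resample_by_arc_length(curve_points, spacing):
    """Return points spaced `spacing` apart along a densely sampled curve.

    Arc length is approximated by the cumulative length of the polyline
    through `curve_points`, so the input should be sampled finely.
    """
    pts = np.asarray(curve_points, dtype=float)
    seg_len = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    cum_len = np.concatenate(([0.0], np.cumsum(seg_len)))
    # The small epsilon guards against round-off when the total length
    # is an exact multiple of the spacing.
    n_steps = int(np.floor(cum_len[-1] / spacing + 1e-9))
    targets = spacing * np.arange(n_steps + 1)
    # Interpolate each coordinate as a function of cumulative length.
    return np.column_stack([np.interp(targets, cum_len, pts[:, k])
                            for k in range(pts.shape[1])])


# Hypothetical curve sampled unevenly in its parameter.
t = np.linspace(0.0, 1.0, 200) ** 2
curve = np.column_stack([t, t ** 2, np.zeros_like(t)])
resampled = resample_by_arc_length(curve, spacing=0.02)
```

The spacing here is measured along the curve, so the straight-line distance between consecutive re-sampled points is slightly shorter than the spacing wherever the curve bends.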
Description of Molecular Conformations
There are a number of standard geometric properties that
can be used to obtain unique descriptions or signatures of
the three-dimensional geometry of the proxy curves.
Examples of the geometric properties are coordinates of
vertices on the curve, tangents, curvature, and torsion.
Additionally, domain specific information may also be
included to describe proxy curves associated with mole-
cular backbones; some common examples are bond
directions, dihedral angles, and attributes of residues
(e.g., the types of amino acid units in proteins).
We employ a general curve description technique called
similarity invariant coordinate system (SICS)[30] to build
feature vectors that uniquely describe curve segments on
polymer conformations. We chose the SICS-based approach
because it is relatively straightforward to implement and
provides feature descriptions that are invariant under
transformations such as rotations and translations.
Figure 7 illustrates the definition and construction of a local
reference frame for curve segments using the SICS method. In
the SICS method, a description of three-dimensional curve
segments is obtained by re-defining the geometric informa-
tion of a segment in terms of a local coordinate reference
frame. This local reference frame provides a useful descrip-
tion of curve segments and one that is invariant under
transformations such as rotation, scaling, and translations
(for details about this method please refer to ref. [30]).
Once a local frame is defined for a given curve segment, a
number of geometric quantities is used to define a feature
vector that uniquely represents the curve segment.
Examples of some standard geometric quantities that can
be used to build feature vectors include three-dimensional
coordinates of points sampled on the curve segment, three-
dimensional coordinates of tangents at sample points,
angles between sample points and the mid point of the line
joining the terminal points of the segment, and angles
between the tangents at sample points and the local frame.
Other geometric information that can be used includes
curvature, inter-point distances, ratios of distances
between points, and higher-order derivatives.
Figure 7. Illustration of the construction of a local reference frame for defining feature vectors for curve segments using the SICS method.[30] A colorized version of this figure is also available as Supporting Information.

A particular feature of the SICS method is that it is
designed to perform scale-invariant matching. However,
since the primary interest is matching substructures at
comparable scales, we provide an option to override scale
invariance; scale invariance is overridden by including
inter-point distances, because relationships among a set of
inter-point distances are uniquely defined for a given curve
segment. Therefore, given a curve segment $s = \langle u_i \rangle$ having
$k$ points at equal arc length, a feature vector can be defined
as follows:

$$f_{vec} = \left\langle d_{xyz},\ d_{\theta},\ t_{xyz},\ t_{\phi},\ \kappa,\ \tau,\ \langle d_{ij} \rangle,\ \frac{\mathrm{arc\_len}(u_1, \ldots, u_N)}{|u_N - u_1|} \right\rangle \qquad (8)$$
where the various components of the feature vector are
defined in Table 2. During experimentation, we found that
including curvature and torsion information did not
greatly alter the matching results; we therefore excluded
them from the feature vectors to reduce the number of
calculations.
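
A simplified sketch of such a feature vector is given below. It is not the SICS construction of ref. [30]: the local frame used here (origin at the midpoint of the chord, first axis along the chord, second axis toward the point farthest from the chord) is an assumption made for illustration, the angle components are omitted for brevity, and curvature and torsion are excluded as described above. The names local_frame, feature_vector, and keep_scale are hypothetical, and the segment is assumed to have distinct end points and no repeated points.

```python
import numpy as np


def local_frame(seg):
    """Origin and orthonormal axes (rows) of an assumed local reference frame."""
    origin = 0.5 * (seg[0] + seg[-1])
    chord = seg[-1] - seg[0]
    e1 = chord / np.linalg.norm(chord)
    rel = seg - origin
    perp = rel - np.outer(rel @ e1, e1)        # components orthogonal to the chord
    norms = np.linalg.norm(perp, axis=1)
    j = int(np.argmax(norms))
    if norms[j] < 1e-12:
        # Straight segment: any unit vector orthogonal to e1 is acceptable.
        helper = np.eye(3)[int(np.argmin(np.abs(e1)))]
        e2 = np.cross(e1, helper)
        e2 /= np.linalg.norm(e2)
    else:
        e2 = perp[j] / norms[j]
    e3 = np.cross(e1, e2)
    return origin, np.stack([e1, e2, e3])


def feature_vector(seg, keep_scale=True):
    """Concatenate local coordinates, local tangents, inter-point distances,
    and the arc-length-to-chord ratio of one curve segment."""
    seg = np.asarray(seg, dtype=float)
    origin, frame = local_frame(seg)
    chord_len = np.linalg.norm(seg[-1] - seg[0])
    scale = 1.0 if keep_scale else chord_len   # divide lengths out if scale-free
    d_xyz = (seg - origin) @ frame.T / scale
    tangents = np.gradient(seg, axis=0)        # finite-difference tangents
    tangents /= np.linalg.norm(tangents, axis=1, keepdims=True)
    t_xyz = tangents @ frame.T
    i, j = np.triu_indices(len(seg), k=1)
    d_ij = np.linalg.norm(seg[i] - seg[j], axis=1) / scale
    arc_len = np.linalg.norm(np.diff(seg, axis=0), axis=1).sum()
    return np.concatenate([d_xyz.ravel(), t_xyz.ravel(), d_ij,
                           [arc_len / chord_len]])


# Hypothetical segment with 7 points on a twisted curve.
t = np.linspace(0.0, 1.0, 7)
segment = np.column_stack([t, t ** 2, t ** 3])
fvec = feature_vector(segment)
```

Because every component is expressed relative to the local frame or as a distance, the vector is unchanged when the segment is rotated or translated. Setting keep_scale=False additionally divides all lengths by the chord length, which restores scale invariance.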
Finally, to use feature vectors to compare segments of
proxy curves, we have adopted a simple scheme to uniquely
describe and identify each curve segment in a set. According
to the scheme, a segment is defined as follows:
$$seg = (timeid,\ startPos,\ endPos,\ f_{vec}) \qquad (9)$$
where timeid is the time stamp of the frame in an MD
simulation in which the segment occurs; startPos and
endPos are indexes of the terminal points of the segment on a
regularly sampled proxy spline curve; and $f_{vec}$ is a feature
vector based on Equation (8) and computed using points on
the curve segment.

Table 2. Components of feature vectors described by Equation (8). The components are associated with a curve segment that is illustrated in Figure 7.

Component                                      Description
$d_{xyz}$                                      Three coordinates of the point $u_i$ on a segment of a piecewise linear curve
$d_{\theta}$                                   Three angles between the vector joining the midpoint of line segment $v = u_N - u_1$ and the origin of the local reference frame
$t_{xyz}$                                      Three coordinates of the tangent at point $u_i$
$t_{\phi}$                                     Three angles between the tangent and the basis vectors of the reference frame $\langle e_1, e_2, e_3 \rangle$
$\kappa, \tau$                                 Normalized values of curvature and torsion of the curve at $u_i$
$\langle d_{ij} \rangle$                       A set of $k(k-1)/2$ inter-point distances between the $k$ points on the segment
$\mathrm{arc\_len}(u_1, \ldots, u_N) / |u_N - u_1|$   Ratio of the length of the segment and the length of the line joining its terminals, i.e., $|u_N - u_1|$

Matching of Substructures
Algorithm 1 outlines the main steps of an iterative
approach to perform substructure matching of data from
MD simulation trajectories. In order to perform the curve
matching, we divide the arc-length parameterized
spline curve representing a polymer conformation into
multiple segments. Segment definitions such as segment
size and relative offsets of adjacent segments are set by the
user prior to curve matching. The segments are compared
with one another using the RMSD error between corresponding
feature vectors. These comparisons are performed
for all possible unique pairs of curve segments. The RMSD
comparison for each pair of feature vectors produces a single
scalar value, $d \in [0, 1]$, that represents the similarity between
the two segments (low values indicate closely matching curve
segments). An adjustable range slider is used
to set the maximum and minimum value
of RMSD to control the set of resulting
matches. Finally, we use a simple segment-matching
procedure to compile sets
of similar segments. Each of these sets
represents a unique feature or substructure
of the parameterized splines.
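
The matching steps described above can be sketched in Python as follows. Several details are assumptions made for illustration and are not specified in the text: the RMSD values are mapped to [0, 1] by dividing by the largest pairwise value, and similar segments are compiled into sets with a simple greedy pass. The function names split_into_segments, pairwise_rmsd, and group_matches are hypothetical.

```python
import numpy as np


def split_into_segments(curve, time_id, size, offset):
    """Slide a window of `size` points along a resampled curve.

    Returns tuples (time_id, start_index, end_index, points), following the
    segment definition of Equation (9) with the points in place of f_vec.
    """
    return [(time_id, start, start + size - 1, curve[start:start + size])
            for start in range(0, len(curve) - size + 1, offset)]


def pairwise_rmsd(feature_vectors):
    """RMSD between all pairs of feature vectors, scaled to the range [0, 1]."""
    F = np.asarray(feature_vectors, dtype=float)
    diff = F[:, None, :] - F[None, :, :]
    rmsd = np.sqrt((diff ** 2).mean(axis=2))
    top = rmsd.max()
    return rmsd / top if top > 0 else rmsd


def group_matches(dist, max_dist):
    """Greedy grouping: each unassigned segment seeds a set that collects all
    remaining segments within `max_dist` of the seed."""
    unassigned = list(range(len(dist)))
    groups = []
    while unassigned:
        seed = unassigned[0]
        members = [i for i in unassigned if dist[seed, i] <= max_dist]
        groups.append(members)
        unassigned = [i for i in unassigned if i not in members]
    return groups


# Hypothetical feature vectors for four segments: two pairs of look-alikes.
fvecs = [[0.0, 0.0], [0.0, 0.1], [5.0, 5.0], [5.0, 5.1]]
groups = group_matches(pairwise_rmsd(fvecs), max_dist=0.1)
```

The all-pairs comparison is quadratic in the number of segments, and the outcome of a greedy grouping depends on the order in which segments are visited.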
Performance optimizations can be used
to reduce the number of comparisons;
such techniques include hashing,
binning, noise reduction, and the use of
chemical and biological alignment.[17,24,25]
Another approach to matching
substructures is described by Li,[30] which
is based on comparing relationships
among substructures using a graph-based
approach. In ongoing work, we
are focusing on improving the performance
of our substructure matching approach using some of
these optimization techniques.
The methodology of matching similar segments can be
used to perform substructure matching and comparisons.
The structural comparisons include:
(i) Identification of salient features and substructures:
determine the most commonly occurring features of
polymer conformations in molecular simulations or
among a set of related polymers.
(ii) Exploration of specific features: explore the spatial
and temporal distributions of specific interesting
features on polymer conformations. A specific feature
is selected in one time step on a conformation and is
searched on all conformations and over all time steps.
This information is used to explore spatial arrange-
ments such as alignment of multiple segments. An
example is the spatial-temporal alignment of sub-
structures during polymer crystallization.
(iii) Compare global structures of related polymers:
determine similarity relationships among related
polymers or among conformations of a polymer
under different scenarios, such as polymer conforma-
tions with and without nanoparticles (e.g., CNTs)
present. The comparison of related systems can
provide information on the types of substructures
that can persist near interfaces when polymers
interact with other systems, such as nanoparticles.
Substructure Visualization
Interactive visualization techniques are utilized to display
and explore spatial and temporal distributions of salient
substructures of polymer conformations. An example is
represented in Figure 8. In this example, an interactive
histogram is provided to support fast exploration of all of
the sets of common substructures found by the structure
matching system. The bars in the histogram represent the
different types of matching substructures found by the
system, and the height of the bars corresponds to the
number of substructures of each type. Moreover, the
histogram is linked to the molecular visualization screen
to show the matching substructures for each set. The spatial
distribution of the substructures in a single set can be
visualized by selecting the corresponding histogram bar
(shown highlighted in yellow). All segments corresponding
to the selected set of matching substructures are visualized
as curve segments; the segment of the molecular chain in
the currently selected time step is indicated as a thick pink
curve. Furthermore, by hovering over the bars in the
histogram, a user can quickly inspect various substructures
found by the system. A visualization of the temporal
distribution of the substructures in a set is provided by a
timeline (refer to the bottom of Figure 8) and markers above
the timeline to indicate the occurrence of a matched
structure at corresponding time steps. Note that the
timeline is over the entire MD trajectory, which in this
case is 3 ns but has been parsed into 661 steps in this
visualization.
Another option for visualizing relationships among
matched features is to use standard graph layout techni-
ques to display similarity relationships among the features.
One example of such an approach is MDS,[7,31,32] which is a
technique to display proximity relationships among a large
number of items. MDS is a general dimensionality
reduction technique that produces an embedding of
high-dimensional data points (i.e., points in $\mathbb{R}^N$) in a
low-dimensional space ($\mathbb{R}^d$, $d \ll N$) such that inter-point
distance relationships are close to the original distances among
the points in high dimensions. A stress function is used to
calculate the error or lack of fit between dissimilarities in
high-dimensional space and the corresponding distances in the
low-dimensional embedding.

Figure 8. A snapshot of our visualization system, which illustrates a set of matched features of polymer conformations (left) that was generated using our substructure matching methodology. Each substructure in this example is an entire conformation of the PAC polymer. Two other panels in the visualization provide supplementary information using a histogram (right-top) of the number and distribution of the unique features of the polymer and using a layout (right-bottom) based on MDS to display similarity relationships among all features. In the MDS display, each dot is a single feature and dots close to one another represent similar features. The dots are color coded based on the time stamp of the frame in which the corresponding feature occurs. A colorized version of this figure is also available as Supporting Information.

Multidimensional scaling (MDS) has been adapted to
visualize proximity relationships among clusters of similar
features of molecules. However, the standard MDS method
has quadratic complexity and is therefore usually not
suitable for exploring large data sets. Consequently, many
methods have been developed that combine MDS with
other feature-learning methods[33] or use modified forms of
MDS.[34] In a recent work, Rajan et al.[35] described a general,
non-metric MDS approach to cluster trajectories in protein
folding. Their method is able to discriminate small
differences between shapes of protein trajectories.
In our initial experiments, we have explored MDS to
generate visualizations of similarity relationships among
substructures of polymer conformations. Since the current
data sets used in our work are relatively small (typically
having up to a few thousand features, depending on the
segment size set by a user), we chose to use a standard
implementation of MDS. An example application of this
technique is given in a plot in the bottom-right of Figure 8.
The procedure used in our work to visualize relationships
among polymer substructures is as follows: first, we
generate an $N \times N$ similarity matrix that specifies pairwise
similarity scores for all unique features of polymer
conformations. The comparison of polymer features is
based on the SICS approach described earlier in this section.
Next, a fast iterative MDS algorithm[36] is used to compute a
two-dimensional layout of points, where each point
represents a polymer substructure.
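
As an illustration of the layout step, the sketch below computes a two-dimensional embedding with classical (Torgerson) MDS and evaluates a normalized stress value. Classical MDS is used here only because it is compact; it is not the fast iterative algorithm of ref. [36]. The input is assumed to be a symmetric matrix of pairwise dissimilarities (for example, RMSD values between feature vectors), and the function names classical_mds and stress are hypothetical.

```python
import numpy as np


def classical_mds(D, dim=2):
    """Embed N items in `dim` dimensions from an N x N dissimilarity matrix."""
    D = np.asarray(D, dtype=float)
    n = len(D)
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argsort(eigvals)[::-1][:dim]      # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


def stress(D, X):
    """Normalized lack of fit between dissimilarities D and layout distances."""
    D = np.asarray(D, dtype=float)
    i, j = np.triu_indices(len(D), k=1)
    d_low = np.linalg.norm(X[i] - X[j], axis=1)
    return float(np.sqrt(((D[i, j] - d_low) ** 2).sum() / (D[i, j] ** 2).sum()))


# Hypothetical dissimilarities between four substructures.
D = np.array([[0.0, 0.1, 0.9, 1.0],
              [0.1, 0.0, 0.8, 0.9],
              [0.9, 0.8, 0.0, 0.2],
              [1.0, 0.9, 0.2, 0.0]])
layout = classical_mds(D, dim=2)
fit_error = stress(D, layout)
```

A stress value near zero indicates that the two-dimensional layout reproduces the original dissimilarities well; larger values suggest that proximity in the plot should be interpreted with caution.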
The MDS graph in Figure 8 for PAC suggests four distinct
groupings as a function of simulation time. The final
timesteps (dots in group A in the figure) indicate less
dissimilarity than the rest, suggesting the equilibrium
structure remains relatively unchanged; the differences in
the MDS points suggest some local structural fluctuations
within this equilibrium structure. The time frame prior to
that (dots in group B in the figure) has somewhat more
dissimilarity than the final time steps, but overall the global
structure maintains some structural order. The conforma-
tions of PAC that have the most dissimilarity are those from
early-to-mid timesteps, indicated at the bottom of the
graph (group C in the figure). In this time window, the PAC
polymer finds the CNT and begins to adsorb onto the
surface, thus large structural deviations are expected as the
conformation changes from relatively linear/coiled within
itself to coiled and folded along the CNT surface. Early time
steps (dark gray, middle; group D in the figure) also show
some dissimilarity as the system is searching for energetically
favorable conformations during the equilibration,
but not as much as for the adsorption process. For flexible
polymers with many more degrees of freedom with respect
to the polymer backbone,[10] the MDS plot is expected to
exhibit more dissimilarity with respect to polymer–CNT
interactions.
Conclusion
We adapted visual-analytic methods to develop a tool for
analyzing MD trajectories of polymer systems. This tool
combines computational methods such as feature match-
ing and MDS to determine persistent substructures of
polymer conformations. In addition, standard visualization
techniques such as interactive ATV plots and linked
displays were implemented to support exploratory analysis
of spatial-temporal relationships among matched features.
Although atomistic MD simulations were used as a
test case, this tool can be generally applied to any
particle-based simulation method. We are currently using this
tool to examine some interesting characteristics of polymer
simulations, which will be the subject of future
reports.
We plan to extend this work to develop methods for
investigating multi-chain polymer systems that provide
their own level of complexity but connect to some
experimentally-relevant information such as the degree
of entanglements and crystal nucleation. Connecting
molecular details to macroscopic properties is also a goal.
Acknowledgements: The authors thank Syamal Tallury for providing the datasets and for insightful discussions. This work was funded by the Renaissance Computing Institute (www.renci.org).
Received: November 12, 2010; Revised: February 12, 2011; Published online: April 5, 2011; DOI: 10.1002/mats.201000086
[1] H. Yang, Y. Chen, Y. Liu, W. S. Cai, Z. S. Li, J. Chem. Phys. 2007, 127, 094902.
[2] H. Yang, S. Parthasarathy, D. Ucar, Alg. Mol. Biol. 2007, 2, 3.
[3] Q. Zheng, Q. Xue, K. Yan, L. Hao, Q. Li, X. Gao, J. Phys. Chem. C 2007, 111, 4628.
[4] T. He, R. S. Porter, Macromol. Theory Simul. 1992, 1, 119.
[5] W. Brostow, M. Drewniak, N. N. Medvedev, Macromol. Theory Simul. 1995, 4, 745.
[6] W. Humphrey, A. Dalke, K. Schulten, J. Mol. Graphics 1996, 14, 33.
[7] T. F. Cox, M. A. A. Cox, Multidimensional Scaling, 2nd edition, Chapman and Hall, New York, NY 2001.
[8] R. Patro, Y. Kim, C. Y. Ip, A. Anishkin, S. Sukharev, D. P. O’Leary, A. Varshney, Scientific Visualization: Advanced Concepts, Schloss Dagstuhl–Leibniz Center for Informatics, Dagstuhl, Germany 2010.
[9] S. S. Tallury, M. A. Pasquinelli, J. Phys. Chem. B 2010, 114, 9349.
[10] S. S. Tallury, M. A. Pasquinelli, J. Phys. Chem. B 2010, 114, 4122.
[11] W. Smith, Mol. Simul. 2006, 32, 933.
[12] S. L. Mayo, B. D. Olafson, W. A. Goddard, J. Phys. Chem. 1990, 94, 8897.
[13] S. Thakur, S. Tallury, M. Pasquinelli, T.-M. Rhyne, in: Visualization of the Molecular Dynamics of Polymers and Carbon Nanotubes, Springer-Verlag, Berlin, Heidelberg 2009, pp. 129–139.
[14] S. Thakur, S. Tallury, M. Pasquinelli, T.-M. Rhyne, Exploration of Polymer Conformational Similarities in Polymer-Carbon Nanotube Interfaces, IEEE Computer Society, 2010, pp. 320–323.
[15] C. Best, H.-C. Hege, Comput. Sci. Eng. 2002, 4, 68.
[16] J. Kim, R. Singh, Bioinformatics Research and Applications, Vol. 6053, Springer, Berlin, Heidelberg 2010, pp. 77–88.
[17] T. Can, Y.-F. Wang, Comput. Syst. Bioinf. Conf., Int. IEEE Comput. Soc. 2003, 169.
[18] G. Vriend, C. Sander, Proteins: Struct., Funct., Genetics 1991, 11, 52.
[19] H. Matsuda, F. Taniguchi, A. Hashimoto, Proc. Pacific Symp. Biocomputing 1997, 2, 280.
[20] K. Kedem, L. Chew, R. Elber, Proteins: Struct., Funct., Genetics 1999, 37, 554.
[21] L. P. Chew, K. Kedem, Algorithmica 2003, 38, 115.
[22] H. Sugeta, T. Miyazawa, Biopolymers 1967, 5, 673.
[23] P. Enkhbayar, S. Damdinsuren, M. Osaki, N. Matsushima, Comput. Biol. Chem. 2008, 32, 307.
[24] M. Coatney, S. Parthasarathy, Knowledge Inf. Syst. 2005, 7, 202.
[25] S. A. Aghili, D. Agrawal, A. E. Abbadi, Database Systems for Advanced Applications, Vol. 3453, Springer, Berlin, Heidelberg 2005, pp. 993–993.
[26] P. J. Schneider, D. Eberly, Geometric Tools for Computer Graphics, Elsevier Science Inc., New York, NY, USA 2002.
[27] D. Douglas, T. Peucker, Can. Cartographer 1973, 10, 112.
[28] M. A. Abam, M. de Berg, P. Hachenberger, A. Zarei, Discrete Comput. Geometry 2010, 497.
[29] P. K. Agarwal, S. Har-Peled, N. H. Mustafa, Y. Wang, Algorithmica 2005, 42, 203.
[30] S. Z. Li, Pattern Recognit. 1997, 30, 447.
[31] A. Buja, D. F. Swayne, M. L. Littman, N. Dean, H. Hofmann, L. Chen, J. Comput. Graph. Stat. 2008, 17, 444.
[32] S. Ingram, T. Munzner, M. Olano, IEEE Trans. Visual. Comput. Graph. 2009, 15, 249.
[33] D. K. Agrafiotis, D. N. Rassokhin, V. S. Lobanov, J. Comput. Chem. 2001, 22, 488.
[34] Y. Cao, T. Jiang, T. Girke, Bioinformatics 2010, 26, 953.
[35] A. Rajan, P. L. Freddolino, K. Schulten, PLoS One 2010, 5, e9890.
[36] C. Bentley, M. Ward, Proc. IEEE Symp. Inf. Visual. 1996, 72.
[37] G. Maggiora, V. Shanmugasundaram, Methods Mol. Biol.