Metalogix Replicator for SharePoint: version 4 supports SharePoint Foundation 2010, SharePoint Server 2010, Office SharePoint Server 2007 and Windows SharePoint Services
The release of Metalogix Replicator version 4 represents a new performance and scalability milestone for SharePoint replication solutions. Version 4 includes new features for supporting massively scalable SharePoint replication networks:
This document provides additional detailed information on the Replicator performance feature set. The first section describes each component of the Replicator performance feature set. The second section includes in-depth discussions of each feature as well as new benchmark testing performed at the Microsoft Technology Center in New York, NY and the Metalogix Network Test Lab. The final section of the document analyses Replicator from an operational point of view.
Microsoft Hyper-V server virtualization technology
65 SharePoint 2010 Windows 2008 virtual servers
16 Windows 2008 R2 host physical servers
128 GB of physical RAM
1.2 TB of physical disk storage
metalogix.com WHITE PAPER
The Replicator version 4 performance feature set builds upon many components that have been built into Replicator over the previous 3 major releases and 7 years of development effort. The new or significantly improved version 4 features are marked with an asterisk.
By default, Replicator includes the ability to group or batch multiple Replication Events into a single Replication Package for processing and transfer to the Target Web Application. This Configurable Package Event Count determines the maximum number of Events that can be packaged into a single Replication Package. This enables Replicator to be tuned for different levels of real-time replication, wide area network performance characteristics and available Inbound and Outbound Event Processing memory and CPU processing resources.
Configurable Package Event Processing Duration
Similar to the Configurable Package Event Count feature, the Configurable Package Event Processing Duration allows control over how many Replication Events are grouped or batched together into a single Replication Package based on elapsed Outbound Event Processing time. This setting enables Replicator local server resource requirements to be tuned for optimal performance.
Replication Package Compression
Replicator supports several forms of software compression in addition to support for and compatibility with a variety of hardware network compression devices. Replicator can use either software or hardware compression to reduce the amount of Replication Package data transferred over a wide area network.
Metalogix Replicator supports a custom implementation of Microsoft’s Remote Differential Compression (RDC) technology that Metalogix has optimized to reduce the total amount of Replication Package data that needs to be transferred between the Source Web Application and the Target Web Application during replication.
By default, many customers use web application-to-web application replication – replicating the entire content of each site collection in one web application to the second web application. Selective structure replication enables the SharePoint Administrator to select a specific subset of the Source Web Application structure to be replicated to the Target Web Application.
Rule-based Item-level Content Replication
Rule-based item-level content replication uses the Metalogix Replicator Rules Engine to process custom rule sets to determine if an item in a particular list or document library should be replicated.
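The idea of evaluating a rule set against each item can be sketched in a few lines of Python. This is only an illustration: the paper does not describe the Rules Engine's rule syntax, so the `Rule` class, the item fields (`status`, `size_bytes`) and the choice that every rule must accept an item are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Item = Dict[str, Any]  # hypothetical stand-in for a list item or document


@dataclass
class Rule:
    """One named condition in a rule set (illustrative, not Replicator's format)."""
    name: str
    predicate: Callable[[Item], bool]


def should_replicate(item: Item, rules: List[Rule]) -> bool:
    """Assumption: an item is replicated only if every rule accepts it."""
    return all(rule.predicate(item) for rule in rules)


# Hypothetical rule set: replicate approved items smaller than 50 MB.
rules = [
    Rule("approved only", lambda item: item.get("status") == "Approved"),
    Rule("under 50 MB", lambda item: item.get("size_bytes", 0) < 50 * 1024 * 1024),
]

approved = {"status": "Approved", "size_bytes": 1024}
draft = {"status": "Draft", "size_bytes": 1024}
print(should_replicate(approved, rules), should_replicate(draft, rules))
```

Filtering at item level like this reduces the number of Replication Events that reach packaging and transfer, which is where the performance benefit would come from.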
Selectable Replication Events
Replicator supports selectable replication of changes that occur in a SharePoint Web Application, Site Collection, Web Site, List or Document Library. The different types of individual changes are called Replication Events. Replication Events are categorized into higher-level Event Groups. By selecting which Events need to be replicated and which Events don’t need to be replicated, the Selectable Replication Events feature provides the SharePoint Administrator with fine-grained control over the Events processing during Inbound Event Processing, Package Transfer and Outbound Event Processing; which in turn helps improve overall Replicator performance.
In Replicator version 4, upon acceptance of the transfer of an Inbound Replication Package, Replicator immediately caches the metadata for each Replication Event in the Replicator Configuration database. This improves the performance of the Replicator Engine by eliminating the need to repeatedly access the serialized Package data; especially in SharePoint farms that host multiple Replicator Engines.
The new Backup Mode feature provides a more efficient method of replicating a large site collection or hierarchy of web sites. Using Backup Mode, Replicator uses the SharePoint import and export operations to create a single archive of the entire group of web sites, and queues the replication of this archive as a single Replication Event.
Replicator Enterprise Edition enables the Replicator Engine to be deployed on multiple web front-ends in each farm. With multiple web front-ends hosting the Replicator Engine in a farm, Replicator provides a higher availability and higher performance replication solution; in addition to supporting increased scalability. If one engine is stopped, the other engines are still operational.
By default, Replicator includes the ability to group or batch multiple Replication Events into a single Replication Package for processing and transfer to the Target Web Application. This Configurable Package Event Count determines the maximum number of Events that can be packaged into a single Replication Package. This enables Replicator to be tuned for different levels of real-time replication, wide area network performance characteristics and available Inbound and Outbound Event Processing memory and CPU processing resources.
These settings enable fine-grained control over when Outbound Event Processing and Package Transfer will occur. This in turn allows the SharePoint Administrator to better manage SharePoint server and wide area network resources.
Figure 3. Map Family Replication Schedule
Similar to the Configurable Package Event Count feature, the Configurable Package Event Processing Duration allows control over how many Replication Events are grouped or batched together into a single Replication Package based on elapsed Outbound Event Processing time. This setting enables Replicator local server resource requirements to be tuned for optimal performance.
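The interaction of the two limits, a maximum event count and a maximum elapsed processing time, can be illustrated with a small batching sketch. The class and parameter names (`PackageBatcher`, `max_events`, `max_seconds`) are hypothetical, and the logic is a generic count-or-time batcher under those assumptions rather than Replicator's actual implementation.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class PackageBatcher:
    """Closes a package when either the event count or the elapsed time limit is hit."""
    max_events: int                                # stand-in for the Package Event Count
    max_seconds: float                             # stand-in for the Processing Duration
    clock: Callable[[], float] = time.monotonic    # injectable so tests can fake time

    def batch(self, events: Iterable[str]) -> Iterator[List[str]]:
        package: List[str] = []
        started = self.clock()
        for event in events:
            if not package:
                started = self.clock()             # timing starts with the first event
            package.append(event)
            too_many = len(package) >= self.max_events
            too_long = self.clock() - started >= self.max_seconds
            if too_many or too_long:
                yield package
                package = []
        if package:
            yield package                          # flush the final partial package


batcher = PackageBatcher(max_events=3, max_seconds=3600.0)
for package in batcher.batch(f"event-{i}" for i in range(7)):
    print(len(package), package)
```

A small event count with a short duration approximates near real-time replication at the cost of more packages; larger values trade latency for fewer, bigger transfers.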
When hardware network compression devices, bandwidth optimization appliances, or network accelerator solutions are available (such as the Riverbed® Steelhead® Appliance), Replicator can be configured to minimize the server resources used for software compression and maximize the effectiveness of the network compression device or application. A sample deployment is illustrated in Figure 4. Metalogix Replicator and Riverbed® Steelhead® Appliance Compressed Replication Package Solution.
Replicator supports ZIP software package compression and a custom implementation of Microsoft’s Remote Differential Compression (RDC) that is highly optimized for the one-way and bi-directional replication of SharePoint data.
“Remote Differential Compression (RDC) allows data to be synchronized with a remote source using compression techniques to minimize the amount of data sent across the network. RDC is suitable for applications that move data across a wide area network (WAN) where the data transmission costs outweigh the CPU cost of signature computation. RDC can also be used on faster networks if the amount of data to be transferred is relatively large and the changes to the data are typically small.”1 More specific details on RDC can be found in Appendix A – About Windows Remote Differential Compression.
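As a rough illustration of the ZIP option, the sketch below compresses a set of package entries in memory with Python's standard `zipfile` module and restores them again. The entry names and the idea of a package being a mapping of names to bytes are assumptions for the example; the paper does not describe the on-disk Package format.

```python
import io
import zipfile
from typing import Dict


def zip_package(entries: Dict[str, bytes]) -> bytes:
    """Compress hypothetical package entries into a single in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in entries.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def unzip_package(blob: bytes) -> Dict[str, bytes]:
    """Restore the entries on the receiving side."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


# Repetitive, XML-like payloads compress well; already-compressed media would not.
entries = {"events.xml": b"<Field>value</Field>" * 5000}
blob = zip_package(entries)
print(len(entries["events.xml"]), "bytes raw ->", len(blob), "bytes zipped")
```

When a hardware compression appliance sits on the WAN link, skipping this software step saves server CPU and leaves the appliance uncompressed data to work with, which matches the configuration choice described above.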
Replicator supports selectable replication of changes that occur in a SharePoint Web Application, Site Collection, Web Site, List or Document Library. The different types of individual changes are called Replication Events. Replication Events are categorized into higher-level Event Groups. By selecting which Events need to be replicated and which Events don’t need to be replicated, the Selectable Replication Events feature provides the SharePoint Administrator with fine-grained control over the Events processing during Inbound Event Processing, Package Transfer and Outbound Event Processing; which in turn helps improve overall Replicator performance.
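Selecting Events by group and then excluding individual Events can be sketched as a simple set operation. The group names and event type names below are invented for illustration and are not the actual Replicator Event catalogue.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical Event Groups; the names are for illustration only.
EVENT_GROUPS: Dict[str, Set[str]] = {
    "ListItems": {"ItemAdded", "ItemUpdated", "ItemDeleted"},
    "Security": {"RoleAssignmentAdded", "RoleAssignmentDeleted"},
}


def enabled_events(selected_groups: Iterable[str],
                   excluded: Iterable[str] = ()) -> Set[str]:
    """Expand the selected groups, then drop individually excluded Events."""
    enabled: Set[str] = set()
    for group in selected_groups:
        enabled |= EVENT_GROUPS[group]
    return enabled - set(excluded)


def filter_events(captured: List[Tuple[str, str]],
                  enabled: Set[str]) -> List[Tuple[str, str]]:
    """Keep only captured (event_type, target) pairs whose type is enabled."""
    return [(etype, target) for etype, target in captured if etype in enabled]


enabled = enabled_events(["ListItems"], excluded=["ItemDeleted"])
captured = [("ItemAdded", "Contacts/1"), ("ItemDeleted", "Contacts/2"),
            ("RoleAssignmentAdded", "Site/A")]
print(filter_events(captured, enabled))
```

Events dropped at this point never need to be packaged, transferred or applied on the target, so the saving compounds across all three stages.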
In Replicator version 3, a separate Replication Package was created for each outbound Replication Connection. The Shared Replication Package feature in Replicator version 4 eliminates the time required to create a separate Package for each outbound Replication Connection – a single shared Package containing the batch of Replication Events is created. Depending on the number of Events in the package, the size and type of the SharePoint change, and the number of outbound Replication Connections, this can significantly reduce Outbound Event Processing time and resources.
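The saving can be made concrete by counting how often the expensive packaging step runs under each approach. In this sketch `build_package` is a trivial stand-in for serializing a batch of Events, and the connection names are hypothetical; it models only the build count, not Replicator's internals.

```python
from typing import Dict, List, Tuple


def build_package(events: List[str]) -> bytes:
    """Stand-in for the expensive step of serializing a batch of Events."""
    return "\n".join(events).encode("utf-8")


def package_per_connection(events: List[str],
                           connections: List[str]) -> Tuple[Dict[str, bytes], int]:
    """Version 3 style: one build per outbound connection."""
    packages = {conn: build_package(events) for conn in connections}
    return packages, len(connections)


def shared_package(events: List[str],
                   connections: List[str]) -> Tuple[Dict[str, bytes], int]:
    """Version 4 style: build once and reference the same package everywhere."""
    package = build_package(events)
    return {conn: package for conn in connections}, 1


events = ["ItemAdded:1", "ItemUpdated:2"]
connections = ["london", "tokyo", "sydney"]   # hypothetical connection names
_, old_builds = package_per_connection(events, connections)
_, new_builds = shared_package(events, connections)
print(old_builds, "builds versus", new_builds)
```

With N outbound connections the packaging work drops from N builds to one, so the benefit grows with the number of connections and with the size of each change.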
Replicator version 4.1 also includes new Web Site Replication Events for replicating SharePoint 2010’s social networking features: ratings, comments and keyword tags.
Figure 5. Metalogix Replicator Replication Events
PACKAGE DATABASE CACHING
In Replicator version 4, upon acceptance of the transfer of an Inbound Replication Package, Replicator immediately caches the metadata for each Replication Event in the Replicator Configuration database. This improves the performance of the Replicator Engine by eliminating the need to repeatedly access the serialized Package data; especially in SharePoint farms that host multiple Replicator Engines.
CONFIGURABLE REPLICATION MONITOR UPDATE LEVEL
To help minimize the Queued Item update network traffic between the Source and Target Web Applications, Replicator version 4.1 supports a new Replication Connection property called “quiet mode”. It can be configured through the Configure Replication Connection administration page where it is referred to as the Configurable Replication Monitor Update Level. This new setting has 2 values: Normal and Minimal (quiet mode).
IMPROVED INITIAL REPLICATION SUPPORT
Backup Mode significantly improves the performance and reliability of a large initial replication: instead of creating and transferring 10,000s of queued items and packages in a very large SharePoint environment, only a few (large) Packages need to be packaged and transferred.
MICROSOFT TECHNOLOGY CENTER PERFORMANCE AND SCALABILITY BENCHMARK
Working closely with the MTC technical staff, the Metalogix Massive Scalability Team deployed and configured the world’s largest SharePoint 2010 Distributed SharePoint environment, measured in terms of the number of independent SharePoint farms. Eleven large scale physical servers from Dell and HP running Microsoft Windows Server 2008 R2 were used to deploy and configure 85 Windows Server 2008 R2 virtual machines. Microsoft Windows Hyper-V was used for the operating system virtualization solution. The entire configuration was deployed and managed using Microsoft Windows Virtual Machine Manager.
The configuration of the Metalogix Network Test Lab is depicted in Figure 8. Metalogix Network Test Lab: Performance Benchmarking. In contrast with the virtualized Microsoft Technology Center environment used for the massive scalability and performance benchmarking, the Metalogix Network Test Lab runs entirely on physical hardware.
UNDERSTANDING REPLICATOR PERFORMANCE AND SCALABILITY
METALOGIX REPLICATOR PIPELINE
The Replicator Pipeline is a detailed representation of how Replicator divides the overall replication process into 3 activities:
Figure 9. Metalogix Replicator Pipeline
Outbound Event Processing is responsible for capturing and recording Replication Events that occur in the Source Web Application. For example, when a user creates a new contact in a contact list in a SharePoint Web site or edits and saves a Microsoft Word document that is stored in a SharePoint document library, each of these is captured and recorded as a Replication Event. Replication Events are captured and stored in a set of SQL Server tables in a Replicator database. Outbound Event Processing is controlled by Replication Maps which determine what Events need to be captured, packaged and transferred to the Target Web Application. Groups of Replication Events are packaged into two types of messages or objects: Queued Items and Replication Packages.
A Queued Item is a unit of work to be transferred to a Target Web Application for remote execution. The Replicator Web Service on a Target Web Application is called to push a Queued Item from the Source Web Application to a Target Web Application.
A Replication Package is a collection of one or more Replication Events plus data about the SharePoint information that is packaged in a format specific to the Replication Transport being used. When an Event is being processed, Outbound Event Processing calls the SharePoint object model to extract the changed SharePoint information from the Source Web Application content database.
The Package Transfer activity is responsible for the transfer of Queued Items and Replication Packages (Packages) from the Source Web Application to the Target Web Application. Package Transfer is the process that sits between Outbound Event Processing (on the Source Web Application) and Inbound Event Processing (on the Target Web Applications). Package Transfer invokes Replicator Engine software components on both the Source and Target Web Applications.
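The three pipeline activities named in this paper (Outbound Event Processing, Package Transfer and Inbound Event Processing) can be modelled as three small functions passing packages along. Everything here is a simplified, in-memory sketch: the `WebApplication` and `Package` classes and their fields are hypothetical, and real transfer happens over a network transport rather than a local queue.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List


@dataclass
class Package:
    events: List[str]


@dataclass
class WebApplication:
    name: str
    event_log: List[str] = field(default_factory=list)     # captured Replication Events
    inbox: Deque[Package] = field(default_factory=deque)   # received, not yet applied
    content: List[str] = field(default_factory=list)       # applied changes


def outbound_event_processing(source: WebApplication, max_events: int) -> List[Package]:
    """Drain captured events on the source and group them into packages."""
    events, source.event_log = source.event_log, []
    return [Package(events[i:i + max_events]) for i in range(0, len(events), max_events)]


def package_transfer(packages: List[Package], target: WebApplication) -> None:
    """Move packages from source to target (a local queue stands in for the WAN)."""
    target.inbox.extend(packages)


def inbound_event_processing(target: WebApplication) -> int:
    """Apply every received package to the target; return the number of events applied."""
    applied = 0
    while target.inbox:
        package = target.inbox.popleft()
        target.content.extend(package.events)
        applied += len(package.events)
    return applied


source = WebApplication("intranet", event_log=[f"change-{i}" for i in range(5)])
target = WebApplication("extranet")
package_transfer(outbound_event_processing(source, max_events=2), target)
print(inbound_event_processing(target), target.content)
```

Keeping the stages separate is what allows each one to be scheduled, tuned and monitored independently, as the later sections on performance describe.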
User workload is defined by two factors: user content change rate and user content change size distribution. Qualitatively, the user content change rate is simply the rate at which users of a particular SharePoint environment make content changes: creating new list items, editing and saving existing items, creating or uploading new documents, editing and saving existing documents and, less frequently,
User content change size distribution refers to the size of the content changes and their distribution. For example, in a document collaboration solution, the average size of a Word document might be a few megabytes and the average size of a PowerPoint might be measured in 10s of megabytes. The amount of data needed to represent a new SharePoint contact or task item would be measured in 10s or 100s of kilobytes. The distribution of sizes varies from small 1 Kbyte documents to media files measured in 10s of megabytes.
In Replicator version 4, each Map Family contains a Replication Schedule. A Replication Schedule controls how frequently Replication Events for a particular Replication Connection are processed and packaged into Queued Items and Replication Packages. Event Processing and Packaging can be scheduled to run:
With the exception of immediate replication, Outbound Event Processing performance is primarily determined by the Replication Schedule, user content size distribution and SharePoint system performance. For immediate replication, Outbound Event Processing performance is also affected by the user content change rate. From a system performance perspective, all Outbound Event Processing occurs locally in the SharePoint farm that is hosting a particular Source Web Application. On the local SharePoint farm, server RAM, disk organization, LAN network configuration and database server configuration are the greatest determinants of SharePoint and Replicator system performance.
When Replicator is configured for immediate replication, Outbound Event Processing (Event capture and processing) is performed in near real time.
OUTBOUND EVENT PROCESSING
Package Transfer refers to the part of the Replicator Pipeline that is responsible for transferring Queued Items (and their corresponding Packages) from a Source Web Application to a Target Web Application. Package Transfer performance is determined by a number of different factors:
Package Transfer performance can be improved using Package compression settings that can be set using Replicator Central Administration on a Replication Connection by Connection basis. The following Package compression settings are supported in Replicator:
For example, satellite links typically have lower bandwidth, high latency and lower reliability compared to terrestrial links; and exceptionally poorer performance compared to a fiber ring in a metropolitan area network (MAN) or LAN. Metalogix’s advanced use of the Microsoft BITS protocol in the design of the Replicator Package Transfer process allows Replicator to function extremely well over high performance connections as well as over slow or unreliable satellite, ship-to-shore and battlefield networks.
Subject to the above constraints, Replicator Inbound Event Processing is able to operate at near real time performance.
INBOUND EVENT PROCESSING
Replicator Central Administration provides detailed replication monitoring, replication status and replication event processing history reports for monitoring the Replicator Pipeline. The following two figures are examples of the statistics available from the Replication Status report for a sample intranet Source Web Application that is being replicated to an extranet Target Web Application.
Metalogix Connect is a desktop application that provides visualization, monitoring and management of the wide area SharePoint environment. It can be used to visualize the status of each farm and its associated connections. Connect is available as an add-on feature for the Basic and the Standard Edition (and is included in the Enterprise Edition).
RDC does not assume that the file data to be synchronized resides in physical files. Therefore, the RDC application is responsible for performing file I/O on behalf of the RDC library.
Because it is transport independent, RDC can be used with RPC, HTTP, or other desired transport mechanisms. The RDC application bears the responsibility for choosing the appropriate transport and performing any client or server authentication that is required to support the transport's security model.
After the file has been divided into chunks, the signature generator computes a strong hash value (an MD4 hash), called a signature, for each chunk. The signatures can be used to compare the contents of two arbitrarily different versions of a file.
The client initiates the RDC protocol by requesting the source signature list from the server. Then the client compares each source signature against the signatures in its own seed signature list. If a source signature matches a seed signature, the client already has the file data for that signature. If a source signature does not appear in the client's list of seed signatures, the client must request the specified chunk (of file data) from the server.
Metalogix is the trusted provider of innovative content lifecycle management solutions for Microsoft SharePoint, Exchange and Cloud platforms. We deliver high-performance solutions to scale and cost-effectively manage, move, store, archive and protect enterprise content. Metalogix provides global support to thousands of customers and strategic partners and is a Microsoft Gold Partner, a managed partner in Microsoft’s High Potential ISV Group and a GSA provider. Metalogix is a privately held company backed by Insight Venture Partners and Bessemer Venture Partners.
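The signature comparison described above can be sketched as follows. This is a deliberately simplified model, not the Windows RDC API: it uses small fixed-size chunks instead of RDC's computed chunk boundaries, it substitutes SHA-256 for MD4 because MD4 is often unavailable in Python's `hashlib`, and the server is simulated by a local list of chunks.

```python
import hashlib
from typing import List, Tuple

CHUNK_SIZE = 4  # tiny fixed size for illustration; an assumption, not RDC's chunking


def chunks(data: bytes) -> List[bytes]:
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


def signatures(data: bytes) -> List[str]:
    """One strong-hash signature per chunk (SHA-256 standing in for MD4)."""
    return [hashlib.sha256(chunk).hexdigest() for chunk in chunks(data)]


def sync(seed: bytes, source: bytes) -> Tuple[bytes, int]:
    """Rebuild `source` on the client from `seed`, fetching only unmatched chunks."""
    seed_by_signature = {hashlib.sha256(c).hexdigest(): c for c in chunks(seed)}
    source_chunks = chunks(source)              # held by the (simulated) server
    rebuilt: List[bytes] = []
    fetched = 0
    for index, signature in enumerate(signatures(source)):   # source signature list
        if signature in seed_by_signature:
            rebuilt.append(seed_by_signature[signature])      # client already has it
        else:
            rebuilt.append(source_chunks[index])              # simulated chunk request
            fetched += 1
    return b"".join(rebuilt), fetched


seed = b"AAAABBBBCCCC"      # old version held by the client
source = b"AAAAXXXXCCCC"    # new version held by the server
rebuilt, fetched = sync(seed, source)
print(rebuilt == source, "chunks fetched:", fetched)
```

Only the changed middle chunk crosses the network in this example, which is why the technique pays off when files are large and edits are small.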