Russ Houberg Senior Technical Architect, MCM KnowledgeLake, Inc.

Dec 24, 2015

Transcript
Page 2: Architecting for Scale in SharePoint 2010

Page 3: Scaling SP2010 from the Ground Up

Storage Architecture
SQL Tuning Tidbits
Remote BLOB Storage (Demo)
Performance and Control
Scalable Taxonomy Design (Demo)
Search… A Complete Story
The Big Picture: 10 million, 100 million, A BILLION Documents…

Page 4: Storage Architecture

Storage Architecture can make or break SharePoint Performance
• Poor storage performance can tank the whole SharePoint farm!

Can Be Tough to Estimate
• Use an extendable storage platform if possible

Wider is Better
• More spindles always better than higher GB
• Avoid using a small number of large disks for increasing storage capacity

Page 5: Storage Architecture

TempDB, Search DBs, Content DBs
• Multiple Data Files in Primary File Group
• # Files = ½ to ¼ of CPU Cores | <= CPU Cores
• Separate to unique spindle sets if possible

• Pre-Allocate all Data Files, Including TempDB
• Estimate Projected DB Size and Divide by # Files to get the pre-allocation size for each file

• Leave “AutoGrow” enabled, but don’t rely on it
• Pre-Allocation to prevent AutoGrow
• Set AutoGrow to 10% or logical MB/GB value based on projected database Size
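The file-count and pre-allocation rules above reduce to simple arithmetic. The following is a minimal sketch, not a sizing tool: the function names are hypothetical, and it assumes the slide's rule means "a fraction between ¼ and ½ of the CPU cores, at least one file, never more files than cores".

```python
import math


def data_file_count(cpu_cores: int, fraction: float = 0.5) -> int:
    """Suggested number of data files for one database.

    Assumption: 'fraction' is between 1/4 and 1/2 of the CPU cores,
    and the result is clamped to the range 1..cpu_cores.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    if not 0.25 <= fraction <= 0.5:
        raise ValueError("fraction should be between 0.25 and 0.5")
    return max(1, min(cpu_cores, math.floor(cpu_cores * fraction)))


def prealloc_per_file_gb(projected_db_gb: float, files: int) -> float:
    """Projected database size divided evenly across the data files."""
    if files < 1:
        raise ValueError("files must be at least 1")
    return projected_db_gb / files
```

For example, `data_file_count(16)` gives 8 files, and a projected 400 GB database would then be pre-allocated at `prealloc_per_file_gb(400, 8)`, which is 50 GB per file.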

Page 6: Storage Architecture

Data / Log File Spindle Priority

Priority  DB File                    RAID       IOPS          Optimization
1         TempDB Data                RAID 10    2 IOPS/GB     Write
2         TempDB Log                 RAID 10    2 IOPS/GB     Write
3         Content DB Log             RAID 10    2 IOPS/GB     Write
4         Crawl DB Log               RAID 10    2 IOPS/GB     Write
5         Crawl DB Data              RAID 10    2 IOPS/GB     Read/Write
6         Property DB Log            RAID 10    2 IOPS/GB     Write
7         Property DB Data           RAID 10    2 IOPS/GB     Read/Write
8         Services DB Log            RAID 10    2 IOPS/GB     Write
9         Services DB Data           [Depends]  [Depends]     [Depends]
10        Content DB Data (Collab)   RAID 10    0.75 IOPS/GB  Read/Write
11        Content DB Data (Archive)  RAID 5     0.75 IOPS/GB  Read
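The IOPS/GB figures in the table translate into a per-volume IOPS target once a file size is known. This is a rough sketch of that arithmetic: the dictionary simply copies the table, the function name is hypothetical, and "Services DB Data" is left out because the table lists it as "[Depends]".

```python
# IOPS-per-GB targets copied from the spindle priority table.
IOPS_PER_GB = {
    "TempDB Data": 2.0,
    "TempDB Log": 2.0,
    "Content DB Log": 2.0,
    "Crawl DB Log": 2.0,
    "Crawl DB Data": 2.0,
    "Property DB Log": 2.0,
    "Property DB Data": 2.0,
    "Services DB Log": 2.0,
    "Content DB Data (Collab)": 0.75,
    "Content DB Data (Archive)": 0.75,
}


def required_iops(db_file: str, size_gb: float) -> float:
    """IOPS a volume should sustain for a file of the given size."""
    return IOPS_PER_GB[db_file] * size_gb
```

Under these figures a 500 GB collaboration content database calls for about 375 IOPS (`required_iops("Content DB Data (Collab)", 500)`), while a 50 GB TempDB data volume calls for about 100.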

Page 7: SQL Tuning Tidbits

SQL Instant Initialization
• Run SQL As Domain User with either…
• Local Admin
• Grant “Perform Volume Maintenance Tasks”

TempDB Pre-Allocation to 10% Largest DB
SAN vs DAS vs NAS (Don’t Overshare!)
Host Bus Adapter (HBA) Configuration
NTFS Allocation Unit Size: 64K
Enable Locked Pages in Memory (SQL Std.)
Don’t skimp on RAM!

Page 8: RBS Background

Remote BLOB Storage (RBS)
• By default SharePoint stores Binary Large Objects (BLOBs) in the content database
• When enabled… Intercepts binary content (documents) and sends them to a BLOB store
• Microsoft provides the “local” FILESTREAM provider to allow for usage of the SQL Server local NTFS file system as a BLOB store.

Page 9: Remote BLOB Storage

SharePoint 2003
What’s this ECM thing?
- Interesting workarounds
  • API access was problematic

SharePoint 2007
SP1 Brings us EBS Provider
- BLOBs are orphaned during edit/save
- Orphan cleanup is resource intensive
- Externalization happens on the WFE (reduced RPS)
- Future support of EBS API is not guaranteed

SharePoint 2010
Long Live RBS
- Transactional consistency supports “VETO”
- Transactional consistency allows for UPDATE
- Orphan cleanup uses SQL Indexes
- Transparent to the SharePoint API
- RBS is the best option for future support

Page 10: Remote BLOB Storage

[Diagram: RBS save flow. Labels: SharePoint WFE, SharePoint Object Model, RBS Client Library, Relational Access, BLOB Store, Provider Library, BlobStore, SQL Server, ContentDB, ConfigDB]

1. Save Request
2. Enforce Business Logic
3. Save Blob
4. Write Blob
5. Return BLOB ID
6. Save Metadata & BLOB ID
7. Back to User
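The seven steps above can be sketched as a small in-memory simulation. This is an illustration of the flow only, not the real RBS client library or provider API: the `BlobStore` and `ContentDB` classes, the `save_document` function, and the business rule in step 2 are all hypothetical stand-ins.

```python
class BlobStore:
    """Stand-in for the external BLOB store behind the provider library."""

    def __init__(self):
        self._blobs = {}
        self._next_id = 1

    def write(self, data: bytes) -> int:
        # Step 4: write the blob. Step 5: return its BLOB ID.
        blob_id = self._next_id
        self._next_id += 1
        self._blobs[blob_id] = data
        return blob_id

    def read(self, blob_id: int) -> bytes:
        return self._blobs[blob_id]


class ContentDB:
    """Stand-in for the content database; holds metadata and BLOB IDs only."""

    def __init__(self):
        self.rows = {}

    def save_row(self, name: str, metadata: dict, blob_id: int) -> None:
        self.rows[name] = {"metadata": metadata, "blob_id": blob_id}


def save_document(name, data, metadata, blob_store, content_db):
    # Step 1: the save request arrives at the WFE.
    # Step 2: enforce business logic (a hypothetical rule for illustration).
    if not name or not data:
        raise ValueError("a document needs a name and content")
    # Steps 3-5: hand the blob to the store and get a BLOB ID back.
    blob_id = blob_store.write(data)
    # Step 6: save the metadata and the BLOB ID, not the binary content.
    content_db.save_row(name, metadata, blob_id)
    # Step 7: back to the user.
    return blob_id
```

The point of the sketch is step 6: the content database row carries only metadata and a BLOB ID, which is why externalizing BLOBs keeps the content database small.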

Page 11: RBS Requirements

SQL Server 2008 R2
• Any Version, even SQL Express R2

FILESTREAM RBS Provider (Current Version)
• http://go.microsoft.com/fwlink/?LinkId=177388

Page 12: RBS Licensing and Limitations

The FILESTREAM provider is supported by SharePoint Server 2010 only when it is used with SQL Server 2008 R2 or SQL Server 2008 R2 Express.
• Only “local commodity storage” (hard drive) is supported.
• Direct Attached Storage (DAS), Network Attached Storage (NAS), and Storage Area Network (SAN) are all considered to be “remote commodity storage” and are not supported by SharePoint 2010.

Any other 3rd Party RBS Provider is considered to be a “remote server” provider and SharePoint 2010 licensing requires that SQL Server 2008 R2 Enterprise Edition be implemented.

Page 13: Remote BLOB Storage

demo…

Page 14: Performance and Control

SharePoint 2003
- Column Indexes were not possible
- Database Indexes were not supported

SharePoint 2007
- Column Indexes (10) could be configured via the UI
- End users could impact performance with poor performing list views

SharePoint 2010
- Database optimizations allow far more items in a list
- Support for (20) Multi-Column Indexes
- Resource intensive operations can be limited or disallowed during production hours
  • Large query thresholds
  • Blocking Operations
  • Can be overridden via the Object Model
  • Can configure an unblocked “window”
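The SharePoint 2010 throttling behaviour listed above (a large query threshold, an override, and an unblocked window) can be illustrated with a small decision function. This is a sketch of the idea, not SharePoint's implementation: the function names are hypothetical, the 5,000-item default is the view/query figure quoted elsewhere in this deck, and the 22:00 to 06:00 window is an invented example.

```python
from datetime import time


def in_window(now: time, start: time, end: time) -> bool:
    """True if 'now' falls in [start, end), allowing a window that wraps midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def query_allowed(item_count: int, now: time, threshold: int = 5000,
                  window=(time(22, 0), time(6, 0)), override: bool = False) -> bool:
    """Decide whether a list query touching 'item_count' items may run."""
    if item_count <= threshold:
        return True   # small queries are never throttled
    if override:
        return True   # e.g. code that overrides the limit via the object model
    return in_window(now, *window)  # large queries only in the unblocked window
```

With these assumed settings, a 20,000-item query is blocked at noon but allowed at 23:30, which is the "limit during production hours" behaviour the slide describes.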

Page 15: Scalable Taxonomy Design

SP2010 Boundaries – Now More Stuff!!!
• 30 Million Documents/Items in a List
• 5000 Item View/Query Result Size (Default for a reason)
• 100 Million Items in SharePoint Server 2010 Search
• 1 BILLION Items in FAST For SharePoint 2010 Index
• 250,000 Site Collections per Web Application
• 200GB Content DB Size (SOFT LIMIT)
• Recommend for Collaboration content or Fast Backup/Restore SLA
• Content DB sizes up to 1TB are SUPPORTED for large single-site repositories and archives of non-collaborative content!
• That’s 150 Million items in a single Site Collection in a single Content Database with RBS enabled (avg. 7KB metadata row)
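The 150 million figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes binary units (1 TB = 1024³ KB) and that, with RBS enabled, only the roughly 7 KB metadata row per item counts against the content database; the function names are hypothetical.

```python
KB_PER_TB = 1024 ** 3  # binary units: KB -> MB -> GB -> TB


def metadata_size_tb(items: int, avg_row_kb: float = 7.0) -> float:
    """Content DB size in TB if each item costs only its metadata row."""
    return items * avg_row_kb / KB_PER_TB


def max_items(db_limit_tb: float = 1.0, avg_row_kb: float = 7.0) -> int:
    """How many metadata rows of the given size fit under the DB limit."""
    return int(db_limit_tb * KB_PER_TB // avg_row_kb)
```

Under these assumptions `metadata_size_tb(150_000_000)` comes to roughly 0.98 TB, just under the 1 TB supported size, and `max_items()` lands slightly above 150 million.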

Page 16: Scalable Taxonomy Design

Enabling 100 Million
• Place large Collaboration Site Collections (20GB+) in their own content database
• Break Up Archive/Records Site Collections by Year or, if necessary, Content Type and Year
• AVOID Item Level ACLs!!!
• Release to Metadata Based Folder Structures as a workaround
• Use Content Type Syndication to facilitate multiple Site Collections of the same type
• Use Content Organizer as a “Drop Zone”
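Breaking archive site collections up by year, or by content type and year, amounts to a routing rule on document metadata. The following is a minimal sketch of such a rule; the function name and the `/sites/archive-…` URL scheme are hypothetical, and in SharePoint 2010 this kind of routing would typically be configured through Content Organizer rules rather than written by hand.

```python
from datetime import date
from typing import Optional


def archive_site_collection(doc_date: date,
                            content_type: Optional[str] = None,
                            base: str = "/sites/archive") -> str:
    """Pick a target site collection by year, or by content type and year."""
    if content_type:
        # Normalize the content type name into a URL-friendly slug.
        slug = content_type.strip().lower().replace(" ", "-")
        return f"{base}-{slug}-{doc_date.year}"
    return f"{base}-{doc_date.year}"
```

Because each year (or content type and year) gets its own site collection, each can live in its own content database, which keeps any single database within the size boundaries.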

Page 17: Content Organization

demo…

Page 18: Search… A Complete Story

SharePoint 2003
- WSS CAML Only
- SPS Shared Services yielded decent full text results

SharePoint 2007
- WSS 3.0 SiteDataQuery allowed search across lists/sites
- MOSS Search added Managed Properties
- FAST ESP for SharePoint was a late player

SharePoint 2010
- Microsoft SharePoint Foundation Search
  - Site Collection Scope | No Redundancy | 10 Million
- Microsoft Search Server Express 2010
  - Extended Features | No Redundancy | 10 Million
- Microsoft SharePoint 2010 Search / Search Server
  - Extended Features | Scale Out | Redundancy | 100 Million
- Microsoft FAST Search Server 2010 for SharePoint
  - Extreme Scale | Redundancy | Doc Processing Pipeline
  - 1 Billion documents! (per farm)
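The SharePoint 2010 search lineup above can be read as a lookup by corpus size and redundancy needs. The sketch below encodes only the item limits and redundancy flags stated on the slide; the table and function names are hypothetical, and a real product choice would weigh licensing and features as well.

```python
# (product, maximum items, supports redundancy), as stated on the slide.
SEARCH_TIERS = [
    ("Microsoft SharePoint Foundation Search", 10_000_000, False),
    ("Microsoft Search Server Express 2010", 10_000_000, False),
    ("Microsoft SharePoint 2010 Search / Search Server", 100_000_000, True),
    ("Microsoft FAST Search Server 2010 for SharePoint", 1_000_000_000, True),
]


def search_options(item_count: int, need_redundancy: bool = False) -> list:
    """Products whose stated item limit covers the corpus."""
    return [
        name
        for name, max_items, redundant in SEARCH_TIERS
        if item_count <= max_items and (redundant or not need_redundancy)
    ]
```

A 50 million item corpus, for instance, leaves SharePoint 2010 Search and FAST as candidates, while anything past 100 million items leaves only FAST.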

Page 19: Search… A Complete Story

SharePoint Server 2010 / Search Server
• Multiple Crawl Servers (Scale Out/Redundancy)
• Crawl Servers comprised of stateless Crawlers
• Multiple Crawlers improve crawl performance
• Multiple Crawl DBs support more Crawlers
• Crawl DB is separated from Property DB
• Index is comprised of multiple Index Partitions that can be mirrored on different Query Servers
• Multiple Index Partitions improve Query Performance

Page 20: Search… A Complete Story

Cool… What can it do?

Page 21: Search… A Complete Story

FAST Search Server 2010 for SharePoint
• Extreme Scale and Performance
• Custom Relevancy and Navigation Tuning
• Tune Performance for content volume, query volume, crawl pipeline performance and query speed
• Uses SharePoint 2010 Query Servers
• Bolt on FAST Servers for additional processing
• Add server ROWS for query performance and high availability or COLUMNS for crawl performance
• Can scale to support 1 Billion items!

Page 22: 10 million, 100 million, 1 Billion

Page 23: In Review…

Storage is the KEY to Performance
RBS reduces Content DB Size and facilitates large repositories
SharePoint governs end-user operations
Content Type Publishing and Content Organization help balance database loading
Search solutions now handle the entire range of corpus possibilities
10 million is easy, 100 million can be done, 1 BILLION is possible!

Page 24: More…

http://www.houberg.net
@rhouberg
http://www.knowledgelake.com/resources
