Active Directory Replication Issues and Troubleshooting

Post on 23-Feb-2016


ACTIVE DIRECTORY REPLICATION ISSUES AND TROUBLESHOOTING

Ing. Ondřej Ševeček | GOPAS a.s. | MCM: Directory Services | MVP: Enterprise Security |ondrej@sevecek.com | www.sevecek.com |

GOPAS TechEd 2012

NETWORK SERVICES

Central Database

LDAP – Lightweight Directory Access Protocol
  database query language, similar to SQL
  TCP/UDP 389, SSL TCP 636
Global Catalog (GC) – TCP/UDP 3268, SSL TCP 3269
D/COM Dynamic TCP – Replication
D/COM Dynamic TCP – NSPI
Kerberos UDP/TCP 88
Windows NT 4.0 SAM
  SMB/CIFS TCP 445 (or NetBIOS)
  password resets, SAM queries
SMB/DCOM Dynamic TCP
  NTLM pass-through
  Kerberos PAC validation

Design Considerations

Distributed system
DCs disconnected for very long times
  several months
Multimaster replication
  with some FSMO roles

Design Considerations

Example: Caribbean cruises. DC/IS/Exchange on board with tens of workstations and users; some staff are hired during the journey. No satellite connectivity, or only a poor one. DCs are synced after the ship is berthed at the main office.

Challenge: each ship must work independently for long periods. Different independent cruise liners/DCs can make changes to user accounts, email addresses and Exchange settings. None of these changes can be lost.

Database

Microsoft JET engine
  JET Blue
  common with Microsoft Exchange
  used by DHCP, WINS, COM+, WMI, CA, CS, RDS Broker
%WINDIR%\NTDS\NTDS.DIT
  ESENTUTL
  Opened by LSASS.EXE

Installed services

[Diagram: LSASS hosts three services on top of NTDS.DIT]
Security Accounts Manager: TCP 445, SMB + Named Pipes
Kerberos Key Distribution Center: UDP, TCP 88, Kerberos
Active Directory Domain Services: UDP, TCP 389, LDAP; D/COM Dynamic TCP

Installed services

[Diagram: LSASS hosts SAM, KDC and NTDS]
SAM: TCP 445, SMB + Named Pipes
KDC: UDP, TCP 88, Kerberos
NTDS: UDP, TCP 389, ... LDAP; D/COM Dynamic TCP
Clients shown: NT 4.0, NTLM pass-through, PAC validation, Windows 2000+, LDAP/ADSI client, NTDS replication, FIM/DRS API client, Connect to domain

Uninstallation

DCPROMO
  requires working replication connectivity with other DCs
DCPROMO /forceremoval
  does not access network at all
  can run in DS Restore Mode

NTDSUTIL Metadata Cleanup

Connections
  Connect to server srv2.idtt.local
  Quit
Select operation target
  List sites
  Select site 0
  List domains in site
  Select domain 0
  List servers in site
  Select server 0
  Quit
Remove selected server

Metadata Cleanup

TOPOLOGY

Knowledge Consistency Checker (KCC)

runs 5 minutes after boot
  Repl topology update delay (secs)
runs every 15 minutes periodically
  Repl topology update period (secs)

Intrasite Replication Topology

[Diagram: replication topology among DC1, DC2, DC3 and DC4]

Originating Updates and Notifications

[Diagram: DC1, DC2, DC3 and DC4; notification delays of 15 sec, 3 sec and 3 sec]

Notification and Replication

[Diagram: DC1 and DC2]
Notification: "I have got some changes" (Kerberos-authenticated DCOM, random TCP port)
Request: "Give me your replica" (Kerberos-authenticated DCOM, random TCP port)

Intrasite Replication – 3 Hops max.

[Diagram: DC1 through DC7 in one site, no more than 3 hops between any two DCs]

Intersite Replication (no Bridgeheads)

[Diagram: DC1 through DC7 spread across sites]

Intersite Replication (no Bridgeheads)

[Diagram: DC1 through DC7 spread across sites; notification delays of 15 sec and 3 sec inside a site, replication on schedule between sites]

Intersite Replication with a Bridgehead

[Diagram: DC1 through DC7 spread across sites, DC4 drawn separately; notification delays of 15 sec and 3 sec inside a site, replication on schedule between sites]

Intrasite Replication

Uses notifications by default (originating/received)
  300/30 sec on Windows 2000
  15/3 sec on Windows 2003
Occurs every hour as scheduled
  nTDSSiteSettings
  At this frequency KCC detects unavailable partners
HKLM\System\CurrentControlSet\Services\NTDS\Parameters
  Replicator notify pause after modify (secs)
  Replicator notify pause between DSAs (secs)
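A rough way to reason about these two delays is to add them up along a replication path. The sketch below is only an estimate under stated assumptions: a DC waits for the "pause after modify" before notifying its first partner, waits the "pause between DSAs" before each further partner, and the transfer itself takes no time. The function name and its parameters are made up for illustration.

```python
def worst_case_notification_latency(hops, partners_per_dc,
                                    pause_after_modify=15,
                                    pause_between_dsas=3):
    """Upper bound, in seconds, for a change to cross `hops` hops.

    Assumption: on every hop the change is unlucky and travels to the
    partner that the sending DC notifies last.
    """
    per_hop = pause_after_modify + (partners_per_dc - 1) * pause_between_dsas
    return hops * per_hop


# Three hops, two partners per DC, Windows 2003 defaults.
print(worst_case_notification_latency(3, 2))
# The same path with the Windows 2000 defaults (300/30 sec).
print(worst_case_notification_latency(3, 2, 300, 30))
```

With the 15/3 sec defaults a change crosses a three-hop site in under a minute by this estimate, while the 300/30 sec defaults stretch the same path to many minutes.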

Intrasite Replication

[Diagram: DC1 and DC2; a notification over a random TCP port after 15 sec, followed by a download of changes over a random TCP port; a further download of changes over a random TCP port on schedule]

Intersite Replication

[Diagram: DC1 and DC2; download of changes over a random TCP port, on schedule only]

Intersite Replication

Does not use notifications by default
  siteLink: options = USE_NOTIFY (1)
Compression used
  siteLink: options = DISABLE_COMPRESSION (4)
Bridge all site links
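The options attribute of a siteLink is a bit mask, so the values combine by addition. A minimal sketch of decoding it follows; the class and function names are illustrative, and TWOWAY_SYNC (2) does not appear on the slide but is the documented meaning of the bit between the two that do.

```python
from enum import IntFlag


class SiteLinkOptions(IntFlag):
    """Bits of the siteLink `options` attribute."""
    USE_NOTIFY = 1            # change notification across the site link
    TWOWAY_SYNC = 2           # not on the slide; documented value of bit 2
    DISABLE_COMPRESSION = 4   # turn off intersite compression


def describe(options):
    """Return the names of the flags set in an options value."""
    return [flag.name for flag in SiteLinkOptions if flag & options]


# 5 = notification enabled and compression disabled on the same link.
print(describe(5))
```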

Site Link Design

Site Link Design (Better?)

[Diagram: site links among London, Olomouc, Roma, Cyprus, Paris and Berlin]

Site Link Design (Worse?)

[Diagram: site links among Olomouc, Roma, Cyprus, Paris, Berlin and London]

Static TCP for Replication

HKLM\System\CurrentControlSet\Services
  NTDS\Parameters
    TCP/IP Port = DWORD
    Replication + NSPI
  Netlogon\Parameters
    DCTcpipPort = DWORD
    LSASS (Pass-through)
  NTFRS\Parameters
    RPC TCP/IP Port Assignment = DWORD
DFSRDIAG StaticRPC /port:xxx /Member:dc1

Urgent Replication (Notification)

Intrasite only
  intersite also if notification enabled
Do not wait for delay (15/3 sec)
In the case of
  account lockout
  password and lockout policy
  RID FSMO owner change
  DC password or trust account password change

Immediate Replication (Notification)

Password changes from DCs to PDC
Regardless of site boundaries
PDC downloads only the single user object
  all changed attributes but only single object
From DC/PDC further with normal replication

Example Replication Traffic

Atomic replication of a single object with a one byte attribute change
Notification + replication
  intersite compressed
Overall 7536 B
30 packets
~10 round trips
  50 ms round trip means 500 ms transfer time
  consumption at 120 kbps
Useful data ~80 B
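The figures on this slide can be checked with a few lines of arithmetic. The sketch below only recomputes the numbers quoted above; the variable and function names are illustrative.

```python
TOTAL_BYTES = 7536     # everything on the wire for one replicated change
ROUND_TRIPS = 10       # approximate number of round trips
RTT_SECONDS = 0.050    # 50 ms round trip
USEFUL_BYTES = 80      # approximate payload that actually changed


def transfer_time(round_trips, rtt_seconds):
    """Time spent waiting on round trips, ignoring serialization delay."""
    return round_trips * rtt_seconds


def consumption_bps(total_bytes, seconds):
    """Average bandwidth used while the exchange is in progress."""
    return total_bytes * 8 / seconds


seconds = transfer_time(ROUND_TRIPS, RTT_SECONDS)
print(seconds)                                 # transfer time in seconds
print(consumption_bps(TOTAL_BYTES, seconds))   # bits per second
print(USEFUL_BYTES / TOTAL_BYTES)              # share of useful data
```

The useful payload comes out at roughly one percent of the traffic, which is why the number of round trips matters more than the size of the change on a slow link.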

Bridge All Site Links On

[Diagram: sites Olomouc, London, Prague, Paris, Roma and Cyprus hosting partitions A and B]
site links are transitive
can be disabled on IP transport

Bridge All Site Links Off

[Diagram: sites Olomouc, London, Prague, Paris, Roma and Cyprus hosting partitions A and B]
site links are not transitive
Cyprus partition is cut off

GC Replication

[Diagram: sites Olomouc, London, Prague, Paris, Roma and Cyprus hosting partitions A and B, with three GCs]
one-way: from the source NC into the nearest GC
two-way: GCs between themselves

GC Replication

[Diagram: sites Roma, London, Olomouc, Prague, Paris and Cyprus hosting partitions A and B, with one GC]
one-way: from the source NC into the nearest GC
two-way: GCs between themselves

Subnetting in AD (Apps)

[Diagram: subnets 10.10.x.x / 16 and 10.10.0.248 / 29; DC1 through DC5; three Exchange servers]

Subnetting in AD (Recovery)

[Diagram: subnet 10.10.x.x / 16; Recovery Site with subnet 10.10.0.7 / 32; DC1 through DC5]

Rebuilding After Failure

Rebuilding After Failure

Inter-site
  IntersiteFailuresAllowed
  MaxFailureTimeForIntersiteLink (secs)
Intra-site (immediate neighbors)
  CriticalLinkFailuresAllowed
  MaxFailureTimeForCriticalLink
Intra-site (optimization for non-critical)
  NonCriticalLinkFailuresAllowed
  MaxFailureTimeForNonCriticalLink

MODIFICATIONS

Modification operations

Create new object
Modify attributes
  change/delete value
  change distinguishedName = rename
Rename container
  all subobjects renamed as well

Replication Metadata

REPADMIN /ShowObjMeta
  for all attributes: when changed, originating DC

Replication conflicts

The later action wins
  if neither is later, the winner is arbitrary (USN)
Attribute modified on two DCs “simultaneously”
  only one change wins
Linked multivalue attribute modified
  merged (on 2003+ forest level)
Object/container deleted and object modified
  deleted
Object moved into a deleted container
  CN=LostAndFound
Two objects with the same sAMAccountName, cn or userPrincipalName created
  object renamed, logins duplicated
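The attribute-level rule can be sketched as an ordered comparison. The slide reduces it to "the later action wins"; Active Directory compares the attribute's version number first, then the originating timestamp, then the originating DC's GUID as the final tie-breaker. The class and function below are illustrative only, and comparing GUIDs as plain strings is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeStamp:
    """Replication metadata kept for one attribute value."""
    version: int           # incremented on every originating write
    timestamp: int         # originating write time, in seconds
    originating_dsa: str   # GUID of the DC that made the write


def winner(a, b):
    """Return the stamp that survives a conflict between two writes."""
    def key(stamp):
        return (stamp.version, stamp.timestamp, stamp.originating_dsa)
    return a if key(a) >= key(b) else b


# Same version on both DCs: the later write wins.
early = AttributeStamp(2, 1000, "aaaa")
late = AttributeStamp(2, 1005, "bbbb")
print(winner(early, late).originating_dsa)
```

Because the comparison is deterministic, every DC picks the same winner without any coordination.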

Linked Multi-values

Replication Basics

[Animated diagram, 11:05 to 14:15: DC1 originates Kamil (10:00), Helen (11:00) and Judith (12:00). DC2 replicates them and moves the time up to which it is in sync with DC1 from 9:00 to 11:30 and then to 12:30. DC3 originates Marie (11:00), marks each object with its originating DC, and keeps a separate sync time for each source: DC1 (10:30, later 12:30) and DC2 (7:00, later 13:30).]

USN

Each object modification increments USN for that object and for the whole DC

Each DC remembers USNs of its replication partners

repadmin /showutdvec
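The bookkeeping can be played through in a toy model. The sketch below is a deliberate simplification and not how the directory service is implemented: each DC keeps one USN counter and, per partner, the highest partner USN it has already pulled. The starting USNs (1001, 5001, 3001) and the object names are taken from the diagrams in this deck; the class and its methods are made up for illustration.

```python
class DC:
    """Toy domain controller: one USN counter plus a watermark per partner."""

    def __init__(self, name, usn, seen=None):
        self.name = name
        self.usn = usn                  # highest USN assigned locally
        self.objects = {}               # object name -> (local USN, originating DC)
        self.seen = dict(seen or {})    # partner name -> highest partner USN pulled

    def create(self, obj):
        """Originating write: takes the next local USN."""
        self.usn += 1
        self.objects[obj] = (self.usn, self.name)

    def pull_from(self, source):
        """Request everything above the stored watermark for this source."""
        watermark = self.seen.get(source.name, 0)
        changes = sorted(
            (usn, obj, origin)
            for obj, (usn, origin) in source.objects.items()
            if usn > watermark
        )
        for _, obj, origin in changes:
            if obj not in self.objects:     # already known objects are skipped
                self.usn += 1               # a replicated write also takes a local USN
                self.objects[obj] = (self.usn, origin)
        self.seen[source.name] = source.usn


dc1 = DC("DC1", 1001, {"DC2": 5001, "DC3": 3001})
dc2 = DC("DC2", 5001, {"DC1": 1001, "DC3": 3001})
dc3 = DC("DC3", 3001, {"DC1": 1001, "DC2": 5001})

dc1.create("Kamil")
dc1.create("John")
dc2.pull_from(dc1)      # DC2 asks for everything above 1001
dc2.create("Maria")
dc3.pull_from(dc2)      # DC3 gets all three objects through DC2
dc1.pull_from(dc2)      # DC1 only needs Maria

print(dc2.objects)
print(dc3.objects)
print(dc1.objects)
```

The same object ends up with a different USN on every DC, which is why a USN is only meaningful together with the DC that assigned it.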

[Animated diagram: DC1 (USN 1001), DC2 (USN 5001) and DC3 (USN 3001) each store the highest USN seen from the other two. DC1 creates Kamil (1002) and John (1003) and notifies DC2. DC2 requests the changes above 1001, stores them under its own USNs 5002 and 5003, and records 1003 for DC1. DC2 then creates Maria (5004). DC3 replicates all three objects as 3002 to 3004 and records 1003 for DC1 and 5004 for DC2. Every object is also marked with its originating DC (1, 1 and 2), and DC1 finally receives Maria.]

REPLICATION PROBLEMS

The Three Problems

Single DC offline for a long time
  shorter than the tombstone lifetime!
  authentication problem
Tombstone lifetime
  two separate DC zones
  not a “business” consistency problem
USN rollback
  restore from snapshot, image, manual backup
  total inconsistency!

DC Offline for Long Time

[Animated diagram, month 0 to month 3: DC1 (MyPWD 11) stays offline and keeps the passwords it knows for its partners, DC2 PWD 21 and DC3 PWD 31. DC2 and DC3 change their passwords every month and keep only the current and the previous one: DC2 goes from PWD 21 to PWD 22 (OLD PWD 21) to PWD 23 (OLD PWD 22), DC3 from PWD 31 to PWD 32 (OLD PWD 31) to PWD 33 (OLD PWD 32). In month 3 a Kerberos TGS ticket from the KDC on DC1 is still based on PWD 21, while DC2 holds PWD 23 and OLD PWD 22. With the KDC on DC1 disabled, the TGS ticket comes from another KDC.]

DC Isolated for Long Time

[Animated diagram, month 3: the isolated DC1 has moved on to MyPWD 13, while DC2 and DC3 still hold DC1 PWD 11. The KDC on DC1 is disabled and the TGT is tied to PWD 13. After NETDOM RESETPWD, DC1 has MyPWD 14, DC2 and DC3 hold DC1 PWD 14, and the TGT is tied to PWD 14.]

Lingering Objects

When a DC has not replicated within the tombstoneLifetime, replication with it halts
Can be restored by Allow Replication with Divergent and Corrupt Partner
  HKLM\System\CurrentControlSet\Services\NTDS\Parameters
  turn on, replicate, turn off
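Whether a DC has crossed this line is a plain date comparison. A minimal sketch follows, assuming the last successful replication time and the forest's tombstoneLifetime (in days) are already known; the function name and the 180-day figure in the example are illustrative, not values read from any real forest.

```python
from datetime import datetime, timedelta


def exceeds_tombstone_lifetime(last_replication, now, tombstone_lifetime_days):
    """True when a partner has been silent longer than the tombstone lifetime."""
    return now - last_replication > timedelta(days=tombstone_lifetime_days)


# Example: a ship that last synced on 1 January and berths on 1 August.
last_sync = datetime(2012, 1, 1)
berthed = datetime(2012, 8, 1)
print(exceeds_tombstone_lifetime(last_sync, berthed, 180))
```

A DC that fails this check may still hold objects whose tombstones have already been garbage collected everywhere else.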

Objects and Tombstones

[Animated diagram: DC1 through DC4 each hold Frank, Stan and Tania; the state of Stan changes from slide to slide as the deletion spreads.]

Garbage Collection 1/day

[Animated diagram: garbage collection removes Stan, first on two of the DCs and then on all four, leaving Frank and Tania.]

Lingering Objects

[Animated diagram: DC1 through DC4 start with Frank, Stan and Tania. In the end two DCs hold Frank and Tania while the other two still hold Frank and Stan.]

Possible Problems

Inconsistent distributed database
Proliferation of partial objects
  after modification of some attributes
Allow Replication with Divergent and Corrupt Partner
  blocks replication after tombstone lifetime
Strict Replication Consistency
  detects partial objects if replication allowed

Lingering Objects

Lingering Objects

Strict Replication Consistency
  HKLM\System\CurrentControlSet\Services\NTDS\Parameters
  1 – do not replicate
  0 – request full copy from source
By default only on new Windows 2003+ installations

Automatic Repair Philosophy?

Business logic says “deleted already”
  should we investigate?
Metadata cleanup?
  we may need some data from the vessel
Remove lingering objects

Removing Lingering Objects

REPADMIN /RemoveLingeringObjects target sourceGUID DN /advisory_mode
  sourceGUID – healthy DC’s GUID (without {})
  target – suspected DC’s name with lingering objects
  DN – naming context DN
  /advisory_mode just logs the found objects (on the ill DC)
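When the command has to be run for several DCs or naming contexts, it helps to build the argument list in one place. The sketch below only assembles the arguments in the order given above and does not run anything; the function name, the host name, the GUID and the naming context in the example are hypothetical.

```python
def remove_lingering_objects_args(target, source_guid, naming_context,
                                  advisory=True):
    """Build the repadmin argument list in the order: target, GUID, DN."""
    args = [
        "repadmin",
        "/removelingeringobjects",
        target,                      # suspected DC holding lingering objects
        source_guid.strip("{}"),     # healthy DC's GUID, braces removed
        naming_context,              # DN of the naming context
    ]
    if advisory:
        args.append("/advisory_mode")    # only log what would be removed
    return args


# Hypothetical values, for illustration only.
print(remove_lingering_objects_args(
    "dc2.idtt.local",
    "{01234567-89ab-cdef-0123-456789abcdef}",
    "DC=idtt,DC=local",
))
```

Defaulting to advisory mode keeps a first run harmless: the suspected DC only logs what it would delete.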

Lingering Object found/deleted

Correct Registry Settings

Long term normal operation
  Strict consistency = 1
  Allow divergent partner = 0
Temporary repair operation
  Strict consistency = 1
  Allow divergent partner = 1
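The two profiles can be written down as data and turned into reg add command lines. This is a sketch that only builds the commands and does not execute them; the profile names and the helper are illustrative, and the value names are the full registry names behind the shorthand "Strict consistency" and "Allow divergent partner".

```python
KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Parameters"
STRICT = "Strict Replication Consistency"
ALLOW = "Allow Replication With Divergent and Corrupt Partner"

# Profile names are illustrative; the values come from the slide.
PROFILES = {
    "normal": {STRICT: 1, ALLOW: 0},
    "repair": {STRICT: 1, ALLOW: 1},
}


def reg_add_commands(profile):
    """Return one `reg add` argument list per registry value."""
    return [
        ["reg", "add", KEY, "/v", name, "/t", "REG_DWORD", "/d", str(value), "/f"]
        for name, value in PROFILES[profile].items()
    ]


for command in reg_add_commands("repair"):
    print(command)
```

Switching back to the "normal" profile after the repair is part of the procedure, since the divergent-partner setting is meant to be temporary.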

USN Rollback

May or may not be detected
Cannot be repaired
  not always lingering objects!
DC must be demoted/repromoted
  unplug network
  DCPROMO /forceremoval
  NTDSUTIL Roles
  NTDSUTIL Metadata Cleanup
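Why a rollback is sometimes caught and sometimes not comes down to comparing two numbers: the highest USN a partner has recorded for the restored DC, and the USN the restored DC has reached when it next replicates. The sketch below is a simplified model of that comparison, not the directory service's actual check; the function names are illustrative and the numbers (snapshot at 1001, partner watermark 1006) follow the example used in this deck.

```python
def rollback_detectable(partner_watermark, source_usn):
    """A partner can notice the rollback only while the source is behind it."""
    return source_usn < partner_watermark


def changes_sent(partner_watermark, source_usn):
    """USNs the partner will actually request and receive."""
    return list(range(partner_watermark + 1, source_usn + 1))


def changes_never_sent(snapshot_usn, partner_watermark, source_usn):
    """Reused USNs that the partner believes it already has."""
    return list(range(snapshot_usn + 1, min(partner_watermark, source_usn) + 1))


# Restored DC created two objects (USN 1003) before replicating.
print(rollback_detectable(1006, 1003))
# Restored DC created seven objects (USN 1008) before replicating.
print(rollback_detectable(1006, 1008))
print(changes_sent(1006, 1008))
print(changes_never_sent(1001, 1006, 1008))
```

In the second case the partner silently accepts two new objects and never asks for the five that reuse old USNs, which is the inconsistency that cannot be repaired afterwards.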

USN Rollback

[Animated diagram: a snapshot of DC1 is taken at USN 1001, while DC2 is at USN 5001 and DC3 at USN 3001. DC1 then creates Kamil (1002), John (1003), Judith (1004), Helen (1005) and Eva (1006). Its partner replicates them and records 1006 for DC1. The snapshot is restored and DC1 is back at USN 1001.]

USN Rollback (Detectable)

[Animated diagram: after the restore DC1 creates Frank (1002) and Stan (1003). The partner still holds Kamil through Eva and has recorded 1006 for DC1, which is higher than the USN DC1 now reports.]

USN Rollback (Non-detect.)

[Animated diagram: after the restore DC1 creates Frank (1002), Stan (1003), Tania (1004), Mark (1005), Martin (1006), Victor (1007) and Leo (1008). The partner had recorded 1006 for DC1, so it receives only Victor (1007) and Leo (1008) and records 1008. It keeps Kamil, John, Judith, Helen and Eva, which DC1 no longer has, and never receives Frank through Martin.]

Restoring VM Snapshots

Restore offline
HKLM\System\CurrentControlSet\Services\NTDS\Parameters
  Database Restored from Backup = DWORD = 1
Restart NTDS service
  changes InvocationID of the database instance

THANK YOU!

Ing. Ondřej Ševeček | GOPAS a.s. | MCM: Directory Services | MVP: Enterprise Security |ondrej@sevecek.com | www.sevecek.com |

GOPAS TechEd 2012
