Advanced Active Directory Design and Troubleshooting, Ed Whittington, Principal Software Engineer, HP Business Critical Call Center, Oct. 06, 2002

Jan 22, 2016

Page 1: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Advanced Active Directory Design and Troubleshooting

Ed Whittington

Principal Software Engineer

HP Business Critical Call Center

Oct. 06, 2002

Page 2

Topics

Troubleshooting Basics

Troubleshooting Tools

DNS Troubleshooting

Troubleshooting Replication

Troubleshooting DCPromo

Troubleshooting FRS Replication and DFS

Troubleshooting Group Policy

Troubleshooting in .NET

Page 3

Troubleshooting Basics

Page 4

Basic Troubleshooting Steps

Define the problem (make sure there is one)

• What’s failing?

• Client authentication and security

• Group policy application.

• Replication.

• Name resolution.

• Errors and warnings in event logs.

• FRS/DFS

• Application

• How is the problem reproduced?

• One or multiple machines?

• Narrow the variables

Page 5

Basic Troubleshooting Steps

MPSReports_DS (from HP or Microsoft)

Get the Log files

• Event logs

– http://www.eventid.net

• %windir%\debug\usermode\Userenv.log

• %windir%\debug\DCPromo*.log

Turn on Verbose Logging

Run NetDiag, DCDiag (verbose)

Get status report from Replication Monitor.

Page 6

Basic Troubleshooting Steps

• Check DNS.

• Resolver on ALL computers.

• Name Server Properties (forwarding, etc.).

• Monitoring tab – test name resolution.

• Nslookup, ping to test name resolution.

• Ping SRV records.

• Check Replication.

• Force replication.

• Identify who isn’t replicating to whom.

• Outbound vs. inbound.

Page 7

Basic Troubleshooting Steps

If all else fails, try demoting.

• Really cleans up a lot of problems… If problem is isolated to one DC.

• If replication isn’t working, demotion won’t work.

• Reinstall to remove the AD, then clean up AD

• Ntdsutil to remove server object.

• Delete server object from Sites & Services.

• Delete FRS server object from System container.

• Can manually demote a DC.

Page 8

Manual Demotion of a DC

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ProductOptions

ProductType =

– ServerNT (when the computer is a Member Server)

– LanManNT (when the computer is a Domain Controller)

• Change from LanManNT to ServerNT

It’s now a “dirty” member server

Clean server objects from the AD (Ntdsutil)

Clean up the disk and Registry

1. Create new Forward Lookup Zone – Bogus.com

2. Run DCpromo – create new forest for Bogus.com

3. Demote and eliminate Bogus.com

4. Wait for Replication

5. Promote back into domain – use same name if desired

Tool in Windows .NET
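
The registry check at the top of this slide can be sketched in a few lines. This is an illustration only and not part of the original deck: the helper names `role_for_product_type` and `read_product_type` are hypothetical, the two value strings are the ones listed on the slide, and the registry read works only on Windows, which is why the `winreg` import is kept inside the function.

```python
# Sketch: interpret the ProductType value described on the slide.
# role_for_product_type and read_product_type are hypothetical helper names.

KEY_PATH = r"SYSTEM\CurrentControlSet\Control\ProductOptions"

ROLES = {
    "servernt": "member server",
    "lanmannt": "domain controller",
}


def role_for_product_type(value):
    """Map a ProductType string to the role named on the slide."""
    return ROLES.get(value.strip().lower(), "unknown")


def read_product_type():
    """Read ProductType from the local registry (Windows only)."""
    import winreg  # standard library, but only present on Windows

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
        value, _value_type = winreg.QueryValueEx(key, "ProductType")
    return value
```

On a Windows machine, `role_for_product_type(read_product_type())` reports which role the registry currently claims. The sketch only reads the value; it does not perform the change described above.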

Page 9

Troubleshooting Tools

Gathering Information

Page 10

Netdiag.exe

NETDIAG.EXE

/v - verbose – always turn this on.

/l - log – writes netdiag.log to default directory.

/d:domain controller – finds DC in domain.

/test: - runs only specified tests.

/skip: - skips specified tests.

Can’t execute remotely.

C:\>netdiag /v /l

Page 11

Netdiag.exe

Domain Controller Discovery

Bindings, IP address, Default Gateway tests

DNS tests

NBTstat and WINS ping

Netstat

Route

Trust

Kerberos

Page 12

Dcdiag.exe

DCdiag /v

Domain controller functions of netdiag

More domain-specific

FSMO roles

Connectivity

Replications

Domain controller locator

Intersite “health”

Topology integrity

Page 13

Nltest.exe

/server:servername – Sets default server

/dsgetdc:domainname [ /gc /timeserv /ldap ] – Dsgetdcname API

/dclist:domainname – Lists DCs in domain

/parentdomain – Lists parent domain

/dsgetsite – Lists site of server

/dsgetsitecov – Lists DC “covering” site

/dcname:domainname – Lists PDC for domain

/dcpromo – Tests potential success of DCPromo

/whowill:domain user – Returns name of DC that will authenticate user

Page 14

Netdom.exe

/join

/add

/reset

/resetpwd

/query FSMO

/trust

Page 15

NTDSUtil

• Built-in utility.

• Directly accesses Active Directory.

• Authoritative Restore.

– Can restore an older version of the AD and force it on all DCs to correct a variety of problems.

– Entire AD or single tree.

– Can’t restore the schema.

• FSMO Roles.

– List, Transfer, Seize roles.

– Better than UI – can manipulate all roles in forest and all domains from one utility.

Page 16

NTDSUtil

Metadata Cleanup

– Delete orphaned objects.

– Servers

– Domains

– The UI can and will lie to you! Don’t trust it.

Useful tool for listing contents of the AD

– Sites, domains, servers, FSMO role holders.

– Domains in site.

– Servers in domain, servers in site.

Q216364, Q216498, Q230306

Page 17

Gpresult.exe

Run on client

Returns:

• Security group membership

• User and Computer policy info

• GPOs applied to each

• Registry settings set in the GPO

• Client-side extensions set

– Scripts applied

Remember

• Policy is cached – reboot / login to clear

• Note which server authenticated the client

– Environment variable %LOGONSERVER%

Much Improved in .NET!

Page 18

GPOtool.exe

Run on domain controller.

Returns:

• Analysis of all GPOs in domain.

• GUID and friendly name of all GPOs.

• DS and Sysvol versions.

• Errors encountered.

Good group policy troubleshooting tool.

May take a long time to process (#GPOs)

Page 19

ADSIedit.exe

GUI much like the Users & Computers snap-in with Advanced Features turned on.

Graphical view of AD.

Like LDP.exe but:

• Easier to browse.

• Can modify attribute values

Don’t confuse with Users & Computers!

Page 20

LDP.exe

Takes time to set up:

• Connect

• Bind

• View – Tree

• Enter DN to start (blank for default)

Exposes attributes quickly, easy to see.

Faster than ADSIedit – no GUI to traverse.

LDAP searches.

Can delete and modify, but not as easy as ADSIedit.

Can execute remotely.

Page 21

DCPromo.log, DCPromoui.log

Located in %systemroot%\debug.

Logged every time dcpromo runs.

DCPromo.log

• Shorter.

• Appended (read bottom up).

DCPromoUI.log and DCPromoUI.xxxx.log

• Results of what is seen in the UI – longer.

• Find: Results of DsGetDcName, DNS query, Time service sync, authentication, replication, Site info.

• Error (0x0) = success – no error.

Error reporting different – read both logs.

Page 22

Userenv.log

Located: %systemroot%\debug\usermode

User environment info:

• Group policy (registry)

• Client side extensions

– Scripts

– Security

Increase verbose logging (Q221833)

Take time – read and study and you may be surprised at what you can find!

Page 23

Additional User Mode Logs

Client-side extensions

• Registry: see Q216357

HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\GPExtensions

• Errors created in %windir%\debug\usermode

– Named after the .dll

– Scripts = Gptext.dll = gptext.log

– Folder Redirection = fdeploy.dll = fdeploy.log

– Security = scecli.dll = winlogon.log

– Q245422

– Produced automatically on error (except winlogon.log)

– Check User Mode directory for these files

• Invaluable in debugging. Use them!

Page 24

Client Side Extensions (registry)

Page 25

Windows .NET Troubleshooting Tools

Page 26

Remote Desktop Resource Redirection

Client Resources Available when using Terminal Services Remote Desktop

• File System – Local drives and Network drives on Local Machine available on Remote machine

• Audio – Audio streams such as .wav and .mp3 files can be played through the client sound system.

• Port – Applications have access to the serial and parallel ports

• Printer – The default local or network printer on the client becomes the default-printing device for the Remote Desktop.

• Clipboard – The Remote Desktop and client computer share a clipboard

• Terminal Services Virtual Channel Application Programming Interfaces (APIs) are provided to extend client resource redirection for custom applications.

Page 27

WMI

Computer management

Active Directory

• Provider: MicrosoftActiveDirectory

• Classes:

– Replication - See replprov.mof %windir%\system32

Trust health

• Provider: MicrosoftHealthMonitor

• Classes: see system32\wbem\trusthm.mof

DNS

• Provider: MicrosoftDNS

• Classes: system32\wbem\dnsprov.mof

Cluster

• MSCluster

Also look in CIM Studio in MSDN

Page 28

WMIC Sample Commands

Look in %windir%\system32\wbem *.mof files for names of providers, classes, etc.

Active Directory

• Provider: MicrosoftActiveDirectory

• wmic /namespace:\\root\microsoftactivedirectory PATH msad_replneighbor

(shows replication partners)

• wmic /namespace:\\root\rsop\user path RSOP_GPO

(lists GPOs with User settings)

Page 29

Admin Tool Improvements

Users and Computers snap-in

• Drag and drop.

• Multi-select and edit user objects.

• Heavily revised object picker.

Users and Computers, Sites and Services, DNS Snap-ins

• Saved queries.

• Viewing Saved DS, DNS, FRS eventlogs on non-DCs!

.NET Adminpak (only on XP)

Page 30

Command Line Tools

GPresult

• Enhanced reporting

DCDiag

• dcdiag /test:DCPromo

Repadmin – enhanced reporting

Netdom – computername for DCrename

Others

Shipped on

• Service Pack 2 CD (install manually)

• .NET Server, AdvSvr CD

Page 31

Windows .NET Improvement to NTDSUtil

Change Offline, DS Repair Mode Password While Online!

NTDSUtil

• Set DSRM Password (main menu)

Increases server up-time, which in Win2K was limited by the password change interval.

• (Had to reboot to DS Repair mode to change.)

• Q223301 (Win2K limit)

Cool error message!

Setting password failed.

WIN32 Error Code: 0x6ba

Error Message: The RPC server is unavailable.

See Microsoft Knowledge Base article Q271641 at

http://support.microsoft.com for more information.

Page 32

Errors in Windows .NET: Kinder, Gentler, and Reported to Microsoft

Page 33

Active Directory Load Balancing Tool

Does the job of branch office deployment.

• KCC chooses BHS for connection objects – chooses the same one.

• Tool allows you to spread the load to other DCs in the site (that have that NC).

• ADLB tool modifies the Hub DC’s replication schedules to spread it out over time.

• Generates a log – like replmon’s status log.

• For deployments with hundreds of branch offices all replicating to a single hub.

• Tool gives no benefit to sites with only one DC per domain.

Page 34

Future: Graphical Replication Monitoring Tool

Very much like ‘Age of Directories’

Ability to make configuration changes

Not in .NET - maybe Longhorn or Blackcomb?

Page 35

Troubleshooting DNS

Page 36

DNS Resolver Configuration

Win2K clients, servers point to Win2K DNS Name Server that is SOA for their zone.

• Don’t point to ISP, other Internal NS.

(even as “additional”.)

• Keep it simple.

Win2K Name Servers forward to ISP or internal name server hosting registered domain.

Page 37

DNS Name Server Configuration Basics

• Dynamic updates = Yes.

• Active Directory Integrated Zone

• Select one “Primary”

– All other ADI Primary NS point to it for DNS

• Win2k Name Servers can:

– Forward to ISP or Internal NS.

– Use root hints (or modify root hints).

• Reverse Lookup Zones NOT required

– Needed only for tools - NSLookup

Page 38

ADI Primary and Standard Secondary mixed zone

• Only a DC can host an ADI primary zone

• Member Servers can host Secondary zone

• Synch off of an ADI Primary

[Diagram. Labels: Secondary, Secondary, ADI Primary, ADI Primary, ADI Primary]

Page 39

DNS Case Study

[Diagram. Domains: corp.net, na.corp.net, sa.corp.net, eu.corp.net. Labels: Zone xfers, Forwarding, Secondary zones]

Page 40

DNS Case Study

[Diagram. Domains: corp.net, na.corp.net, sa.corp.net, eu.corp.net. Query shown: find na.corp.net]

Page 41

With Conditional Forwarding Feature in Windows .NET Server…

[Diagram. Domains: corp.net, na.corp.net, sa.corp.net, eu.corp.net. Query shown: find na.corp.net]

Page 42

Problem: SRV records only in Root domain

[Diagram. Labels: corp.com; w2k.net; NA.w2k.net, EU.w2k.net; corp.com. Legend: Forwarder, Zone Xfer. Location of SRV: PDC, GC, Cname]

Page 43

Solution: Delegate _msdcs zone

[Diagram. Labels: corp.com; _msdcs, _tcp, _sites, _udp; w2k.net; NA.w2k.net, EU.w2k.net; _msdcs. Legend: Forwarder, Delegation. Location of SRV: PDC, GC, Cname]

Page 44

DNS Hotfix

Symptom: Replication breaks

Configuration: Using Secondary Zones for root _msdcs at child domains.

Problem: Serial Number of Secondary zone is higher than the primary – zone transfers stop.

Hotfix Q304653

• The Serial Number Is Decremented in DNS When You Reboot

• Solved in .Net
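
The failure mode on this slide follows from how a secondary decides whether to pull a zone: it transfers only when the primary's SOA serial looks newer than its own. The sketch below illustrates that comparison using RFC 1982 serial number arithmetic. The function names are hypothetical, and this is a model of the rule, not the DNS server's actual code.

```python
# Sketch: why a secondary holding a higher serial stops pulling transfers.
# Serials are 32-bit and compared with RFC 1982 serial arithmetic.

MODULUS = 2 ** 32
HALF = 2 ** 31


def serial_is_newer(candidate, current):
    """Return True if `candidate` is newer than `current`.

    A difference of exactly 2**31 is undefined in RFC 1982; this sketch
    treats it as "not newer".
    """
    difference = (candidate - current) % MODULUS
    return 0 < difference < HALF


def secondary_will_transfer(primary_serial, secondary_serial):
    """A secondary requests a transfer only when the primary looks newer."""
    return serial_is_newer(primary_serial, secondary_serial)
```

With a primary at serial 100 and a secondary at 101, `secondary_will_transfer(100, 101)` is false, so the secondary keeps serving stale data until the primary's serial moves past its own.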

Page 45

DNS Troubleshooting Basics

• Check DNS event log (and others).

• Check Location of DNS servers.

– Usually want Name Server in remote sites.

• Check population of SRV records.

– _msdcs; _tcp; _udp; _sites

– Need Kerberos, LDAP records for each DC.

– Correct address, etc.

– Can delete, repopulate by restarting netlogon.

• Check Delegations – correct names, IP.
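
To check SRV record population, it helps to have the list of names to query. The sketch below builds a set of well-known locator names under _tcp, _udp, _sites and _msdcs for one domain and turns them into nslookup commands. The function names, the example domain and the site name are assumptions for illustration; the list is not exhaustive, and the pdc record is registered only by the PDC emulator.

```python
# Sketch: SRV names to check for a domain, and nslookup commands for them.
# srv_names_to_check and nslookup_commands are hypothetical helper names.


def srv_names_to_check(domain, site, forest_root):
    """Return common locator SRV names for a domain, a site, and the forest."""
    domain = domain.rstrip(".")
    forest_root = forest_root.rstrip(".")
    return [
        f"_ldap._tcp.{domain}",
        f"_kerberos._tcp.{domain}",
        f"_kerberos._udp.{domain}",
        f"_ldap._tcp.{site}._sites.{domain}",
        f"_ldap._tcp.dc._msdcs.{domain}",
        f"_kerberos._tcp.dc._msdcs.{domain}",
        f"_ldap._tcp.pdc._msdcs.{domain}",  # PDC emulator only
        f"_ldap._tcp.gc._msdcs.{forest_root}",  # global catalogs, forest-wide
    ]


def nslookup_commands(names):
    """Build one nslookup command line per SRV name."""
    return [f"nslookup -type=SRV {name}" for name in names]
```

For example, `nslookup_commands(srv_names_to_check("na.corp.net", "Default-First-Site-Name", "corp.net"))` yields eight command lines that can be pasted into a command prompt on the machine being tested.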

Page 46

DNS Troubleshooting Basics

• Use of Active Directory Integrated (ADI) zones.

– Put standard secondary zones on member servers.

– Can clear problems by switching to Standard Primary.

• Ping DC by SRV record:

– ping <guid>.site._msdcs.compaq.com.

• Clear the server cache.

– Negative Caching problems.

• Test – Server Properties – Monitoring tab.

• Test – Ping names, NSLookup.

Page 47

Troubleshooting AD Replication

Page 48

Replication Troubleshooting Tools

Event logs – Directory Services, System

Sites and Services snap-in

Age of Directories (AOD) – HP

Replication Monitor

Aelita Event Admin

NetPro Directory Analyzer

Command Line (Support Tools & Res Kit)

DCdiag, Netdiag

Repadmin.exe

Page 49

Event Logs for Replication Troubleshooting

Directory Services Log

• 5778 - Subnets not mapped.

– Will break client’s “site awareness.”

• 1311 - serious - Not enough connectivity.

– Connectivity, traffic issue.

– Sites with DCs and no site links.

– Site topology incorrectly defined.

• DNS Lookup failure.

• 1772 – RPC Server is unavailable.

– Physical connectivity.

– DNS.

Page 50

Event Logs for Replication Troubleshooting

System Log

• Netlogon errors

– Authentication

– Trusts

– Secure channel

• w32Time errors

– Kerberos authentication required for replication

– DCs must be no more than five minutes out of sync.

– Watch time zones!
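
The five-minute rule and the time zone warning can be combined into one check. The sketch below compares two clock readings as timezone-aware values, so that a DC set to the right wall-clock time in the wrong zone is caught. The function name is hypothetical, and the five-minute limit is the figure given on the slide.

```python
# Sketch: compare two DC clocks against the five-minute limit from the slide.
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(minutes=5)


def within_allowed_skew(time_a, time_b, max_skew=MAX_SKEW):
    """Return True if two timezone-aware clock readings differ by <= max_skew."""
    if time_a.tzinfo is None or time_b.tzinfo is None:
        # Naive datetimes would hide exactly the time zone mistake
        # the slide warns about.
        raise ValueError("use timezone-aware datetimes")
    return abs(time_a - time_b) <= max_skew
```

Two DCs that both display 12:00, one in UTC and one in UTC-5, are five hours apart and fail this check even though their clocks look identical on screen.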

Page 51

Sites and Services Snap-in

Check for duplicate connection objects.

• KCC generating >1 connection between 2 DCs.

• Delete all connections and select “check replication topology” option to regenerate them.

• If they come back, find out why.

– Usually a DNS problem.

• Breaks FRS and AD replication.

Page 52

Sites and Services Snap-in

Check for sites with no DC’s…

• OK to have a site with no servers if you plan it that way.

• If there should be a server in that site, find it and move it there.

Make sure all subnets are mapped to correct sites.

• Keep up on IP addressing changes.
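
A quick way to audit subnet mappings is to take a list of client or DC addresses and see which site each one lands in. The sketch below does this with Python's `ipaddress` module, assuming that the most specific matching subnet wins. The function name, subnets and site names are made up for illustration.

```python
# Sketch: map an IP address to a site through subnet definitions.
# Assumes the most specific (longest prefix) matching subnet wins.
import ipaddress


def site_for_address(address, subnet_to_site):
    """Return the site for `address`, or None if no subnet covers it."""
    ip = ipaddress.ip_address(address)
    best = None
    for cidr, site in subnet_to_site.items():
        network = ipaddress.ip_network(cidr)
        if ip.version != network.version or ip not in network:
            continue
        if best is None or network.prefixlen > best[0].prefixlen:
            best = (network, site)
    return best[1] if best else None
```

An address that returns `None` belongs to an unmapped subnet, which is the condition reported by the "subnets not mapped" event described earlier.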

Page 53

Sites and Services Snap-in

Make sure site links are correct.

• Link correct sites per design (need a drawing).

• Cost, schedule, replication frequency.

Force replication between DCs.

• All connections are inbound.

• Use “check replication topology.”

• Create new site, user named for the DC.

– Checks Configuration NC and Domain NC.

– Force Replication Between Replication Partners.

– On DC1 from DC2 and on DC2 from DC1.

Page 54

Sites and Services Snap-in

• Validate inbound, outbound replication on all DCs.

– Create new site, user named for the DC.

– Checks Configuration NC and Domain NC.

– Wait for replication (don’t force it).

– Check each DC for copy of these users, sites.

Objects found on each DC:

DC1

User  Site

DC1   DC1

DC2   DC2

DC3

DC2

User  Site

DC2   DC2

DC3   DC3

DC3

User  Site

DC1   DC1

DC3   DC3
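
The validation on this slide amounts to creating one marker object per DC, waiting, and then listing which markers each DC can see. A minimal sketch of the comparison step follows. The function name is hypothetical, and the input is assumed to have been collected by hand or by some other tool.

```python
# Sketch: find which DCs are missing marker objects created on other DCs.
# Markers are assumed to be named after the DC on which they were created.


def missing_markers(seen_by_dc):
    """Map each DC to the sorted list of markers it has not received."""
    all_dcs = set(seen_by_dc)
    report = {}
    for dc, seen in seen_by_dc.items():
        missing = sorted(all_dcs - set(seen))
        if missing:
            report[dc] = missing
    return report
```

Using the user objects from the example above, `missing_markers({"DC1": {"DC1", "DC2", "DC3"}, "DC2": {"DC2", "DC3"}, "DC3": {"DC1", "DC3"}})` reports that DC2 never received DC1's object and DC3 never received DC2's, which points at the inbound paths to investigate.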

Page 55

Check Cname DNS Records

• In root _msdcs zone (only), alias record mapping DC’s FQDN to its server GUID.

Only one record.

– Delete duplicates.

Match GUID in alias record to GUID reported by Repadmin /showreps.

If in doubt, delete DC’s Alias record(s) and re-start netlogon on broken DC to re-register.
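
The alias checks on this slide (exactly one record per DC, GUID matching the one reported by Repadmin /showreps) can be sketched as a small comparison. The function name is hypothetical, and the alias records and GUID are assumed to have been copied by hand from the DNS console and from Repadmin output.

```python
# Sketch: validate a DC's alias (CNAME) record in the root _msdcs zone.
# alias_records is a list of (alias_name, target_fqdn) pairs.


def check_dc_alias(alias_records, dc_fqdn, server_guid, forest_root):
    """Return a list of problems found for one DC (empty list means OK)."""

    def norm(name):
        return name.rstrip(".").lower()

    expected_alias = norm(f"{server_guid}._msdcs.{forest_root}")
    mine = [alias for alias, target in alias_records
            if norm(target) == norm(dc_fqdn)]

    problems = []
    if not mine:
        problems.append("no alias record found")
    if len(mine) > 1:
        problems.append("duplicate alias records")
    for alias in mine:
        if norm(alias) != expected_alias:
            problems.append(f"GUID mismatch: {alias}")
    return problems
```

An empty result means the DC has exactly one alias record and it carries the expected GUID. Any other result is a candidate for the delete and re-register step above.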

Page 56

Age Of Directories Tool - Demo

If interested, contact me.

Page 57

Replication Monitor

Status report (replication health report)

List of all GCs, BHS, Trusts

List of all replication errors on all DCs in domain

Changes not replicated

Replication partners

Force push/pull replication

Meta-data

Group Policy Object status

FSMO validation

Inbound connections (including reason)

Page 58

Replication Monitor

Page 59

Command-Line Utilities

RepAdmin

• In Support Tools.

• Perhaps the most useful tool for troubleshooting replication.

• /showreps - lists inbound, outbound connections.

– Only one to list outbound connections.

– Lists Server GUID (used for replication).

– Lists successful replication messages.

– Lists replication errors.

– Lists Replication partner used to replicate every naming context – inbound and outbound.

Page 60

NTDS Diagnostic Logging

HKLM\system\CCS\Services\NTDS\diagnostics

• Set value = 0-5

– 0 = off 5=very verbose

– Start with 3 to begin with

– Reported in Event log

• Important Values

1 Knowledge Consistency Checker

13 Name Resolution

5 Replication Events

8 Directory Access

9 Internal Processing

18 Global Catalog
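
Setting several of these values by hand is error-prone, so one option is to generate a .reg file and import it. The sketch below builds such a file for the values listed on this slide at the suggested starting level of 3. The function name is hypothetical, the key path expands the slide's "CCS" abbreviation to CurrentControlSet, and the output should be reviewed before it is imported on a DC.

```python
# Sketch: generate .reg text that sets NTDS diagnostic logging levels.
# diagnostics_reg_text is a hypothetical helper name.

KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics"

IMPORTANT_VALUES = [
    "1 Knowledge Consistency Checker",
    "5 Replication Events",
    "8 Directory Access",
    "9 Internal Processing",
    "13 Name Resolution",
    "18 Global Catalog",
]


def diagnostics_reg_text(level=3, value_names=IMPORTANT_VALUES):
    """Return REGEDIT4-format text setting each value to `level` (0-5)."""
    if not isinstance(level, int) or not 0 <= level <= 5:
        raise ValueError("level must be an integer from 0 to 5")
    lines = ["REGEDIT4", "", f"[{KEY}]"]
    lines += [f'"{name}"=dword:{level:08x}' for name in value_names]
    return "\n".join(lines) + "\n"
```

Calling `diagnostics_reg_text(0)` produces the matching file to turn the logging back off once the problem has been captured in the event log.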

Page 61

Things that break Replication (or indicate that it’s broken)

Duplicate connection objects

Orphaned objects

• Esp. DC objects, caused by a DC being removed from the domain without successful DCPromo.

• Garbage Collection initiated manually before all DCs and GCs are fully replicated.

• Reported in event logs.

Page 62

Things that break Replication (or indicate that it’s broken)

DC unavailable

• Down

• Name Resolution

• Network problem

DNS misconfigured

• TCP/IP addresses change

– Delegation

– Client resolver configuration (including name servers)

– DHCP scope configuration for DNS registration

• Failure to Contact a DNS server (for SRV records)

Page 63

Things that break Replication (or indicate that it’s broken)

KCC doesn’t do its job

• Routes around inaccessible DCs by creating duplicate connection objects.

• When DCs come back on line, KCC should clean up the duplicate connection objects.

– Usually doesn’t…

– Causes replication errors.

– Events in the DS Log.

– Need to clean them up manually.

Page 64

Lingering Object Behavior

Basics

Scenarios

Page 65

Object Deletions

Deleted objects turn into tombstones

• Tombstones replicated to other DCs

• This is how replication partners learn that an object was deleted

Tombstones purged from local database after tombstone lifetime has expired

• AD: 60 days, adjustable (2 days minimum)

• Sysvol: 60 days

If tombstone does not replicate to a DC, object deletion is not replicated

• Object not deleted on this DC

• Object is now a Lingering Object

• Can be on DC or GC

Rule: tombstone lifetime =

• Max time DC can be disconnected

• Max lifetime of Backup tape

Page 66

Lingering Objects – Scenarios

Deleted object re-appears on all domain controllers in a domain and on all GCs

Deleted account does not disappear from Exchange GAL

Object was moved between domains and disconnected GC is brought online

Replication error on GC when new object is created

• Lingering object still holds attribute where uniqueness is enforced (samAccountName)

• Exchange cannot create mailbox because object already exists

Page 67

Why does this Happen????

DCs disconnected for more than tombstone lifetime

• Left in storage room for long time

• Replication failures

– I.e., bridgehead servers overloaded, no monitoring in place

• WAN connections down for a long time

• Tombstone lifetime abuse

– “Somebody” changed time on a DC to garbage collect an object

– Tombstone lifetime was changed to garbage collect objects on single servers

Can this be avoided?

• YES, monitor KCC topology and replication

• Do not set tombstone lifetime to less than 60 days

• DCs offline > tombstone lifetime must be re-promoted

Page 68

Lingering Objects

Strict vs. Loose Replication Behavior

Replication Behavior

• Defines how DC reacts if an update for an object is replicated in, and the object does not exist on DC

Loose Behavior

• DC requests full copy from replication source

• Logs event ID: 1388

Strict Behavior

• DC stops replication from offending replication source

• Logs error code 8240 (ERROR_DS_NO_SUCH_OBJECT) embedded in event ID 1084

• Requires logging level 1

Behavior can be set via registry key

• HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\NTDS\Parameters\Strict Replication Consistency

• Introduced in Q314282
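
The difference between the two behaviors can be shown with a toy model of one inbound update. The sketch below is not how the directory service is implemented: the function and exception names are hypothetical, and objects are plain dictionaries. It only mirrors the reactions and event numbers listed on this slide.

```python
# Sketch: toy model of loose vs. strict handling of an update for an
# object that does not exist on the destination DC.


class ReplicationBlocked(Exception):
    """Raised when strict behavior stops replication from a source."""


def apply_inbound_update(local, source, object_id, changed, strict):
    """Apply one replicated change to `local`, a dict of object dicts."""
    if object_id in local:
        local[object_id].update(changed)
        return "updated"
    if strict:
        # Destination refuses the update and stops replicating from
        # this source until the lingering object is dealt with.
        raise ReplicationBlocked(
            "event ID 1084, error 8240 (ERROR_DS_NO_SUCH_OBJECT)")
    # Loose: pull the whole object, so a deleted object reappears here.
    local[object_id] = dict(source[object_id])
    return "full copy requested (event ID 1388)"
```

In the loose case the object is re-created on the destination, which is how a deleted object can come back on every DC. In the strict case the destination stays clean but replication from that source stops until someone cleans up.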

Page 69

Deleting Lingering Objects

If found on a DC

• In loose behavior: Delete the object via users and computers

• In strict behavior: Follow procedures outlined in Q314282

On GC (in read-only NC)

• Object cannot be changed or deleted on GC

• Solution 1: Delete object on writeable replica (if possible)

• Solution 2: Use ldp to delete the object on the GC

– Support to remove lingering objects from GC added in Q314282

– Follow procedures outlined in Q314282

You might have to set loose behavior temporarily

Page 70

Best Practice Recommendations

DC has not replicated for more than 60 days

• Tombstone lifetime default (60 days)

– Do not replicate, re-install OS

• Tombstone lifetime adjusted to > 60 days

– 60 days < time DC disconnected < tombstone lifetime

– Re-connect DC, restore sysvol

– Time DC disconnected > tombstone lifetime

– Do not replicate, re-install OS

If you have to disconnect a DC

• Make sure that it replicates successfully before you take it off-line

New deployments

• Add registry key to enforce strict replication behavior at DC OS installation time
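
The recommendations for a DC that has been offline can be written as a small decision function. The sketch below follows the cases on this slide. The function name is hypothetical, and two points are assumptions not stated on the slide: a DC offline for exactly the tombstone lifetime is treated as too old, and a DC offline for 60 days or less is simply reconnected.

```python
# Sketch: what to do with a DC that has been disconnected for some time.
# action_for_disconnected_dc is a hypothetical helper name.

SYSVOL_LIFETIME_DAYS = 60  # Sysvol tombstone lifetime from the earlier slide


def action_for_disconnected_dc(days_disconnected, tombstone_lifetime_days=60):
    """Return the recommended action as a short string."""
    if days_disconnected >= tombstone_lifetime_days:
        # Assumption: the boundary case is handled conservatively.
        return "do not replicate; re-install OS"
    if days_disconnected > SYSVOL_LIFETIME_DAYS:
        return "re-connect DC; restore sysvol"
    # Assumption: this case is not covered on the slide.
    return "re-connect DC"
```

For example, with the tombstone lifetime raised to 180 days, a DC that has been offline for 90 days can be reconnected but needs its sysvol restored, while one that has been offline for 200 days should be rebuilt.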

Page 71

More Best Practice Recommendations

Existing deployments

• Default setting: Loose replication (even on SP3)

• Goal: Get to strict mode asap

• Set registry key to strict mode on all DCs

• Watch event logs on DCs

– If you get many replication errors on single DCs, re-promote DC

– For small number of replication errors, clean-up the DC

– Delete lingering objects if necessary

– Follow procedures outlined in Q314282

• If you were monitoring…

– Then don’t worry, you won’t see any replication errors

Don’t lower tombstone lifetime to less than 60 days

Monitor!

Page 72

Lingering Object Fix

Q317097 (good instructions)

HKLM\System\CurrentControlSet\Services\NTDS\Parameters…

• Add Value Name = Correct Missing Object

• Data Type =REG_DWORD

• Value = 1 (tight)

0 (loose)

Allows or Restricts AD replication when lingering objects are discovered.

• Tight when you want to know.

• Loose to inventory and remove the objects.

Page 73

WNT: Object Replication

• change to attribute or value

W2K: Attribute level replication

• Better than NT (more efficient)

• Change to attribute replicates attribute

• Change to value replicates attribute

• Problem: Multi-Valued Attributes

– Group = Attribute


– Member = Value

– Change Member = replicate attribute with all members

– Impacts network traffic

– Limit (per Microsoft) of 5,000 users/group

.NET: Value Level Replication

• Replicates values – not attributes

• Eliminates 5,000 user/group limit

Value Level Replication
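
The traffic difference between the two models is easy to see with a count of how many member values travel to each replication partner after one change. The sketch below is a deliberately simple model with a hypothetical function name. It ignores per-update overhead and counts values only.

```python
# Sketch: member values sent to each partner after a group membership change.


def values_replicated(member_count, changed_members, value_level):
    """Return how many member values are sent for one change.

    member_count is the group size after the change.
    """
    if value_level:
        # Value-level replication: only the changed values travel.
        return changed_members
    # Attribute-level replication: the whole member attribute travels.
    return member_count
```

Adding one user to a 5,000-member group sends 5,000 values under attribute-level replication and one value under value-level replication.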

Page 74

Domain Limit

There is a limit of about 800 child domains to a single parent

Child domains are unlinked, multi-valued attribute – stored in the crossref attribute of the domain object

Jet database limits the data that can be stored. No way to patch – must change Jet

“Might” be improved in Longhorn (not Whistler)

Page 75

Domain Limit

One customer got to 900 domains

• Replication failed

• Authentication failed

• Mission critical application failed

Temporary Repair

• Demote all domains in reverse order of creation to return to 800

• Fixed Replication

Solution

• Redesigned and redeployed to a single domain

Page 76

DCPromo Troubleshooting

Page 77

DCPromo Basics

First Test of:

• DNS registration and resolution.

• LDAP query and response.

• Kerberos authentication.

• Active Directory replication.

• FRS replication.

• Application of group policy.

Validation and Flow …

• Chapter 2, Active Directory Data Storage in the Windows 2000 Resource Kit

Page 78

DCPromo Logs

%windir%\debug

• Dcpromo.log

• Dcpromoui.log

• Dcpromoui.xxx.log

Set verbosity on dcpromoui.log

• HKLM\Software\Microsoft\Windows\CurrentVersion\AdminDebug

• Values: DCpromo and DCPromoui

• Data

– 380001 = Default

– 0xFF003 – full file and debugger logging output

– 0xFF001 – maximum detail to DCPromoui.log
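As a sketch of how the verbosity values could be set from a script, the helper below only builds a `reg add` command line for the key and value names listed on this slide. The REG_DWORD type and the helper name `reg_add_command` are assumptions; the slide does not state the value type, so check it before running anything.

```python
# Key and value names are taken from the slide; the REG_DWORD type is an assumption.
ADMIN_DEBUG_KEY = r"HKLM\Software\Microsoft\Windows\CurrentVersion\AdminDebug"


def reg_add_command(value_name: str, data: int) -> str:
    """Build (but do not run) a reg.exe command that sets one logging value."""
    return (
        f'reg add "{ADMIN_DEBUG_KEY}" /v {value_name} '
        f"/t REG_DWORD /d 0x{data:X} /f"
    )


if __name__ == "__main__":
    # Maximum detail to DCPromoui.log, per the slide.
    print(reg_add_command("DCPromoui", 0xFF001))
```

Printing the command rather than executing it keeps the sketch safe to run on any machine.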

Page 79: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

DCPromo Phases

Initialization

• UI Input

• DNS Name resolution

• LDAP Query/resp

• Kerberos Authentication

AD Replication

FRS Replication

Wrap Up

• Apply policy

• Upgrade Trusts

• Publish new DC in the DS

Page 80: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Initialization Phase

Authorization error

• Enterprise Admin required to create new domain (or to remove the last one).

• Domain Admin required to add replica DC (or demote a replica).

Can’t find DNS with Dynamic Updates.

• Prompt to let DCPromo configure DNS.

– Creating domain.

– Answer NO!

Replica DCs and child domains must find a DNS server to locate a “sourcing DC.”

Page 81: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Errors Creating the Computer Account

Need privileges to create the account.

DCPromo first creates the account and puts it in the domain's Computers container.

It then moves the account to the Domain Controllers OU.

Source DC identified in DCPromo logs.

Page 82: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

DCPromo Initialization Checklist

Privileges required

• Enterprise Admin if creating new domain.

• Domain Admin if creating a replica.

System time configured properly

• Kerberos requires sync within five minutes.

• All parent, child domain DCs.

Sufficient free disk space.

• ~850 MB

Domain Naming Master FSMO required if creating new domain.
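The time-sync item in the checklist can be sketched as a simple comparison. The five-minute limit comes from the slide; the function name and the idea of comparing two already-collected timestamps are assumptions for illustration.

```python
from datetime import datetime, timedelta

# Five minutes is the Kerberos tolerance quoted on the slide.
MAX_SKEW = timedelta(minutes=5)


def within_kerberos_skew(local_time: datetime, dc_time: datetime,
                         max_skew: timedelta = MAX_SKEW) -> bool:
    """Return True when the two clocks differ by no more than max_skew."""
    return abs(local_time - dc_time) <= max_skew


if __name__ == "__main__":
    dc = datetime(2002, 10, 6, 12, 0, 0)
    print(within_kerberos_skew(datetime(2002, 10, 6, 12, 4, 0), dc))
    print(within_kerberos_skew(datetime(2002, 10, 6, 12, 6, 0), dc))
```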

Page 83: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

DCPromo Initialization Checklist

Everyone or Enterprise DC group has “Access this computer from network”

Enterprise DC group rights:

• Manage Replication Topology.

• Replicating Directory Changes.

• Replication Synchronization.

Sourcing DC

• Security policy applied.

• “Enable computer and user accounts to be trusted for delegation” right granted.

Page 84: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

DCPromo Initialization Checklist

Target DC has valid Kerberos tickets.

• Kerbtray.exe utility from Resource Kit.

GC must be contacted.

• Nltest /dsgetdc:compaq.com /gc

Able to contact a functional existing DC.

• Uses UDP (watch for firewall issues).

– Can use TCP but it’s a Microsoft Secret!

• Use Ping, NLTest, Nslookup to find a DC.

Page 85: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

If Source DC not Reachable...

See if one responds.

• Ping FQDN of domain (Ping compaq.com).

• NLTest /dsgetdc:compaq.com /ds

– Other: /gc /pdc /timeserv

• Check Site mapping for this computer.

– Nltest /server:<name> /dsgetsite

Check Dcpromoui.log to see source.

Force DCPromo to use a specific source

• Q224390

• Turn off Netlogon on other DCs.

Join the Server to the domain then DCPromo.

Page 86: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Info to Collect for Debug

Netdiag /v

• Problem DC

• Source DC (see dcpromo.log)

DCDiag /v

• Source DC

Replication working? (other DC in site)

Page 87: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

AD & FRS Replication Phases

Initially, an inbound connection is created to replicate from the source DC.

• Machine acct (DC1$) moved to DC OU.

– UserAccountControl Attribute set

– 4096 (1000 hex) = Workstation/Server

– 532480 (82000 hex) = DC

– Account is moved.

• Error: DC1$ not found, access denied, etc.

– Credentials of account running Dcpromo

– Source must have computer object.

– Source must have security policy applied to itself.

– Q250874
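The two UserAccountControl values above are bit masks, so they can be decoded rather than memorized. A minimal sketch follows; the flag table covers only the three bits needed for these two values, and the helper names are made up for illustration.

```python
from typing import List

# Only the bits needed to explain 4096 and 532480; the full attribute has more flags.
UAC_FLAGS = {
    0x1000: "WORKSTATION_TRUST_ACCOUNT",
    0x2000: "SERVER_TRUST_ACCOUNT",
    0x80000: "TRUSTED_FOR_DELEGATION",
}


def decode_uac(value: int) -> List[str]:
    """Return the names of the known flags set in a UserAccountControl value."""
    return [name for bit, name in sorted(UAC_FLAGS.items()) if value & bit]


def is_dc_account(value: int) -> bool:
    """A DC machine account carries the SERVER_TRUST_ACCOUNT bit."""
    return bool(value & 0x2000)


if __name__ == "__main__":
    for uac in (4096, 532480):
        print(uac, hex(uac), decode_uac(uac), is_dc_account(uac))
```

Reading the value this way makes it clear why an account still showing 4096 on the source DC has not yet been flipped to a DC account.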

Page 88: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

AD & FRS Replication Phases

After first reboot…

• Outbound connection created.

• AD changes for new DC replicated to source.

– Including UserAccountControl attribute.

– Server (Replication) object.

– Replicated to other DCs.

• Sysvol is populated (policies copied to new DC).

• Sysvol and Netlogon Shares created.

Page 89: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Troubleshooting Missing Sysvol, Netlogon Shares

Outbound connection failed

• Look in Sites and Services or Repadmin

• UserAccountControl still 4096 on source

[Q257338] – Good but …

• Build manual “outbound” connection

• Force KCC to “Check Replication Topology”

• Check UDP traffic if in a remote site.

Page 90: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Missing Sysvol and Netlogon Shares

Create replication “links” manually then force replication:

• Repadmin /add (adds outbound link)

• Repadmin /sync (forces replication)

The Sysvol and Netlogon shares can’t be created manually. When replication is fixed, they’ll get created.

Page 91: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Tracking Down a GUID

Problem: GUID referenced in event log. What is it?

Solution: (Q216359)

• LDP – search for the GUID

• Search.vbs in Support tools

Orphaned Object (will kill replication)

• Turn up NTDS diagnostic logging

– Internal processing

– Replication

• Find object (GUID) in event logs

• Delete it via LDP

Page 92: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

DCPromo Improvements in Windows .NET

Page 93: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Install From Media (IFM)

Source Replica AD from Media in DCPromo

• GCs or DCs (Replica only).

• No initial replication from a DC.

– Faster (no searching for a DC).

– Less network impact (No full sync on the WAN).

– Easy branch office installation.

• After initial load, replicates changes.

• Network connectivity still required.

• Unattended Answer File Support:

– ReplicateFromMedia

– ReplicationSourcePath

Page 94: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Install From Media (IFM)

Unattended Answer File Support

• ReplicateFromMedia

• ReplicationSourcePath

Media must be local drive.

Media useful life < 60 days.

How?

Use Backup Files/Media

• Create first DC in domain.

• Back up DC.

• Restore to Media (local disk, CD, …).

• C:\>dcpromo /adv

• Wizard produces an additional screen…

Page 95: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer
Page 96: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

DCPromo Answer File

See Q223757

[Unattended]

Unattendmode=fullunattended

[DCINSTALL]

UserName=administrator

Password=Password3

UserDomain=corp.net

DatabasePath=c:\windows\ntds

LogPath=c:\windows\ntds

SYSVOLPath=c:\windows\sysvol

SafeModeAdminPassword=Password2

CriticalReplicationOnly

SiteName=Seattle

ReplicaOrNewDomain=Replica

ReplicaDomainDNSName=corp.net

ReplicationSourceDC= ! Leave this blank for IFM

ReplicateFromMedia=yes

ReplicationSourcePath=e:\DSrestore

RebootOnSuccess=yes
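The IFM-related settings above can be sanity-checked before running DCPromo. The sketch below works on a plain dictionary of [DCINSTALL] values and applies only rules stated on these slides (source path required, media on a local drive, ReplicationSourceDC left blank). The function name and the UNC test used to approximate "local drive" are assumptions.

```python
from typing import Dict, List


def check_ifm_settings(dcinstall: Dict[str, str]) -> List[str]:
    """Return a list of problems found in IFM-related answer file settings."""
    problems: List[str] = []
    if dcinstall.get("ReplicateFromMedia", "").lower() != "yes":
        return problems  # not an IFM install, nothing to check

    path = dcinstall.get("ReplicationSourcePath", "")
    if not path:
        problems.append("ReplicationSourcePath is missing")
    elif path.startswith("\\\\"):
        # Assumption: a UNC path means the media is not on a local drive.
        problems.append("ReplicationSourcePath must be on a local drive")

    if dcinstall.get("ReplicationSourceDC", "").strip():
        problems.append("ReplicationSourceDC should be blank for IFM")
    return problems


if __name__ == "__main__":
    print(check_ifm_settings({
        "ReplicateFromMedia": "yes",
        "ReplicationSourcePath": r"e:\DSrestore",
        "ReplicationSourceDC": "",
    }))
```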

Page 97: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

File Replication Service (FRS) Basics

Page 98: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

FRS Background

File Replication Service

• Replicates file system portion of policy

• Optional replication engine for DFS

Concepts

Challenges

• Journal wraps

• Staging File backlog

• Reconciliation / Morphed Directories

Page 99: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Concepts

Objects in DS

• Members, Subscribers, Conn. objects, filters

• Depends on AD replication

• Determines partners and schedule

NTFS USN Journal

• Used by FRS to track changes to NTFS volumes

Staging File and Directory

• Rename safe

• Compression support

Database

• Record of incoming, outgoing & existing files

Page 100: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

File Replication Service (FRS)

Replaces NT 3.x/4.0 LMREPL service

Replicates SYSTEM Policy, Group Policy, DFS

• Group policy templates

• Ntconfig.pol & logon scripts for down-level clients

– NETLOGON Share

• DFS share contents

Multi-threaded replication engine

• Replicate different files to different computers simultaneously.

Page 101: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Terminology

• Computer A and B replicate DFS+SYSVOL

• B is computer A’s outbound partner

• A is B’s inbound partner.

• A is B’s “upstream” partner

• Changes flow “downstream” to B

[Diagram: Computer A (upstream) replicates to Computer B (downstream). B is A’s outbound partner; A is B’s inbound partner.]

Page 102: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Basic Operation

1. Change created on DC1 (GPO)

2. Temp file moved to staging directory

3. Notify replication partners (replicas) of changes

4. Partners (DC2) pull changes from DC1

Page 103: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

File and Folder Filters

Excluded from FRS Replication:

• Computer specific EFS files/folders

• File names beginning with ~

• Files with .bak or .tmp extensions

• NTFS Mount Points

• Reparse points

Configurable for DFS shares

Page 104: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

The Replication Process

[Diagram, DC1: GPO change. Paths shown: \winnt\sysvol\sysvol\compaq.com\policies, \winnt\sysvol\staging\domain and \winnt\sysvol\staging areas\compaq.com. Labels: AD object version updated; notify partners.]

Page 105: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

The Replication Process

[Diagram, DC2: DC2 pulls the change from DC1. Paths shown: \winnt\sysvol\sysvol\DO_NOT_REMOVE_ntfrs_PreInstall_Domain and \winnt\sysvol\sysvol\compaq.com\policies (GPT.ini). Label: Sysvol version of GPO updated.]

Page 106: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

FRS Replication

Observe File Replication Process

• Edit a group policy – modify and save it.

• Copy of changed file goes to staging and staging areas directories.

• Copied to staging/staging areas directories on other DCs.

• Moved to sysvol\sysvol directory on the DC.

• Group policy file is updated.

Page 107: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Distributed File System (DFS)

Page 108: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

DFS Basics

Domain-based (Win2K) vs Standalone (NT)

Root

• Must be on a DC.

• Contains PKT.

• DFS service.

Replica

• PKT from DC, stored locally.

• DC or Member Server.

FRS Replicates Data between DCs

• On member servers, data is replicated to the share via the DFS service.

Site Aware (clients locate “closest” DFS Replica)

Page 109: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

The DFS Replication Process

[Diagram: DC1 (root) with replicas DC2, SVR1 and SVR2; data moves between them via FRS and the DFS service.]

Page 110: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

DFS Troubleshooting

Symptom: Shared folders not in sync.

Make Sure DFS service is started on all servers and DCs.

Make sure AD Replication is working.

Make sure FRS is working.

DFSUtil.exe.

Watch for applications that keep files open.

• Anti-virus.

• Defragmenters.

Page 111: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

FRS Troubleshooting Techniques

Page 112: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Basics

Remember…

• You MUST install latest service pack and hot fix.

– Post SP2 (SP3) Hot fix Q307319

– Don’t go any further until this is installed.

• “Multi-master” replication spreads changes (and problems) quickly. Stop the FRS service to get control.

• FRS depends on AD Replication, which depends on DNS.

Page 113: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Diagnostic Tools

Event Viewer: FRS log, DS Log

NTFRSutl.exe

• outlog – outbound logs

• inlog – inbound logs

• ds – directory service

NTFRSxxx.log in \winnt\debug

NTFRS Health Check utility

• HP, Microsoft

Netdiag, DCDiag

AD replication tools

Page 114: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

FRS Replication

What happens if it breaks?

• Changes not replicated to all DCs, resulting in inconsistent AD

• Group policy gets out of sync and may not get applied.

– GPOTool: Version mismatch

• Logon scripts don’t get applied.

• DFS shares out of sync.

Page 115: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

FRS Replication

How to tell if it’s broken

• Events in FRS log

– Event 1000, 1001 in app log every five minutes.

• Files backed up in staging areas

– Get size of staging directories (MB).

– Get date of oldest file (how long it has been broken).

• Group Policy not applied (new changes)
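The two staging-area measurements (total size and date of the oldest file) are easy to script. The following is a minimal sketch that walks any directory tree; the function name is an assumption, and the path you pass in is whatever staging directory applies on the DC being checked.

```python
import os
from typing import Optional, Tuple


def staging_summary(path: str) -> Tuple[int, Optional[float]]:
    """Return (total bytes, modification time of the oldest file) under path.

    The second element is None when the tree contains no files.
    """
    total = 0
    oldest: Optional[float] = None
    for root, _dirs, files in os.walk(path):
        for name in files:
            info = os.stat(os.path.join(root, name))
            total += info.st_size
            if oldest is None or info.st_mtime < oldest:
                oldest = info.st_mtime
    return total, oldest
```

A healthy staging area should report close to zero bytes; a large total with an old timestamp suggests how long replication has been backed up.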

Page 116: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Ensure DNS is working.

• DNS Lookup Failures in events (description).

• Ping, Nslookup to resolve names.

– Domain name

– DC, Server names

Ensure AD Replication is working.

• Create New Objects and see if they replicate.

• Repadmin /showreps and /showconn

• DS Event Log

• DCDiag

Replication Problems

Page 117: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Staging Areas should have no files

• Common FRS problem.

• Check size of dir, date of files.

Ensure FRS is working.

• Create text file on each DC, named for the DC.

• Put it in \winnt\sysvol\sysvol\<domain name>.

• All DCs should have a copy of all DCs’ text files.
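The text-file test above amounts to a set comparison: every DC should end up holding one file per DC. A small sketch, assuming you have already collected a directory listing from each DC and that each marker file is named `<DC>.txt` (the naming and the function name are illustrative choices, not part of the slide):

```python
from typing import Dict, Iterable, List


def missing_canaries(seen_on: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Map each DC to the marker files it has not received yet.

    seen_on maps a DC name to the file names found in its Sysvol folder.
    DCs that already hold every marker file are left out of the result.
    """
    expected = {f"{dc}.txt" for dc in seen_on}
    report: Dict[str, List[str]] = {}
    for dc, files in seen_on.items():
        missing = sorted(expected - set(files))
        if missing:
            report[dc] = missing
    return report


if __name__ == "__main__":
    print(missing_canaries({
        "DC1": ["DC1.txt", "DC2.txt"],
        "DC2": ["DC2.txt"],
    }))
```

In the usage example, DC2 has not received DC1's file, so replication from DC1 to DC2 is the path to investigate.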

Replication Problems

Page 118: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

FRS Event Log

• 13508 – Normal…but watch them

• 13509 – success after having 13508s

• 13514 – When Sysvol share not created “FRS preventing computer from becoming a DC”

• 13553,13554 – FRS successfully added computer to replica set (DCPromo successful)

• 13557 – Duplicate Connection Objects

• 13522 – Staging area full Q264822

• Lots of KB Articles: Search for “FRS and Event”

Replication Problems

Page 119: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

\WINNT\DEBUG

Identify errors, warning messages and milestone events in the log files

Very difficult to interpret

Interpreting the Logs NTFRS_000x.log

Page 120: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

NTFRSutl.exe

Ntfrsutl inlog = Lists inbound log

Ntfrsutl outlog = Lists outbound log

Ntfrsutl sets = Lists replica sets

Ntfrsutl DS = FRS’s view of the DS

Can execute remotely:

Ntfrsutl sets DC1

Page 121: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Group Policy Troubleshooting

Page 122: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Group Policy Troubleshooting Basics

Policy isn’t getting applied

• Set something easy – Admin Templates

– User Settings: Log off/on

– Computer Settings: Reboot

• Client-side extensions act as separate policies – debug separately from Admin Templates

– Folder Redirection

– Scripts

– Disk Quotas

– Security

– IE Branding

– EFS Recovery

– IPSec

– Application Management

Page 123: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Group Policy Troubleshooting Basics

Policy applied, but settings not effective.

• Userenv.log (verbose) Q221833

• Set Diagnostic logging Q186454

HKLM\software\Microsoft\Windows NT\CurrentVersion\Diagnostics

Value: RunDiagnosticLoggingGroupPolicy

Value Type: REG_DWORD

Value Data: 3 (range 0-5; 0 = off)

– Change One setting in GPO

– Logoff/on or reboot

– Verbose info in Application log

– Lists all registry settings applied to user

– Turn it off afterward – fills the event log fast!

Page 124: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Gpresult.exe

Resource Kit command-line utility.

Reports applied policy for user, computer.

• DN

• Security groups

Verbose mode – gpresult /v

• Registry settings

• Computer: Client-side extensions.

WATCH:

• Logon server.

• Cached policy on client may mask solution.

• Refresh Policy – make sure it’s applied.

Page 125: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

GPOtool

Resource Kit command-line utility.

Run on DC only.

• Version Comparison: AD vs. Sysvol.

– AD version set immediately on change.

– Sysvol version set after FRS Replication.

• Friendly name / GUID association

Policy {08FAB736-9628-41D5-B5A8-37A0F98D7E43}

Policy OK

Details:

------------------------------------------------------------

DC: Qtest-DC2.qtest.cpqcorp.net

Friendly name: Folder Redirection Policy

Page 126: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Solving Version Mismatch

Small mismatch is normal.

• After change until FRS Replication completes.

• Be patient – see if it resolves.

Big mismatch is bad.

• Prevents application of policy.

• Unreplicated changes.

• Manually set FRS version = AD version.

– %windir%\sysvol\sysvol\<domain>\policies\{guid}\gpt.ini

– Will lose changes.
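Before editing gpt.ini by hand, it helps to read the Sysvol-side version and compare it with the AD-side number. The sketch below parses the text of a gpt.ini file and splits the version into its user half (upper 16 bits) and computer half (lower 16 bits). The function names are assumptions, and the AD-side value is expected to be supplied by whatever tool you used to read it.

```python
import configparser
from typing import Tuple


def read_gpt_version(gpt_ini_text: str) -> int:
    """Return the Version value from the [General] section of gpt.ini text."""
    parser = configparser.ConfigParser()
    parser.read_string(gpt_ini_text)
    return parser.getint("General", "Version")


def split_version(version: int) -> Tuple[int, int]:
    """Split a GPO version into (user revisions, computer revisions)."""
    return version >> 16, version & 0xFFFF


def versions_match(ad_version: int, sysvol_version: int) -> bool:
    """True when the AD and Sysvol copies of the GPO agree."""
    return ad_version == sysvol_version


if __name__ == "__main__":
    sysvol = read_gpt_version("[General]\nVersion=131075\n")
    print(sysvol, split_version(sysvol), versions_match(131076, sysvol))
```

Splitting the number shows whether the mismatch is on the user or the computer side of the policy, which narrows down which changes would be lost.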

Page 127: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Resetting Default Domain Policy or Default DC Policy

These policies always have the same GUIDs.

• Default Domain: {31B2F340-016D-11D2-945F-00C04FB984F9}

• Default DC: {6AC1786C-016F-11D2-945F-00C04FB984F9}

Changes are a mess – need to restore default.

To restore security defaults only, import the BasicDC.inf template (Q258595).

If settings are hosed, copy an original copy of the policy to winnt\sysvol\sysvol\<domain>\policies.

• Copying policies only supported for these two cases.

• Other policies will have different GUIDs.

• Can’t copy other policies from one forest to another for debug.

Page 128: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

How to copy the Default Domain and Default DC policy

1. Get a copy of a clean, default policy folder.

– Restore the policy folder (GUID) from backup.

– Create new domain and copy the GUID folder from that machine.

– Don’t zip it.

2. Delete existing policy.

3. Wait for replication.

4. Copy new policy folder to winnt\sysvol\sysvol\<domain>\policies.

5. Wait for replication.

6. Run GPOtool to make sure it shows up on all DCs.

Page 129: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Unable to Edit Group Policy

Group policy changed on PDC by default.

If PDC is not available.

• Dialog: Change on any DC, current DC or not.

• Error: Unable to contact Domain (no DC).

Solution: Transfer or seize the PDC role to another DC.

Can set policy to NOT use PDC …. Don’t!

Page 130: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Using Userenv.log to solve Group Policy problems

Turn on Verbose Logging Q221833

interpreting group policy information in userenv.log

Page 131: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Debugging Logon Scripts (script doesn’t apply)

Configure it via group policy snap-in.

Make sure policy is applied.

• Set a desktop setting.

• Use Gpresult /v.

• Enable verbose logging for Userenv.log.

Turn on “Run logon scripts visible.”

Create simple logon script as a .bat file to make sure it’s not the script failing.

Example: Using Userenv.log to find script errors.

Page 132: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Can’t find FSMO Role Holder

Problem: Operation trying to contact a FSMO role holder – PDC Emulator or…?

• Can ping by name – seems to be ok

• Operation can’t find it

Solution:

• Find out who has that role:

netdom query fsmo

(returns a quick list)

• Transfer the role to a local DC

Page 133: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Group Policy Refresh Anomaly

Users complain of a 5-25 second “hang” intermittently in any application – Outlook, Word, 3rd party apps. Keystrokes are buffered and they can continue to work

Noticed direct correlation between the 1704 events (GP Refresh) and the “hang”.

Changing the refresh interval via group policy changed the frequency of the “hang”.

Page 134: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Group Policy Refresh Anomaly

Cause: SceCli applies group policy every 16 hrs (default) if no gpo changes have occurred. (DCs are every 5 minutes)

• Broadcasts WM_SETTINGCHANGE to all top level windows

• Wakes up sleeping processes causing massive paging in/out of memory – causing hangs

• More pronounced on “slower” computers

Solution: Configure Policy Refresh Interval in Group Policy so refresh occurs every 12 hrs at midnight/noon so users don’t notice it.

Page 135: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Account Lockout

Background

Finding locked out user accounts

Client Bugs and Fixes

Server Bugs and Fixes

Resolution and Futures

Page 136: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Lockout Reasons & Options

Prevent spoofing or hijacking account

Optional event logging in Audit Policy

Account Lockout Options

• Timed lockout

– Account enabled after admin defined time

• Hard lockout

– Account disabled until reset by admin

• Lockout policy defined in group policy

– Single lockout and password policy per domain

– Location: default domain policy

Page 137: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Account Lockout on DC’s

Each DC records # of bad password attempts

BDCs check the PDC for the latest password

All Bad password attempts seen by PDC

• PDC always 1st to lock out account

• PDC urgently replicates lockout when threshold reached

• Bad password attempts not replicated by DC

BadPasswordCount reset to 0 on 1st good password

Page 138: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

PDC chaining operations

If BDC fails authentication with:

• STATUS_WRONG_PASSWORD

• STATUS_PASSWORD_EXPIRED

• STATUS_PASSWORD_MUST_CHANGE

• STATUS_ACCOUNT_LOCKED_OUT

• Referred to as “BadPasswordStatus”

BDC chains authentication to PDC

• Return status from PDC if status = success or listed above

• Otherwise, ignore PDC status and use local status

Exception to PDC chaining

• AvoidPDCOnWan enabled and PDC in remote site (Q225511)

• 10 “BadPasswordStatus” events logged in 10 minutes

– NegativeCache enhancement Q263821

– Cache reset after good password entered
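The chaining rules on this slide can be read as a small decision function. The sketch below models only what the slide states; the function name, the `ask_pdc` callback and the `chaining_allowed` switch (standing in for the AvoidPDCOnWan and negative-cache exceptions) are assumptions, and status values are plain strings rather than real NTSTATUS codes.

```python
from typing import Callable

SUCCESS = "STATUS_SUCCESS"

# The statuses the slide groups together as "BadPasswordStatus".
BAD_PASSWORD_STATUSES = {
    "STATUS_WRONG_PASSWORD",
    "STATUS_PASSWORD_EXPIRED",
    "STATUS_PASSWORD_MUST_CHANGE",
    "STATUS_ACCOUNT_LOCKED_OUT",
}


def final_status(local_status: str, ask_pdc: Callable[[], str],
                 chaining_allowed: bool = True) -> str:
    """Return the status the BDC reports after optional chaining to the PDC."""
    if local_status not in BAD_PASSWORD_STATUSES or not chaining_allowed:
        return local_status  # nothing to chain, or an exception applies
    pdc_status = ask_pdc()
    if pdc_status == SUCCESS or pdc_status in BAD_PASSWORD_STATUSES:
        return pdc_status
    return local_status  # ignore any other PDC answer


if __name__ == "__main__":
    # The PDC already has the new password, so the logon succeeds.
    print(final_status("STATUS_WRONG_PASSWORD", lambda: SUCCESS))
```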

Page 139: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Troubleshooting account lockouts

Your goal: Answer the 4 W’s

• Who, Where, When and Why

Environment setup

• Enable Auditing in domain policy

– Account Logon Events – Failure

– Account Management – Success

– Logon Events – Failure

– Security Event log on DC’s: 10K events + over-write

• Enable netlogon logging (ntlm clients)

– NLTEST /DBFLAG:2080FFFF (no reboot)

• Enable Kerberos Logging

– Q262177: Kerberos logging (kerb clients)

Page 140: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Account Lockout – Where

DC Resources

• NTLM Clients

– Search DC & CLIENT NETLOGON.LOG for lockouts

– 0xC000006A = bad password

– 0xC0000234 = account lockout

• NTLM + Kerberos Clients

– Search DS Event Logs

– Q230254, Q299475, Q273499 and Q301677 for description

– 644: NTLM + Kerberos Lockout Event

– 675: Kerberos bad password

– 681: NTLM bad password

– 529: Failed logon

– 531: Account disabled

Tools

• EVENTCOMB

• AL.EXE

• NETMON.EXE
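Searching NETLOGON.LOG for the two status codes can be sketched as a plain text scan. The codes and their meanings come from the slide; the function name is an assumption, and the sample lines in the usage example are invented for illustration and do not reproduce the real log format.

```python
from typing import Iterable, List, Tuple

# Status codes and meanings as listed on the slide.
LOCKOUT_CODES = {
    "0XC000006A": "bad password",
    "0XC0000234": "account lockout",
}


def scan_lines(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (line number, meaning, line text) for every line with a known code."""
    hits: List[Tuple[int, str, str]] = []
    for number, line in enumerate(lines, start=1):
        upper = line.upper()
        for code, meaning in LOCKOUT_CODES.items():
            if code in upper:
                hits.append((number, meaning, line.rstrip()))
    return hits


if __name__ == "__main__":
    # Hypothetical sample lines, not real NETLOGON.LOG output.
    sample = [
        "logon of CORP\\jsmith returns 0xC000006A",
        "logon of CORP\\jdoe returns 0x0",
        "logon of CORP\\jsmith returns 0xC0000234",
    ]
    for hit in scan_lines(sample):
        print(hit)
```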

Page 141: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

EVENTCOMB

Page 142: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

AL.EXE

Page 143: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Account Lockout: Why

Attack, “Pilot Error” or Bug

• Wrong password entered, misconfigured service account

Scenario

• Account type: user, computer or service account

• Lockout trigger? (logon, drive access, following p/w change)

Drill Down: Look at time of day, pattern & frequency

• Process related lockouts

– Structured pattern

– Logged when users not present

– Look for: common services, applications, client configuration

• User related lockouts

– Random pattern

– Fewer events logged

– Look at: shortcuts, mapped drives, logon scripts, applications

Page 144: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Account Lockout – Client

Win9X

• Q278558: Access denied to a mapped drive after disconnect

• Q272594: Client can't log on after log off w/o reboot

• Q293793: VREDIR loses file tracking structures

• Q271496: One unsuccessful logon attempt triggers lockout (1:3)

– Net use + dsgetdc + logon attempt.

• Q266772: Logon fails if Unicode string password to NTLM SSPI

DS Client on Win95, Windows 98, 98 Second Ed

• DSCLIENT MUST be installed before any hotfixes!

– Q301344, Q283261

– DS Client lets WIN98 account lockout fixes work on Win95

Win2K

• Q275508: User locked when accessing home dir after changing p/w

• Hotfix or SP2

Windows XP

• None

Page 145: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Account Lockout: Server Fixes

Read server side KB articles

• Q287639: Win9x Clients Locked Out after unlock

– MSV1 package does password check against BDC with old password during 2nd phase of logon

• Q278299: Bad p/w count not reset to 0 (ntlm)

– Original hotfix had regression. Confirm latest version deployed.

• Q263821: Bad p/w count not reset to 0 (kerb)

• Q292573: DSA.MSC and ADSI may not use same DC to WinSERaid:16662 (post SP2 hotfix)

Resolution

• Windows 2000 DC’s: Install SP2 + Q314282

– Same QFE as lingering object and other good DC fixes

• Service Pack 3

Page 146: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

PDC FSMO Load Reduction

Windows 2000 domains are much larger than their NT 4 predecessors

• i.e. > 50,000 clients

NT 4 and WIN9X clients still deployed and target PDC only for updates

Windows 2000 / XP clients use Windows 2000 DCs in mixed mode domains (Q284937)

Older applications select PDC only rather than any DC

Applications may enumerate whole domain ( NT 4 usrmgr, srvmgr )

Result: PDC gets more load

Page 147: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Symptoms of Overload

High CPU utilization for long period

• Greater than 70%

• High average disk queue

– Disk queue > number spindles

• Timeout of requests

– Password changes
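The first two symptoms can be turned into simple threshold checks against counters you have already collected. This is a sketch only: the 70% figure and the "queue greater than spindles" rule come from the slide, while the function names and the choice to treat "long period" as "every sample in the window" are assumptions.

```python
from typing import Sequence

CPU_THRESHOLD_PERCENT = 70.0  # figure quoted on the slide


def sustained_high_cpu(samples: Sequence[float],
                       threshold: float = CPU_THRESHOLD_PERCENT) -> bool:
    """True when every CPU sample in a non-empty window exceeds the threshold."""
    return len(samples) > 0 and all(s > threshold for s in samples)


def disk_queue_overloaded(avg_queue_length: float, spindles: int) -> bool:
    """True when the average disk queue is longer than the number of spindles."""
    return avg_queue_length > spindles


if __name__ == "__main__":
    print(sustained_high_cpu([82.0, 91.5, 77.0]))
    print(disk_queue_overloaded(5.2, 4))
```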

Page 148: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Steps to Optimize PDC

Optimize hardware and software

Hide PDC from DNS clients

Implement WINS optimizations

Block down-level enumeration

PDC in dummy site

Page 149: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Optimize Hardware & Software

Run Windows 2000 Advanced Server with /3GB switch

• Enables ESE cache of 1.5 GB

4 Processor Server is optimal

2 GB RAM

Disk

• RAID 1 set for OS and Page File

• RAID 1 set for Log Files

• RAID 0+1 for NTDS.DIT and sysvol

Run only core DC services

Page 151: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Hiding Techniques (DNS)

Lower PDC SRV Priority

• Reduce chance of DS aware clients selecting PDC before other DCs

• HKLM\System\CurrentControlSet\Services\Netlogon\Parameters\LdapSrvPriority=1000

• Data type: Reg_DWORD

PDC only Site

• Clients will use it only as last resort

• Create a site-link to real site

Disable AutoSiteCoverage on PDC

• HKLM\System\CurrentControlSet\Services\Netlogon\Parameters\AutoSiteCoverage=0

Page 152: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Hiding Techniques (WINS)

Down-level clients locate DCs through 1C queries

WINS always adds PDC first in 1C list

Remove PDC from top of list (SP2) Q269424

– HKLM\System\CCS\Services\WINS\Parameters

– Value name: Add1Bto1CQueries

– Data type: Reg_DWORD

– Value data: 0 = disabled, 1 = Enabled (default)

Randomize 1C list for general load balancing

– HKLM\System\CCS\Services\WINS\Parameters

– Value name: Randomize1cList

– Data type: Reg_DWORD

– Value data: 0 = disabled, 1 = Enabled

– Q231305 (NT4 SP4 and later)

Page 153: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Block Enumeration

Old (non DS enabled) applications often call SAM APIs to enumerate entire domain

Hard to control

Block unauthorized users from seeing more than 100 objects per call

• New access control right determines access

• HKLM\System\CCS\Control\Lsa\SamDoExtendedEnumerationAccessCheck=1

• Q268339 

Page 154: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Misc. – Server Applications

Server based applications can create frequent changes in the directory

• Agent based systems

– Create and delete accounts

– Grant accounts rights in the domain

Changes create replication

• AD replication for frequent group changes

• FRS changes for policy changes

Apply SMS hot fixes

• Q311127, Q278345

• Read articles, configuration necessary

Page 155: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Distributed Link Tracking

Purpose

• Used to track moves of linked files across volumes and servers (shell shortcuts)

• Uses AD objects to track files and volumes

Objects stored in DS

• linkTrackVolEntry object for each NTFS volume in the domain

• linkTrackOMTEntry created for each linked item that is moved

• Clients query service when a shell shortcut or OLE link can’t be resolved

Clients refresh links every 30 days

DCs scavenge objects older than 90 days

Page 156: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Distributed Link Tracking

DLT is an optional service

• Enabled by default

Typically not included in DS capacity planning

Best Practices

• Disable on all DCs

– Reduces AD replication traffic

– Reduces AD database size

• Use Group Policy to disable DLT server service on DCs

• Remove objects from DS

– Use staggered approach

• Q312403

Page 157: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

DC/GC Promotion Consideration

DC Promotion / Demotion

Process to cleanup after failed promotion

GC Promotion

GC Demotion

Page 158: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

DC Promotion / Demotion

Create proper sites beforehand

Failed promotion or removing server

• Manually clean out metadata from any failed attempt

– When replacing a failed DC

– When a DCPROMO has failed

– To clean metadata

– Use NTDSUTIL

– FRS member / subscriber objects

– Machine account in domain

• Allow replication to all DCs before promoting again

Page 159: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

GC Promotion

First GC in site may go online before all partitions are replicated

• Default: GC will advertise after all partitions in site replicate

• Exchange may use GC before ready

• Mail may bounce

Best Practice

• Stop Netlogon

• Mark DC as GC

• Use repadmin to monitor success

• Start Netlogon once all NCs have replicated

SP3 will wait for all partitions to replicate before advertising

Page 160: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

GC Demotion

GC removal requires time for object removal

The KCC removes 500 objects per default 15 min cycle

Best Practice

• Monitor for event 1069 to record progress

• Forced GC removal when needed (Q297935)

– Remove each partition with repadmin

– repadmin /delete DC=globalit,DC=unity,DC=com %destgc% /nosource
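The removal rate above (500 objects per 15-minute cycle) gives a quick estimate of how long a GC demotion will take. This is a back-of-the-envelope sketch; the object count in the example is hypothetical.

```python
import math

OBJECTS_PER_CYCLE = 500   # objects removed per KCC cycle (from the slide)
CYCLE_MINUTES = 15        # default KCC cycle length (from the slide)


def gc_removal_estimate(object_count: int) -> tuple:
    """Return (cycles, minutes) needed to remove object_count objects."""
    cycles = math.ceil(object_count / OBJECTS_PER_CYCLE)
    return cycles, cycles * CYCLE_MINUTES


# Hypothetical read-only partition holding 100,000 objects
cycles, minutes = gc_removal_estimate(100_000)
hours = minutes / 60
print(cycles, minutes, hours)
```

At this rate a large partition can take days to clear, which is why the forced removal with `repadmin /delete` is listed as an option.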

Page 161: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Container Inheritable ACE’s

ACE that applies to either all objects or objects of a specific class in a container

• Example: Delegate right to reset user passwords in one OU

Security Descriptor propagation copies ACE to all objects

• Makes access check very fast

– All information is on directory object

• Also class specific ACEs are copied to all objects

– Example: ACE used to delegate right to reset user passwords also copied to computer and container objects

Increases object size – database size

• Increase proportional to size of subtree

– If set on domain root: Highest impact

– If set on OU: Lower impact (depends on number of objects in OU)

• Low impact if set on schema or configuration container

SD propagation is asynchronous

• Takes time to propagate (i.e., 3 hours in 50,000 user domain)

Page 162: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Container Inheritable ACEs: Best Practices

Don’t add container inheritable ACEs to domain root

Add on OUs as appropriate

• Best Practice Documentation recommends OUs for

– Users

– Groups

– Computers

• Container inheritable ACEs on these OUs have small impact only

Watch SD propagator events

• SD propagation running: 1257 (Level 2)

• SD propagation report (objects touched): 1258 (Level 2)

• SD propagation terminated abnormally: 1262 (Level 0)

Always leave sufficient disk space on database partition

• 20% of database size, at least 500 MB

• Monitor!

Test ACL changes in lab or pilot domain to bracket size increase
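The free-space guideline above (20% of database size, at least 500 MB) can be checked with a small calculation. This is a minimal sketch; the function names and the sample sizes are hypothetical.

```python
MIN_FREE_MB = 500  # absolute floor from the guideline


def required_free_mb(db_size_mb: float) -> float:
    """Free space to keep on the database partition: 20% of DB size, min 500 MB."""
    return max(db_size_mb / 5, MIN_FREE_MB)


def has_headroom(db_size_mb: float, free_mb: float) -> bool:
    """True if the partition currently meets the guideline."""
    return free_mb >= required_free_mb(db_size_mb)


small = required_free_mb(1_000)    # 20% would be 200 MB, so the floor applies
large = required_free_mb(10_000)   # 20% of 10 GB
print(small, large, has_headroom(10_000, 1_500))
```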

Page 163: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Container Inheritable ACEs: The Future

Windows .NET will have single-instance store for Security Descriptors

• Objects have links to security descriptors

• If container inheritable ACE changes, only one SD changes

– No impact on disk size

Does not require .NET only forest

• SD propagation happens on local DC

• Transparent to other DCs

• Feature available immediately

Monitor SD prop events after upgrading a DC

• SD propagator will build single instance store after the domain controller boots .NET for the first time

Database will shrink after OS upgrade

• Need to off-line defrag database to see changes

Page 164: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Forest Recovery

Imagine the unthinkable

• All domain controllers crash and won’t reboot

• Data corruption replicates through the forest

• Schema becomes unavailable

• Somebody made changes to the schema that prevent standard applications from installing

• Malicious administrator performs irreparable damage to the schema that replicates through the forest

• You lose your root domain

• You win the lottery

So far, this has never happened

• But you want to be prepared

Page 165: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Forest Recovery: Rolling back in time

[Figure: timeline with axis labels "Time" and "Changes", a series of "Backup" markers, and callouts "Catastrophic Event", "Identified Root Cause", and "Restore – Changes lost"]

Page 166: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Forest Business Recovery: High Level Steps

Shut down all domain controllers in forest

In each domain

• Restore one DC from good backup tape

• Re-install OS on all other domain controllers

• Re-promote all other domain controllers

Start with root domain first

Page 167: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Forest Recovery

Shut down all DCs

Restore one DC per domain (off-network)

Break replication

Seize FSMO roles

Disable GC service

Increase RID by 100,000

Bring restored DCs back on the network

Enable GC on at least one root DC
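The "Increase RID by 100,000" step can be illustrated with a little arithmetic. The sketch below assumes the RID pool is the 64-bit `rIDAvailablePool` value on the domain's RID Manager object, with the next RID to allocate in the low 32 bits and the allocation ceiling in the high 32 bits; the slide does not spell this out. The sample value is illustrative only.

```python
RID_BUMP = 100_000  # increment named on the slide


def raise_rid_pool(rid_available_pool: int, bump: int = RID_BUMP) -> int:
    """Raise the low part (next RID) of a 64-bit pool value by `bump`."""
    high = rid_available_pool >> 32          # assumed: allocation ceiling
    low = rid_available_pool & 0xFFFFFFFF    # assumed: next RID to hand out
    new_low = low + bump
    if new_low > high:
        raise ValueError("bump would exceed the RID ceiling")
    return (high << 32) | new_low


# Illustrative value: ceiling 1073741823, next RID 2100
old_value = 4611686014132422708
new_value = raise_rid_pool(old_value)
print(new_value, new_value & 0xFFFFFFFF)
```

Raising the pool guards against the restored DC reissuing RIDs that were already handed out after the backup was taken.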

Page 168: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Forest Recovery

Re-install OS on all other DCs

Promote all other DCs

Enable GC service as needed

Move FSMOs as needed

Page 169: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Forest Recovery

Detailed steps available very soon in white paper on microsoft.com

• Best Practice for Recovering your Active Directory Forest

Page 170: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

FRS Concepts revisited

Objects in DS

• Members, Subscribers, Conn. objects, filters

• Depends on AD replication

• Determines partners and schedule

NTFS USN Journal

• Used by FRS to track changes to NTFS volumes

Staging File and Directory

• Rename safe

• Compression support

Database

• Record of incoming, outgoing & existing files

Page 171: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

FRS Replication Operation

Create / Modify file

NTFS Drive

Write OB Log

Write entry in FRS ID Table

Request change

Write to Inbound and ID log

Filter out unwanted files

Age Cache waits 3s

Build staging file

Replica copies file to staging dir

Write to OB log for other replicas

Copy file into Pre-install area

Rename + move file to final location

Send change order to partner

NTFS Drive

FRS learns of file changes from the NTFS “USN Change journal”

Page 172: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Journal Wraps / Staging backlog

NTFS USN Journal is a fixed-size log of file changes

• FRS Service must run to keep up with these changes

• Last ∆ in FRS DB must exist in NTFS journal

– If not, FRS cannot know all changes. Called ‘journal wrap’

• Resolution

– Keep Service running (especially during bulk modifications)

– Increase size of USN journal (automatic in SP3 rollup)

Staging File backlog

• Before SP3, staging files stored until all direct partners receive the staged files

– Associated with connections

• Common causes of backlogs:

– Offline downstream partners

– Full SYNCS by Administrators or applications

– Antivirus, Disk Optimizers, File system policy

• Sharing violations / Move-In problems
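The journal wrap condition described above comes down to one comparison: the last change FRS recorded must still be present in the fixed-size journal. This is a simplified model that uses plain integers for USNs; the function and parameter names are hypothetical and do not correspond to any FRS or NTFS API.

```python
def journal_wrapped(last_usn_in_frs_db: int, oldest_usn_in_journal: int) -> bool:
    """True if the journal has discarded entries FRS has not yet processed."""
    # If the oldest record still in the journal is newer than the last one
    # FRS saw, the changes in between are gone and FRS cannot know them.
    return last_usn_in_frs_db < oldest_usn_in_journal


# Service kept running: FRS is ahead of the oldest retained record
running = journal_wrapped(last_usn_in_frs_db=5_000, oldest_usn_in_journal=1_200)

# Service stopped during a bulk modification: the journal overwrote old records
stopped = journal_wrapped(last_usn_in_frs_db=5_000, oldest_usn_in_journal=9_800)
print(running, stopped)
```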

Page 173: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Reconciliation & Morphed Directories

Files: Last-writer wins

• All change orders have event times (UTC)

• Event time of CO compared to ID Table

– Event times differ by more than 30 minutes: last writer wins

– Event times differ by less than 30 minutes: highest version wins

Folders: Last-writer wins

• Conflicting change gets morphed name

– Preserves files associated with directory

– First-writer wins for name conflicts of folders

• Causes

– BURFLAGS abuse

– Conflicting creates on replication failure
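The file reconciliation rule above can be sketched as a single decision function. It assumes the 30-minute test applies to the difference between the change order's event time and the event time stored in the ID table. The function name is hypothetical, and how a tie is broken is an assumption (here the existing file is kept).

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)


def incoming_wins(co_time: datetime, co_version: int,
                  id_time: datetime, id_version: int) -> bool:
    """Decide whether an incoming change order replaces the local file."""
    if abs(co_time - id_time) > WINDOW:
        # Event times are far apart: last writer wins
        return co_time > id_time
    # Event times are close: the higher version number wins
    return co_version > id_version


local = datetime(2002, 10, 6, 12, 0)
later = incoming_wins(datetime(2002, 10, 6, 13, 0), 3, local, 7)   # 60 min newer
close = incoming_wins(datetime(2002, 10, 6, 12, 10), 3, local, 7)  # 10 min newer
print(later, close)
```

In the second call the incoming change is newer by the clock but loses because its version number is lower.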

Page 174: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

FRS Enhancements (Q319473)

QFE roll-up of coming Service Pack 3 changes

Increases NTFS USN journal: 128 MB

Dynamic staging file relocation

LRU staging files deleted: 60 / 90 rule

Staging files for offline partners deleted

SYSTEM = Full Control / NTFS bug

Duplicate changes not sent on wire + event

Office XP (Excel) data deletion fix
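The "60 / 90 rule" is not spelled out on the slide. The sketch below assumes the usual reading: once the staging area reaches 90% of its limit, the least recently used staging files are deleted until usage falls to 60%. The function name, the tuple layout, and the sample files are all hypothetical.

```python
def trim_staging(files, limit, high_pct=90, low_pct=60):
    """Return names of staging files to delete, least recently used first.

    files: list of (name, size, last_access) tuples; limit uses the same
    unit as size.
    """
    used = sum(size for _, size, _ in files)
    if used * 100 < high_pct * limit:
        return []  # below the high-water mark, nothing to do
    deleted = []
    for name, size, _ in sorted(files, key=lambda f: f[2]):
        if used * 100 <= low_pct * limit:
            break  # reached the low-water mark
        used -= size
        deleted.append(name)
    return deleted


# Hypothetical staging area with a limit of 100 units, currently 95% full
staged = [("c.stg", 35, 3), ("a.stg", 30, 1), ("b.stg", 30, 2)]
removed = trim_staging(staged, limit=100)
print(removed)
```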

Page 175: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Topology Enhancements

DFSGUI from .NET Server

• Runs on XP clients in Windows 2000 domains

• Available on microsoft.com now: Q304718

New topology options

• Full Mesh, Ring, Simple Hub & Spoke

• Custom Topologies

• Connection Tuning

– Enable / disable individual connections

– Change orders are associated with connections

– Disabling connections deletes associated backlog

Connection Priority (may pull this)

• Bit on options attribute of connection object

• Defines partners used during initial / recovery sync

– High: “Must” source all connections in class

– Medium: Source from at least 1 connection in class

– Low: “best effort” sync
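The three named topologies differ mainly in how many connections they create. The sketch below models each connection as a one-way `(source, destination)` pair and assumes every link is replicated in both directions. The server names are hypothetical, and this is not how DFSGUI itself builds connection objects.

```python
from itertools import permutations


def full_mesh(members):
    """Every member connects to every other member."""
    return list(permutations(members, 2))


def ring(members):
    """Each member connects to its two neighbours (needs 3+ members)."""
    conns = []
    for i, a in enumerate(members):
        b = members[(i + 1) % len(members)]
        conns += [(a, b), (b, a)]
    return conns


def hub_and_spoke(hub, spokes):
    """Every spoke connects only to the hub."""
    conns = []
    for s in spokes:
        conns += [(hub, s), (s, hub)]
    return conns


servers = ["HUB1", "BR1", "BR2", "BR3"]  # hypothetical replica members
counts = (len(full_mesh(servers)), len(ring(servers)),
          len(hub_and_spoke(servers[0], servers[1:])))
print(counts)
```

Full mesh grows as n(n-1) connections, which is one reason to prefer the ring or hub-and-spoke layouts as member counts rise.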

Page 176: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

FRS best practices

Run Q307319 + new NTFS.SYS

Keep service running

• Avoids journal wraps

Join empty replica sets

Don’t place DFS targets on OS partition

DFS: enable replication on child links

• Targets can be taken offline

• Incremental sourcing & advertisement of data

• Replica set specific burflags

Properly size staging dir

• 128 largest files + 50% or 650 MB minimum

Don’t delete files from staging directory

• Change orders, # of VV joins, file size
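The staging directory sizing guideline above can be turned into a quick calculation. The sketch reads it as "the combined size of the 128 largest files plus 50%, but never less than 650 MB"; that reading is an assumption, and the file sizes in the example are hypothetical.

```python
def staging_size_mb(file_sizes_mb, top_n=128, floor_mb=650):
    """Suggested staging directory size in MB for a replica set."""
    largest = sorted(file_sizes_mb, reverse=True)[:top_n]
    # Sum of the largest files plus 50% headroom, with a fixed minimum
    return max(sum(largest) * 1.5, floor_mb)


# Hypothetical replica set: 200 files of 10 MB each
big_set = staging_size_mb([10] * 200)

# Hypothetical small replica set: the 650 MB minimum applies
small_set = staging_size_mb([1, 2, 3])
print(big_set, small_set)
```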

Page 177: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

FRS best practices

Topology management

• No full mesh

• SYSVOL: requires 1 in / outbound CO

Forceful deletion of FRS members

• Delete member and subscriber objects

Page 178: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Tools

NTFRSUTL

• NTFRSUTL DS

– Repadmin /showconn for FRS

– DS Object inventory + topology review

• NTFRSUTL SETS

– Repadmin /showreps for FRS

– Sync status of downstream partners

• NTFRSUTL INLOG | OUTLOG | IDTABLE

– Inbound + outbound changes + tree inventory

Debug Logs: %systemroot%\debug\ntfrs_*.log

• Two way conversation between partners

Page 179: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

Summary

All deployments should run SP2

Deploy SP3 when available

Q314282 provides roll-up fix for many issues

• Lingering objects

• Account lockouts

• PDC overload situations

Monitor Active Directory

Page 180: Advanced Active Directory Design and Troubleshooting Ed Whittington Principal Software Engineer

New Documentation

Available on microsoft.com

• Best Practices for Active Directory Delegation

– http://www.microsoft.com/windows2000/techinfo/planning/activedirectory/addeladmin.asp

Coming soon

• Active Directory Monitoring Guidelines and Key Indicators

• Active Directory Forest Recovery

Eventcomb

– http://download.microsoft.com/download/win2000adserv/secops/RTM/NT5/EN-US/SecOps.exe