Exchange Storage for Insiders


[Chart: HDD Capacity and Areal Density Forecast 2012-2017. Series: 3.5” HDD capacity in GB (axis up to 16,000); areal density required @ 7 platters (data label 1,429); areal density GB/sq” (PMR), with data labels 620 (2012), 700 (2013), 820 (2014), 920 (2015), 1,050 (2016)]
Perpendicular Magnetic Recording (PMR)
• Perpendicular magnetic recording (PMR) technology is the current HDD technology
• PMR has reached the upper plateau (1.2TB/platter). PMR assist technologies will be necessary to sustain a 15–20% year-over-year production areal density growth rate
[Chart, 2017 data labels: 1,200 (PMR areal density); 2,286 (required @ 7 platters)]
SMR & HAMR intended to close the Areal Density gap
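The gap can be put into numbers with a little arithmetic. The sketch below is an interpretation, not something the slide states: it assumes the "required @ 7 platters" series is simply drive capacity divided by seven platters, and it takes the PMR labels 620 (2012) and 1,200 (2017) from the forecast chart to estimate the implied compound annual growth rate. The function names are illustrative.

```python
def required_gb_per_platter(capacity_gb: float, platters: int = 7) -> float:
    """Capacity each platter must hold for a drive of the given size (assumed model)."""
    return capacity_gb / platters


def compound_annual_growth(start: float, end: float, years: int) -> float:
    """Average year-over-year growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / years) - 1


# Chart labels: 16,000 GB drive in 2017, PMR at 620 (2012) and 1,200 (2017).
needed_2017 = required_gb_per_platter(16_000)
pmr_growth = compound_annual_growth(620, 1_200, years=5)
shortfall = needed_2017 - 1_200

print(f"needed per platter in 2017: {needed_2017:.0f}")
print(f"PMR growth 2012-2017: {pmr_growth:.1%} per year")
print(f"shortfall versus PMR forecast: {shortfall:.0f}")
```

Under these assumptions the forecast PMR growth comes out a little under the 15–20% range quoted for sustained production growth, which is consistent with the slide's point that assist technologies are needed.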
Shingled Magnetic Recording (SMR)
• Shingled Magnetic Recording is an areal density “assist” technology that extends the capacity (30%) of existing PMR heads
• Breaks the performance characteristics of traditional databases (ESE/SQL): -80% IOPS based on Jetstress testing
Heat Assisted Magnetic Recording (HAMR)
• Manufacturing complexities mean HAMR is still a ways out
In a shingled write, the data tracks are written in a particular direction radially, and are only written once with a write head wider than the track pitch.
[Diagram: conventional versus shingled recording. Labels: head, head motion, write corner, cross track, down track, progressive writes, scans, track pitch]
• Conventional Recording uses a track pitch that keeps data separate.
• Shingle Write Recording overlaps tracks, allowing for a narrower track pitch. Each track is only written once until any remaining valid data is moved.
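Why overlapping tracks hurt random-write workloads can be illustrated with a toy model. The sketch below assumes tracks are grouped into bands and that updating one track in place forces every later track in the same band to be rewritten, because each later track overlaps its predecessor. The band size and function names are hypothetical; real drives use firmware-specific layouts and caches.

```python
def tracks_rewritten(band_size: int, track_index: int) -> int:
    """Tracks written when one track in a shingled band is updated in place.

    Assumed model: the updated track and all tracks shingled on top of it
    (every later track in the band) must be written again.
    """
    if not 0 <= track_index < band_size:
        raise ValueError("track_index must fall inside the band")
    return band_size - track_index


def average_write_amplification(band_size: int) -> float:
    """Mean tracks written per single-track update, for uniformly random updates."""
    total = sum(tracks_rewritten(band_size, i) for i in range(band_size))
    return total / band_size


# A conventional drive writes exactly one track per update in this model.
print(average_write_amplification(8))
```

In this model the cost grows with band size, which is one plausible reading of why a random-I/O database workload loses most of its IOPS on shingled media.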
DATA-AT-REST PROTECTION
POWER CONSUMPTION
http://research.microsoft.com/~ranveer/docs/email-energy.pdf
Larger but not Faster Drives
• Exchange 2013 reduces IOPS by 50% or more compared with Exchange 2010 and supports multiple databases per volume to maximize available IOPS
• JBOD is still best for COGS (capacity + performance + cost)
• Continue riding the IOPS/capacity curve up to 10TB (PMR) drives
• SMR drives are not supported with Exchange 2013
Data-at-rest Protection
• Use BitLocker for data-at-rest protection with Exchange 2013
• Office 365 already uses BitLocker. From http://trustoffice365.com: “All email content is encrypted on disk using BitLocker Advanced Encryption Standard (AES) encryption. Protection covers all disks on mailbox servers and includes mailbox database files, mailbox transaction log files, search content index files, transport database files, transport transaction log files”
• Self-encrypting drives (SED) are promising but not viable in enterprise scenarios because the drives must be directly attached to an ATA channel
Power Consumption
• Don’t ignore the benefit of power-efficient storage technologies
• Helium technology is very promising for both areal density increase and power consumption decrease (2 watts compared to a conventional drive)
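The "larger but not faster" point can be made concrete with a small sizing sketch. It checks whether a drive runs out of capacity or IOPS first as drives grow. Every input below (drive IOPS, mailbox size, per-mailbox IOPS) is a hypothetical figure chosen for illustration, not a number from this deck or from Exchange sizing guidance.

```python
def mailboxes_per_drive(capacity_gb: int, drive_iops: int,
                        mailbox_gb: int, mailbox_milli_iops: int):
    """Return (mailbox count, limiting resource) for a single drive.

    Per-mailbox IOPS is passed in thousandths of an IOPS so the
    arithmetic stays in integers.
    """
    by_capacity = capacity_gb // mailbox_gb
    by_iops = (drive_iops * 1000) // mailbox_milli_iops
    limit = "capacity" if by_capacity <= by_iops else "iops"
    return min(by_capacity, by_iops), limit


# Hypothetical: 75 IOPS per spindle, 5 GB mailboxes, 0.05 IOPS per mailbox.
for capacity in (4_000, 6_000, 10_000):
    print(capacity, mailboxes_per_drive(capacity, 75, 5, 50))
```

With these made-up inputs the smaller drives are capacity-bound and the 10TB drive becomes IOPS-bound, which is the kind of crossover that lower per-mailbox IOPS pushes further out.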
Exchange 2013 SP1 ESE Enhancements
[Screenshot labels: Tasks; Keywords: Error, Performance, Trace, Transaction, Space, BF, IO, LOG, Task, BFRESMGR, JETTraceTag]
+30% improvement
Tool Enhancements
• ESEUTIL verbose output options
• Log Dump, Space Report, Table/Page/Node Dump
ESE_Trace
ESE_BF_Trace
ESE_Block_Trace
ESE_NewPage_Trace
ESE_ReadPage_Trace
ESE_PrereadPage_Trace
ESE_WritePage_Trace
ESE_EvictPage_Trace
ESE_TouchPage_Trace
ESE_LatchPage_Trace
ESE_DirtyPage_Trace
ESE_TransactionBegin_Trace
ESE_TransactionCommit_Trace
ESE_TransactionRollback_Trace
ESE_AllocExt_Trace
ESE_FreeExt_Trace
ESE_AllocPage_Trace
ESE_FreePage_Trace
ESE_IOREQHeapEnqueue_Trace
ESE_IOREQHeapDequeue_Trace
ESE_IOCompletion_Trace
ESE_LogStall_Trace
ESE_LogFlush_Trace
ESE_EventLogInfo_Trace
ESE_EventLogWarn_Trace
ESE_EventLogError_Trace
ESE_TimerQueueSchedule_Trace
ESE_TimerQueueRun_Trace
ESE_TimerQueueCancel_Trace
ESE_TimerTaskSchedule_Trace
ESE_TimerTaskRun_Trace
ESE_TimerTaskCancel_Trace
ESE_TaskManagerPost_Trace
ESE_TaskManagerRun_Trace
ESE_GPTaskManagerPost_Trace
ESE_GPTaskManagerRun_Trace
ESE_ThreadCreate_Trace
ESE_ThreadStart_Trace
ESE_VersionPage_Trace
ESE_VersionCopyPage_Trace
ESE_CacheResize_Trace
ESE_CacheLimitResize_Trace
ESE_CacheScavengeProgress_Trace
ESE_ApiCall_Trace
ESE_ResMgrInit_Trace
ESE_ResMgrTerm_Trace
ESE_CachePage_Trace
ESE_MarkPageAsSuperCold_Trace
ESE ETW Tracing in SP1
http://aka.ms/Jetstress2013
Storage Technologies and impact on Exchange
Future Exchange ESE Enhancements
http://blogs.msdn.com/b/jet/
[Diagram: store process architecture]
RPC client processes, each shown with AD Driver, XSO, and MAPI.Net/ExRPC layers:
• ActiveSync
• OWA
• EWS
• IMAP
• POP
• Mailbox Transport
• Search
• MOMT
• Event and Time-Based Assistants
Client-Specific Protocols (POP, IMAP, SMTP, HTTP, EAS, etc)
Local Inter-Process Communication is RPC-based
Store-side labels, in slide order: RopParser/RopHandler; Database and Process management; Repl (Active Manager); Store Service Process; Store Common Services (Mailbox State, Mailbox Tasks, Replid Guid Map, Extended Property Map); Mapi (context/session/mailbox/folder/message); LogicalDataModel; PhysicalAccess; Managed ESE; ESE.DLL; Store Worker Process (1 per database); Directory Services; Store worker RPC Server; Processes with MAPI and AdminRPC endpoints
Element (columns on the slide: E2007, E2010, E2013)
Physical Contiguity (ESE)
• Poor physical contiguity of leaf pages. Hence many small IOs (1 for each page)
• Excellent physical contiguity of leaf pages. So fewer, larger IOs, spanning N pages
Logical Contiguity (Store)
• Headers for each folder kept in a separate table. So many small IOs spread over many tables
• Folder, Message & Attachment table per mailbox. Message table consists of physical columns and property blobs. High message-per-page density means fewer large IOs to retrieve many messages for views
Temporal Contiguity (Views)
• All views and indexes updated each time a mail is delivered. So many small IOs spread over time
• Views and indexes updated only when they are accessed by the user. So fewer, larger IOs done together
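The physical contiguity row can be illustrated with a short sketch that coalesces a set of page numbers into read requests. It is a simplified model, not ESE's actual I/O scheduler: adjacent pages are assumed to merge into one IO with no size cap, and the function name is illustrative.

```python
def coalesce_reads(pages):
    """Merge page numbers into (first_page, page_count) read requests.

    Adjacent pages are assumed to be readable with a single IO.
    """
    runs = []
    for page in sorted(set(pages)):
        if runs and page == runs[-1][0] + runs[-1][1]:
            runs[-1][1] += 1          # extends the current contiguous run
        else:
            runs.append([page, 1])    # starts a new run, which means a new IO
    return [(start, count) for start, count in runs]


contiguous = coalesce_reads(range(100, 108))   # leaf pages laid out together
scattered = coalesce_reads([3, 50, 7, 90])     # leaf pages spread across the file
print(len(contiguous), "IO for 8 contiguous pages")
print(len(scattered), "IOs for 4 scattered pages")
```

Eight contiguous pages collapse into a single request, while four scattered pages still cost four requests, matching the "1 for each page" versus "spanning N pages" contrast.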
Tables optimized for sequential I/O
• Mailbox – MailboxNumber, Owner Info, Locale, LastLogonTime, etc
• DeliveredTo – duplicate delivery information
• Events – reliable events for assistants
• Folder – FolderId, Item Count, Size, PropertyBlob
• Message – DocumentId, MessageId, FolderId, PropertyBlob, OffPagePropertyBlob, MessageClass ordered by DateReceived
• Attachment – AttachmentId, Name, Size, CreationTime, etc
• PhysicalIndexes (partitioned by LogicalIndex)
IOPS reduction through Message Table Property Storage and Compression
• Compression more efficient when input contains more properties
• Blob size limited to eliminate LV tree access for core message properties
• Reading LV tree involves large sequential I/O (some fragmentation)
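The first bullet, that compression works better when the input contains more properties, can be demonstrated with a general-purpose compressor. The sketch uses Python's zlib purely as a stand-in; the store's own compression scheme is different and is not shown in this deck, and the property strings below are invented sample data.

```python
import zlib


def compressed_sizes(blobs):
    """Return (bytes when each blob is compressed alone, bytes when compressed together)."""
    separate = sum(len(zlib.compress(blob)) for blob in blobs)
    together = len(zlib.compress(b"".join(blobs)))
    return separate, together


# Invented, repetitive property values standing in for message properties.
properties = [
    f"Subject=Weekly status report {i};Sender=user{i}@contoso.com".encode()
    for i in range(50)
]

separate, together = compressed_sizes(properties)
print(f"compressed one by one: {separate} bytes")
print(f"compressed as one blob: {together} bytes")
```

Packing many properties into one blob lets the compressor reuse patterns across properties and pay its per-stream overhead once, which is the intuition behind storing properties together in a property blob.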
[Chart: DB IOPS/Mailbox for Exchange 2003, Exchange 2007, Exchange 2010, and Exchange 2013 (axis 0 to 1), annotated “+99% * Reduction!”]
DAG
       Server1   Server2   Server3   Server4
DB1    Active    Passive   Passive   Passive
DB2    Passive   Active    Passive   Passive
DB3    Passive   Passive   Active    Passive
DB4    Passive   Passive   Passive   Active
Exchange Team Blog
[Animated diagram: schema upgrade across a four-server DAG (Server1 to Server4, DB1 to DB4, one active copy per database); each server is labelled 0.121 capable, then 0.126 capable]
• First three servers updated: max supported version increased from 0.121 to 0.126; schema upgrade request for each local DB copy fails because other servers only support 0.121
• Fourth server updated: max supported version increased from 0.121 to 0.126; schema upgrade request for each local DB copy succeeds because all servers support 0.126
• DB4 schema upgrade succeeds during mount operation
• DB1, DB2, DB3 schema upgrade on next mount
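The upgrade gating shown in the animation can be sketched as a simple rule: a schema upgrade request succeeds only when every server in the DAG supports the target version. The code below models a rolling update of four servers under that rule. Versions are represented by their minor numbers (121 and 126), and the function names are illustrative rather than part of any Exchange API.

```python
def upgrade_allowed(target: int, server_max_versions: dict) -> bool:
    """A schema upgrade is allowed only if every server supports the target version."""
    return all(version >= target for version in server_max_versions.values())


def rolling_update(servers, old: int = 121, new: int = 126):
    """Update servers one at a time and record whether the upgrade request succeeds."""
    versions = {server: old for server in servers}
    outcomes = []
    for server in servers:
        versions[server] = new  # this server now supports the new schema
        outcomes.append((server, upgrade_allowed(new, versions)))
    return outcomes


for server, ok in rolling_update(["Server1", "Server2", "Server3", "Server4"]):
    print(server, "upgrade request", "succeeds" if ok else "fails")
```

Only the request made after the last server is updated succeeds, which matches the sequence on the slide: the database mounted at that point is upgraded immediately and the others are upgraded on their next mount.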
online documentation
• Previously required ESEUTIL /MS on offline DB copy
• Get-MailboxStatistics extended to display physical table sizes of each mailbox
• Get-MailboxStatistics with identity parameter will provide current stats
• Get-MailboxStatistics with database parameter will provide cached stats
[PS] D:\>Get-MailboxStatistics <MailboxId> | FL *size

TotalDeletedItemSize         : 3.545 GB (3,805,959,899 bytes)
TotalItemSize                : 45.73 GB (49,100,346,075 bytes)
MessageTableTotalSize        : 18.02 GB (19,344,031,744 bytes)
MessageTableAvailableSize    : 31.69 MB (33,226,752 bytes)
AttachmentTableTotalSize     : 8.071 GB (8,665,759,744 bytes)
AttachmentTableAvailableSize : 2.906 MB (3,047,424 bytes)
OtherTablesTotalSize         : 73.47 MB (77,037,568 bytes)
OtherTablesAvailableSize     : 736 KB (753,664 bytes)

Logical Mailbox Size = TotalItemSize + TotalDeletedItemSize
Physical Mailbox Size = MessageTableTotalSize + AttachmentTableTotalSize + OtherTablesTotalSize
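The two size formulas can be worked through with the byte counts from the sample output. This is a plain arithmetic sketch in Python rather than PowerShell; the dictionary keys mirror the property names shown, and the values are copied from the slide.

```python
# Byte counts copied from the sample Get-MailboxStatistics output.
sizes = {
    "TotalDeletedItemSize": 3_805_959_899,
    "TotalItemSize": 49_100_346_075,
    "MessageTableTotalSize": 19_344_031_744,
    "AttachmentTableTotalSize": 8_665_759_744,
    "OtherTablesTotalSize": 77_037_568,
}


def logical_size(s: dict) -> int:
    """Logical mailbox size = TotalItemSize + TotalDeletedItemSize."""
    return s["TotalItemSize"] + s["TotalDeletedItemSize"]


def physical_size(s: dict) -> int:
    """Physical mailbox size = message + attachment + other table sizes."""
    return (s["MessageTableTotalSize"]
            + s["AttachmentTableTotalSize"]
            + s["OtherTablesTotalSize"])


GIB = 2 ** 30
logical, physical = logical_size(sizes), physical_size(sizes)
print(f"logical:  {logical / GIB:.2f} GB")
print(f"physical: {physical / GIB:.2f} GB")
print(f"physical / logical: {physical / logical:.2f}")
```

For this sample mailbox the physical footprint is roughly half the logical size.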
Name: Trigger/Recovery sequence
• Database Availability: 16 logon failures in 22 minutes → Escalate
• Store service not running: 2 failures in 12 minutes → Restart service → Bugcheck → Escalate
• Database Free space: Free disk space drops below 10% → Escalate
• Store service process repeatedly crashing: 3 crashes for store service in 1 hour → Escalate
• Store worker process repeatedly crashing: 3 crashes for store worker (across all workers) in 1 hour → Escalate
• Percent RPC requests: 90% of available threads per database for 10 min → Database Failover → Escalate
• 70ms RPC latency: 70ms RPC Avg latency for 10 min → Determine impact scope → Escalate
• 150ms RPC latency: 150ms RPC Avg latency for 10 min → Determine impact scope → Escalate
• Mailbox quarantined: More than 1 mailbox quarantined on database for 10 minutes → Escalate
• Assistants service not running: 2 failures in 12 minutes → Restart service → Escalate
• Event assistants behind watermarks: Assistant watermark age exceeds 1 hour threshold for 4 hours → Escalate
• Number of active background tasks: Count of active background tasks exceeds threshold for 15 min → Escalate
• Database Repeatedly Mounting: 3 database mount attempts in 1 hour → Escalate
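Several of these monitors share the same shape: N failures inside a time window fire the trigger. A minimal sliding-window sketch of that pattern follows. It is not how the Exchange health monitors are implemented; the class name, the minute-based timestamps, and the decision to count failures exactly one window apart as inside the window are all assumptions.

```python
from collections import deque


class FailureWindow:
    """Fires when `threshold` failures fall within `window_minutes` of each other."""

    def __init__(self, threshold: int, window_minutes: int):
        self.threshold = threshold
        self.window = window_minutes
        self.events = deque()

    def record_failure(self, minute: int) -> bool:
        """Record a failure at the given minute and report whether the trigger fires."""
        self.events.append(minute)
        # Drop failures that are older than the window.
        while minute - self.events[0] > self.window:
            self.events.popleft()
        return len(self.events) >= self.threshold


# "2 failures in 12 minutes", as listed for the service-not-running monitors.
monitor = FailureWindow(threshold=2, window_minutes=12)
print(monitor.record_failure(0))    # first failure only
print(monitor.record_failure(10))   # second failure inside the window
```

A fired trigger would then start the recovery sequence listed for that monitor, for example restart the service first and escalate only if the condition persists.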