Transcript Document

Office 365 internet connection
planning and troubleshooting
Agenda
Unparalleled experience with datacenters
Huge Microsoft investments in infrastructure
Microsoft has invested $15 billion in infrastructure,
building over 100 datacenters and we are
constantly evaluating new locations
Our high-performing network is one of the
top 3 in the world with public peering in 23
countries with 2,000 ISPs.
Our Datacenters support over 20 Million
businesses and over 200 Online Services. Office
365 is sold in 131 markets, 43 languages, and 25
currencies.
Global Scale: DC growth
Microsoft has datacenter capacity around the world…and we’re growing
[Map: Office 365 DC locations and other Microsoft DC locations. Sites shown: Quincy, Cheyenne, Dublin, Chicago, Boydton, Amsterdam, Japan, Shanghai, Des Moines, Hong Kong, San Antonio, Singapore, Brazil, Australia]
35+ factors in site selection:
• Proximity to customers
• Energy, Fiber Infrastructure
• Skilled workforce
Scale:
• 1+ million servers
• 100+ datacenters in 40+ countries
Growing networks to cloud-scale
Geo-Redundant
Service/Application Design
Top 3 Most Connected
Networks in the World
DC-to-Internet Backbone
[Chart: Network Device Count Growth by fiscal year, FY10 to FY15]
• Peer with over 2000 ISPs globally
• Multiple Terabits
• Over 50 Points of Presence globally
• Global backbone connecting MS Datacenters to the Internet
DC-to-DC Backbone
• Multiple Terabits of Capacity
• Dark fiber based DC-DC backbone to enable high
bandwidth between Datacenters
Dark Fiber
• Tens of thousands of Route Miles of owned Dark
Fiber Backbone
• Million+ 10G DWDM Route Miles of capacity
deployed
Cache Node
• Hosting Services collocated at User location
(metro)
Edge Nodes
• Multiple Terabits of Edge Interconnect capacity
• Directly connected to more than 2000 networks
with over 4,000 connections
Decoupled DCs
FY09
• All nodes active, all nodes stateless
IT Capacity Unit = STAMP
• Separation of CPUs, Storage, SQL Services
• DC Capacity Unit or Workload Appliance
Office 365 Microsoft Edge is live in 14 locations
The green circles represent Microsoft Edge nodes live for the Office 365 Portal.
The yellow circles represent other Microsoft Edge nodes.
Located at the Bing CDNs around the world to
optimize access to Office 365 Services
Edge TCP Improvement
[Diagram: request timeline from the client to the San Antonio DC, with and without Edge, showing TCP connect, SSL connect, TTFB, TTLB and content time; the components are Client RTT, Server RTT and App Latency]
• Without Edge, the entire request travels over the ISP’s network
• With Edge, Microsoft’s network is brought closer to the user
• Reusing existing connections: Edge reuses existing connections to further improve performance
Network Peering
• Microsoft has more than 50 connection points to the Internet in 23 countries, with peering agreements with over 2000 ISPs
• Peering points are listed at: http://www.peeringdb.com/view.php?asn=8075
• ISPs and Network Operators are invited to peer for routing: http://microsoft.com/peering
Rich Clients for Sync and Offline Caching
Outlook, Outlook Web Access, OneDrive for Business
• Native clients on tablets, PCs, & desktops
• Native clients on mobile devices
• Browser-based clients, which also cache with HTML5
• Browser-based mobile clients
Office 365 offers a wide variety of options across devices for customers to access the service
OWA uses HTML 5 Offline Application Caching if enabled in Offline Settings
Access to Microsoft datacenters
[Diagram: the path between the service and the user]
• Server workloads
• Microsoft network
• Microsoft edge nodes
• Internet peering and routing
• Content delivery network
• Customer Internet connection
• Rich client applications
Methods for accurate bandwidth prediction
Measure current bandwidth usage with on-premises servers,
then figure changes due to Office 365
• Use Netmon on clients to baseline Outlook, Lync
• Use Perfmon on Exchange servers to baseline clients
• Use counters on edge to baseline SMTP
-or-
Use pilot to validate baselines or establish them
• Your pilot users will be indicative of velocity
• Linear growth for clients, bell curve for SMTP
• Use test mailboxes to baseline mailbox migrations
Understand how bandwidth is used
Exchange Online
• Similar to on-prem
• Cached mode reduces impact and provides for latency tolerance (<325 ms)
• Definite advantages to Outlook 2013 SP1
• Bursty, but latency tolerant
• Exchange Bandwidth Calculator
• http://gallery.technet.microsoft.com/Exchange-Client-Network-8af1bf00
• Netmon to baseline
• Start with at least 20% headroom
SharePoint Online
• Estimates rely upon on-prem baselines
• HTTPS views of webpages, uploads/downloads of content
• Document editing with Office Web Apps or Office
• Use your pilot to predict new requirements
• Without baseline, no real way to estimate
• Assumes enough time to order upgrades if needed
• Will ramp up as more content is loaded into SharePoint, MySites, OneDrive for Business
Lync Online
• Perfmon, Netmon, top, etc. to baseline
• IM is bursty, latency tolerant, and very small
• Voice uses RTAudio
• 50kbps low, 80kbps high, autodetermination
• Video depends upon resolution
• 280kbps low, 4000kbps high, dependent upon resolution
• Desktop sharing depends upon desktop resolution
• Peer-to-Peer versus Client-Server
• Evaluate options to conserve bandwidth
• http://www.microsoft.com/en-us/download/details.aspx?id=19011
• Lync Bandwidth Calculator
Other bandwidth uses with Office 365
Migrations
• After hours test mailbox migrations to baseline
• Bandwidth consumed
• Average migration rate
• For a migration event, the total bandwidth consumed is X, where X > (MB + MB1 + … + MBx)
• Will go away at end of migrations
• HTTPS, as inbound requests (downloads from the hybrids)
SMTP Traffic
• Baseline SMTP in and out at the existing edge
• Production will increase to 2*baseline in + 2*baseline out
• During coexistence, mail flow between on-premises and Office 365 will add additional traffic
Administration
• Traffic is negligible
• HTTPS traffic
• Latency tolerant
• Bursty and intermittent
• DirSync noticeable only on first run
  • HTTPS traffic
  • Only deltas after initial sync
  • Every 3 hours (+ runtime)
  • Variable based on deltas
  • More during the day than at night
Bandwidth
Planning Recommendations
1. Consider your current growth rates
2. Based on your current utilization, order larger circuits
3. Gather counters from edge
4. Use calculators (if you have baselines)
5. Test migrations for bandwidth use
6. Evaluate performance during pilot
7. Pilot impact negligible
8. Move email MX record to Office 365
9. Limiting streaming media will reduce bandwidth requirements
20% headroom is small. 33% is more common. 50% is conservative.
Buy lower CIR with higher burst to accommodate needs
Conservation Strategies
1. Evaluate remote offices
2. Order circuit upgrades or new egress if necessary
3. Use your 50GB mailboxes, not your archives
4. Use Outlook 2013 SP1 to take advantage of MAPI over HTTP and avoid RPC
5. Move files to SharePoint Online
6. Pilot Lync video before use
7. Any users who cannot use modern clients in cached mode should use OWA
8. Ensure best use of Internet egress and DNS
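The 20% / 33% / 50% headroom guidance can be turned into a quick sizing check. This is only a sketch: it assumes headroom means a percentage added on top of the measured peak, and the function name and the 40 Mbps peak are hypothetical, not figures from this deck.

```python
def required_circuit_mbps(peak_mbps: float, headroom_pct: float) -> float:
    """Circuit size needed so that the measured peak plus headroom fits.

    Assumption: headroom is a percentage added on top of the peak.
    """
    if peak_mbps < 0 or headroom_pct < 0:
        raise ValueError("peak and headroom must be non-negative")
    return peak_mbps * (1 + headroom_pct / 100)


if __name__ == "__main__":
    peak = 40.0  # hypothetical measured peak, in Mbps
    for pct in (20, 33, 50):  # the three headroom levels mentioned above
        print(f"{pct}% headroom: {required_circuit_mbps(peak, pct):.1f} Mbps")
```

If your organization defines headroom as the share of the circuit left unused, the formula changes to peak / (1 - headroom), so agree on the definition before ordering.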
Other configuration topics
WAN accelerators
Some customer scenarios have seen improved performance
Never a silver bullet and other measures should be investigated
Required to be disabled for debugging or support
Firewall IP address exceptions and URLs
IP Addresses not as quickly updated
Proxies
PAC files, CONNECT, and are they helping or hurting
Troubleshooting performance issues
• Be on site to start testing if possible
  • Expect 2 days of troubleshooting
  • Meet with the network team and ask about the topology
  • Have engineers on standby at other customer network locations
  • There are nuances you learn on site
  • You can do multiple tests simultaneously if on site
  • Network topology is easier to visualize when on site
• Identify the problem scenario
• Identify the customer topology and network service endpoints
• Collect data and test for each potential problem
  • Measuring Latency/Round Trip Time
  • TCP Window Scaling
  • GEO DNS issues
  • Proxy and Firewall port exhaustion
  • Packet Loss
  • Routing and Peering
  • TCP Idle time settings
  • Proxy Authentication
  • DNS performance
  • SACK and TCP MSS
  • Lync tests to Server
  • SharePoint customization performance
Identify the problem scenario
• Customer Scenario Questions
  • Test for impact to Office 365 client applications
  • Identify locations where users are impacted
  • Clarify if all users or just some are impacted
  • Did performance recently get much worse?
  • Has anything changed recently or more users added?
  • Do the users have the same issue at home?
  • Are non-Office 365 services performing poorly?
  • If not on site, capture a video of the problem with PSR.EXE
• Customer Topology Background
  • Identify the ISP and the Internet connection type
  • Get internal network and Internet proxy details
Identify the network service endpoints
 Exchange Online
 Dependent on tenant country and user location
 Use nslookup -type=A outlook.office365.com
 SharePoint Online
 Dependent on tenant country
 Use nslookup -type=A tenantname.sharepoint.com
 Lync Online
 Paired pools for each region by tenant country
 Amsterdam and Dublin
 Gainesville, VA and San Antonio, TX
 Hong Kong and Singapore
 Use the Lync test tool
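The nslookup checks above can also be scripted so that the same lookups are repeated from several sites and compared. A minimal sketch using only the Python standard library; the function names are hypothetical and tenantname.sharepoint.com is the placeholder from the slide, not a real tenant.

```python
import socket
from typing import Callable, Dict, Iterable, List


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the sorted, de-duplicated IPv4 addresses for hostname."""
    infos = socket.getaddrinfo(hostname, 443, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def resolve_endpoints(hostnames: Iterable[str],
                      resolver: Callable[[str], List[str]] = resolve_ipv4
                      ) -> Dict[str, List[str]]:
    """Resolve each hostname; record an empty list when the lookup fails."""
    results: Dict[str, List[str]] = {}
    for name in hostnames:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = []
    return results
```

Calling resolve_endpoints(["outlook.office365.com", "tenantname.sharepoint.com"]) from a client at each location gives a per-site list of addresses; this uses the system resolver, so results reflect whatever DNS server that client is configured to use.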
Measuring Latency (RTT)
• Use PsPing, a Microsoft Sysinternals tool
• It creates a TCP session to the IP address and port supplied, which works around any port blocking issues.
• If using a proxy, we need to measure to its address.
• This gives us the RTT to the perimeter of the customer’s network, which shows whether there is a problem inside the customer’s network.
• We then need a network trace, or a PsPing run taken on the proxy, to measure the remaining RTT from the perimeter to O365
• If a direct connection is available, we can use this to measure RTT all the way to Office 365
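Where PsPing is not available, timing a TCP handshake gives a comparable figure. The sketch below is not PsPing and makes no claim to match its output; it is an illustration of the same idea, with a hypothetical function name, that times how long a TCP connection takes to establish.

```python
import socket
import time
from typing import List


def tcp_connect_rtt_ms(host: str, port: int, count: int = 4,
                       timeout: float = 3.0) -> List[float]:
    """Time `count` TCP handshakes to host:port and return each in ms."""
    samples: List[float] = []
    for _ in range(count):
        start = time.perf_counter()
        # create_connection returns once the three-way handshake completes
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.perf_counter() - start
        samples.append(elapsed * 1000.0)
    return samples
```

Run it from the client against the proxy address and port to get the internal RTT, then from the proxy (or a direct connection) against the Office 365 endpoint on port 443 to get the external part. The first sample to a host name also includes DNS resolution time, so pass an IP address when only network RTT is wanted.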
PSPing Demo
Putting it all together
[Diagram: Client to Proxy 54.88ms; Proxy to Office 365 Datacentre 346ms]
Internal RTT (ms)    External RTT (ms)    Total RTT to O365 (ms)
54.88                346                  400.88
Here we can see clearly, the poor RTT is outside the customer’s environment, on the ISP link to Office 365. If this
RTT is unexpected, the customer can engage their ISP to investigate.
TCP Window Scaling
[Diagram: TCP data packets and TCP ACK exchange]
TCP Window Scaling enabled?    Maximum TCP receive buffer (Bytes)
No                             65535 (64 KB)
Yes                            1073725440 (1 GB)
Impact of TCP Window Scaling
Presuming a 1000 Mbps link, here is the maximum throughput we can get with TCP window scaling disabled and then with it enabled
Round Trip Time (ms)    Maximum Throughput (Mbit/sec) without scaling    Maximum Throughput (Mbit/sec) with scaling
300                     1.71                                             447.36
200                     2.56                                             655.32
100                     5.12                                             1310.64
50                      10.24                                            2684.16
25                      20.48                                            5368.32
10                      51.20                                            13420.80
5                       102.40                                           26841.60
1                       512.00                                           134208.00
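The figures above follow from the fact that a sender can have at most one receive window in flight per round trip, so throughput is bounded by window size divided by RTT. A small sketch of that arithmetic; the function name is hypothetical, and the 64,000 byte window is an assumption chosen because it reproduces the unscaled column shown here.

```python
def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-connection TCP throughput in Mbit/s.

    At most one receive window can be in flight per round trip,
    so throughput <= window / RTT.
    """
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    bits_per_round_trip = window_bytes * 8
    return bits_per_round_trip / (rtt_ms / 1000.0) / 1_000_000


if __name__ == "__main__":
    unscaled = 64_000  # bytes; assumed window for the unscaled column
    for rtt in (300, 100, 50, 1):
        print(f"{rtt:>3} ms: {max_throughput_mbps(unscaled, rtt):.2f} Mbit/s")
```

The bound applies per TCP connection and ignores loss and congestion control, so real transfers will usually be slower; the link speed caps the result wherever the computed value exceeds it.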
Impact of enabling this setting
Before & After correctly enabling TCP Window Scaling: download of a 14 MB PDF
[Chart: download time in seconds, Australia PC to Office 365 through the Australia proxy]
• Australia Proxy (incorrect Window Scaling settings): 507.0 seconds
• Australia Proxy with TCP Window Scaling enabled: 21.0 seconds
Australian PC downloading a large PDF before and after correctly setting the TCP Window Scaling on the Proxy
TCP Window Scaling
[Network trace of the TCP handshake between the proxy and SharePoint Online]
7692  12:28:03 14/03/2014  12:28:03.8450000  0.0000000  100.8450000  10.127.0.199  contoso47-48ipv4b.sharepointonline.com.akadns.net  TCP
  TCP: [Bad CheckSum]Flags=......S., SrcPort=43511, DstPort=HTTPS(443), PayloadLen=0, Seq=3807440828, Ack=0, Win=65535 ( Negotiating scale factor 0x0 ) = 65535 {TCP:818, IPv4:122}
7740  12:28:04 14/03/2014  12:28:04.1440000  0.2990000  101.1440000  contoso47-48ipv4b.sharepointonline.com.akadns.net  10.127.0.199  TCP
  TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=43511, PayloadLen=0, Seq=3293427307, Ack=3807440829, Win=4380 ( Negotiated scale factor 0x2 ) = 17520 {TCP:818, IPv4:122}
GEO DNS issues
• Content Delivery Networks work on GEO DNS
• Exchange Online uses GEO DNS
• You get a different IP address from DNS depending on where in the world you request it
• Impacts a multi-country corporate network with multiple Internet connection points
• Commonly DNS is only requested at one point and cached
• You can get a DNS answer for a different part of the globe from where you have Internet connectivity
[Diagram: Customer network, Internet egress point, Microsoft network]
Improving peering
• You can see they are not peering with Microsoft in Australia
• Evaluate performance
  • Perth to Singapore by fiber optic cable is 4,600 km and requires approx. 50 repeaters
  • This should take 23 ms one way (46 ms Round Trip Time)
  • In the example we saw approx. 55 ms
• In the example Telstra is providing good performance
• With poor performance you would ask them to work on their routing or peering
• Microsoft public peering points are in 23 countries and documented at http://www.peeringdb.com/view.php?asn=8075
• ISPs and network operators can enquire at http://www.microsoft.com/en-us/peering/default.aspx
• Microsoft has data on ISP performance and we are discussing performance with the ISPs that are most used but poorly performing
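The Perth to Singapore estimate can be reproduced for any route to judge whether a measured RTT is reasonable. This sketch assumes light travels through fiber at about 200,000 km per second, a common rule of thumb that is not stated on the slide; the function names are hypothetical.

```python
# Light in optical fiber travels at roughly two thirds of its vacuum speed.
# 200,000 km/s is a rule-of-thumb value, used here as an assumption.
FIBER_KM_PER_SECOND = 200_000.0


def fiber_one_way_ms(distance_km: float) -> float:
    """Best-case one-way propagation delay over fiber, in milliseconds."""
    return distance_km / FIBER_KM_PER_SECOND * 1000.0


def fiber_rtt_ms(distance_km: float) -> float:
    """Best-case round trip time over fiber, in milliseconds."""
    return 2 * fiber_one_way_ms(distance_km)


if __name__ == "__main__":
    perth_to_singapore_km = 4600  # cable distance quoted above
    print(f"one way: {fiber_one_way_ms(perth_to_singapore_km):.0f} ms")
    print(f"round trip: {fiber_rtt_ms(perth_to_singapore_km):.0f} ms")
```

This is a floor, not a prediction: repeaters, routers and any detour from the shortest cable path add to it, which is why a measured 55 ms against a 46 ms floor counts as good performance.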
Test for routing and peering
• Run tracert to an Office 365 service network endpoint
  • tracert -4 outlook.office365.com
• Look for hosts called NTWK.MSN.NET, which is the Microsoft global network
• In the example, Microsoft corporate is peered with Microsoft public at co2-96c1a.
• The first two letters indicate the location.
• This is at the Columbia River site in Quincy, WA
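When comparing traces from several sites, it helps to pull the Microsoft hops out of saved tracert output automatically. A minimal sketch with hypothetical function names; it assumes the site label (such as co2-96c1a) is the part of the host name immediately before ntwk.msn.net, which may not hold for every hop.

```python
import re
from typing import List

# Matches host names ending in ntwk.msn.net, case-insensitively.
_MS_HOP = re.compile(r"([A-Za-z0-9.-]+\.ntwk\.msn\.net)", re.IGNORECASE)


def microsoft_hops(tracert_output: str) -> List[str]:
    """Return ntwk.msn.net host names in the order they appear."""
    return [m.group(1).lower() for m in _MS_HOP.finditer(tracert_output)]


def site_code(hostname: str) -> str:
    """First two letters of the label just before .ntwk.msn.net.

    Assumption: that label is the site label, e.g. 'co2-96c1a' gives 'co'.
    """
    label = hostname.lower().rsplit(".ntwk.msn.net", 1)[0].split(".")[-1]
    return label[:2]
```

The first entry returned by microsoft_hops is the hop where traffic enters the Microsoft network, so its position in the trace shows how far the traffic travelled on the ISP side before peering.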
Network peering locations
Australia: Brisbane, Melbourne, Perth, Sydney
Austria: Vienna
Belgium: Luxembourg
Brazil: Sao Paulo
Canada: Montreal, Toronto
Czech Republic: Prague
France: Paris
Germany: Frankfurt
Hong Kong: Hong Kong
Ireland: Dublin
Italy: Milan, Turin
Japan: Tokyo
Korea: Seoul
Malaysia: Kuala Lumpur
Netherlands: Amsterdam
New Zealand: Auckland, Wellington
Russia: Moscow
Singapore: Singapore
Sweden: Stockholm
Switzerland: Zurich
Taiwan: Taipei
UK: London
USA: Ashburn, Atlanta, Boston, Chicago, Dallas, Denver, Honolulu, Las Vegas, Los Angeles, Miami, New York, Palo Alto, San Jose, Seattle
• Site data is published at http://www.peeringdb.com/view.php?asn=8075
• Some cities have multiple peering points
• Peering agreements with 2,000+ ISPs
• Peering locations may be on-net or off-net
• Peering may involve physical connection and/or routing advertisements
• Data as of July 2014 is subject to change
Network Routing Demo
SharePoint Online Customization Performance
Guidance and Throttling
• Overuse “Throttling” investments coming soon to ISV environments.
• Compare performance with the OneDrive for Business home page, which is rarely customized
• If it performs well, but other pages do not, then you likely have customization issues
• It is still too easy for customers to unknowingly hurt their own performance with web parts, out-of-box features, and basic customizations.
  • Those features are important, but “guide rails” are being added
  • New “O365 Specific” guidance is starting to come out
  • Starting with SPC-3993 (http://channel9.msdn.com/events/SharePoint-Conference/2014/SPC3993)
NAT and Firewall Port exhaustion
• For companies that have existing Internet connectivity but do not make extensive use of SaaS applications
• Firewalls have limited port mapping and SaaS use will require more than Internet browsing
• The primary symptom of this is that some users will see Outlook in disconnected state
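A rough capacity check can show whether port exhaustion is plausible before capturing firewall counters. Everything numeric here is an assumption rather than a figure from this deck: roughly 64,000 usable source ports per public NAT address, and a per-user connection count that you would need to measure during the pilot. The function name is hypothetical.

```python
import math

# Assumption: roughly 64,000 usable source ports per public NAT address.
# Real firewalls may reserve ranges or cap mappings lower than this.
USABLE_PORTS_PER_NAT_IP = 64_000


def nat_ips_needed(users: int, connections_per_user: int,
                   ports_per_ip: int = USABLE_PORTS_PER_NAT_IP) -> int:
    """Public addresses needed if every connection holds one NAT port."""
    if ports_per_ip <= 0:
        raise ValueError("ports_per_ip must be positive")
    return math.ceil(users * connections_per_user / ports_per_ip)
```

For example, 10,000 users holding 10 concurrent connections each would need nat_ips_needed(10000, 10) public addresses under these assumptions; if the firewall has fewer, some clients will fail to connect at peak.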
Unsupported client software
 The current or immediately previous version of Internet
Explorer or Firefox, or the latest version of Chrome or
Safari
 Any version of Microsoft Office in mainstream support
 Office 365 ProPlus updates automatically, but you can
delay updates up to 12 months and be fully supported
 Windows XP isn’t supported
 Old web browsers such as Internet Explorer 8 may work
but will be slower than modern browsers
 See system requirements document on TechNet
http://technet.microsoft.com/en-us/library/office-365-system-requirements.aspx
Troubleshooting for Lync
• Lync is peer to peer for two person calls
• Tests to the server support multiperson meetings
• Requires Java ActiveX control enabled
• Requires URLs added to Java Control Panel Applet in security panel
• Test both server locations for your region for the paired pool
  • Amsterdam, NL http://trippams.online.lync.com
  • Dublin, Ireland http://trippdb3.online.lync.com
  • Hong Kong http://tripphkn.online.lync.com
  • Gainesville, VA http://trippbl2.online.lync.com
  • San Antonio, TX http://trippsn2.online.lync.com
  • Singapore http://trippsg1.online.lync.com
• Detailed results for Lync use
Service performance
• As we build new features to delight customers, we are also fine tuning all the applications like Office, Email, Collaboration & Communication to respond faster and improve the overall customer experience
• As an example, we have made the following performance improvements:
  • Improved reload times with AJAX clients and modern browsers
  • Use of CDNs for static public files
  • Disk I/O improvements
  • Shredded storage
  • Minimal Download Strategy
Rich Client Applications
• Our Office clients, including Outlook and SkyDrive Pro sync, enable a fabulous experience while any data upload or download happens in the background
• These rich applications enable a great experience even in offline conditions such as air travel
• Similar native rich clients are available for desktop, laptop, Windows Phone, iOS and Android
Browsers
• AJAX (Asynchronous JavaScript and XML) clients help us deliver good performance in browser-based clients
• In addition, use of modern browsers enables a better customer experience
• Even Outlook Web Access, a browser-based client, enables offline access to help in offline situations such as air travel
Microsoft Edge initiative
 With the pilot of Edge initiative customers in Singapore saw a
20% improvement in logging into the service
 Various functionalities in Office 365 will start improving using
Edge nodes
 The Office 365 Portal is rolling out to Microsoft Edge in Q3
CY2014
* These Edge improvements do not apply to Gallatin
What you can do