Lync Top Support Topics and Troubleshooting Tools

Lync Performance Monitoring
Centralized Logging Service (CLS)
Lync Media on Wi-Fi
Lync Call Generators
Troubleshooting Tools
Once deployment is complete and the SMEs are gone, the customer is left to support an ever-changing software update cycle.
Customers are often not equipped to manage the complexity of Lync. Many proactive steps can prevent the most common scenarios that generate support calls and can leave Lync functionality crippled and unreliable.
We will discuss the most common issues and how to prevent them altogether.
We will also discuss troubleshooting tools and methodology.
Lync Performance Monitoring
System Center Operations Manager (SCOM): An alerting system that provides data on server status
Performance Counters: Feed into SCOM and support general server performance monitoring. They include active connections, message processing, failures raised by the server, and latency
Event Logs: Report to SCOM on server configuration state, security policy updates, and service availability
Synthetic Transactions: Automated tests to detect outages in service features (e.g., Instant Messaging [IM], registration, presence)
Call Detail Records (CDR): Provide telemetry on usage patterns (e.g., call volume) and call establishment (e.g., conference join)
QoE Metrics: Media, network, endpoint, and connection metrics collected on the endpoint
User-Facing Diagnostics (UFD): Actionable notifications displayed to the user
Network Bars: Indicator that tells users when network performance is causing media quality issues
[Diagram: Lync Storage Service on the Front End Server. Labels: Data Collection, Queue DB, Unified Contacts, Archival Processing (IM, WebConf), Monitoring Processing (CDR/QoE), CDR/QoE SQL Database, SQL DB Replication for HA]
Troubleshooting
In Lync 2013, improved video metrics are aligned to the new video feature set
Reports will have both audio and video media performance analysis
New QoE will enable administrators to better identify problems with both audio and video
Planning
QoE provides information on
Network performance and problem identification
Audio performance issues
Video usage and performance issues
QoE data assists in
Network planning (e.g., wired and wireless access requirements)
Server and general infrastructure procurement decisions
Centralized Logging Service
Get-CsClsScenario global/<ScenarioName> |
Select -ExpandProperty Provider |
Format-Table Name,Level,Flags -a
Component Name            Level
MediationServer           Info
S4                        Info
Sipstack                  Info
TranslationApplication    Info
OutboundRouting           Info
InboundRouting            Info
UserServices              Info
COMMAND    Description
-start     Starts a trace session for the given scenario. Mandatory option: scenario. Other valid option: duration
-stop      Stops the trace session for the given scenario. Mandatory and only valid option: scenario
-query     Queries the list of scenarios being traced. Valid options: None
-flush     Flushes logs and makes them available for searching immediately. Valid options: None
-update    Updates the duration for which the active (non-default) scenario is traced. Mandatory and only valid option: duration
-search    Searches logs. Results are returned in a text file. Valid options: starttime, endtime, components, uri, callid, phone, ip, loglevel, matchany, matchall, keepcache, correlationids
-?         Displays command-line usage along with scenario names
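The command table above pairs each command with its mandatory and permitted options. As a minimal sketch of those rules, the Python below encodes the table as data and checks a proposed combination against it. The names COMMAND_RULES and validate, and the scenario name "ExampleScenario", are hypothetical and used only for illustration; this is not the tool's own parser.

```python
# Hypothetical model of the command table above; not the tool's own parser.
SEARCH_OPTIONS = {
    "starttime", "endtime", "components", "uri", "callid", "phone",
    "ip", "loglevel", "matchany", "matchall", "keepcache", "correlationids",
}

COMMAND_RULES = {
    "start":  {"required": {"scenario"}, "optional": {"duration"}},
    "stop":   {"required": {"scenario"}, "optional": set()},
    "query":  {"required": set(), "optional": set()},
    "flush":  {"required": set(), "optional": set()},
    "update": {"required": {"duration"}, "optional": set()},
    "search": {"required": set(), "optional": SEARCH_OPTIONS},
}


def validate(command, options):
    """Return a list of problems; an empty list means the combination fits the table."""
    rules = COMMAND_RULES.get(command)
    if rules is None:
        return [f"unknown command: -{command}"]
    given = set(options)
    # Mandatory options that were not supplied.
    problems = [f"-{command} requires -{name}"
                for name in sorted(rules["required"] - given)]
    # Options the table does not list as valid for this command.
    allowed = rules["required"] | rules["optional"]
    problems += [f"-{command} does not accept -{name}"
                 for name in sorted(given - allowed)]
    return problems


if __name__ == "__main__":
    print(validate("start", {"scenario": "ExampleScenario", "duration": "30"}))
    print(validate("stop", {"scenario": "ExampleScenario", "duration": "30"}))
```

Keeping the rules as a dictionary means a change to the table needs only a data edit, not new branching logic.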
OPTION         Description
-scenario      Scenario name (valid scenario names were given earlier)
-duration      Duration (in minutes) to trace the given scenario for. Default duration: 24 hours
-matchall      Requires the search to match all criteria specified
-matchany      Requires the search to match any criteria specified. This is the default.
-starttime     Timestamp to search the log entries from
-endtime       Timestamp to search the log entries to
-loglevel      (fatal | error | warn | info | verbose | noise) The least severe log level to search on. For example, if 'warn' is specified, the search is limited to 'warn', 'error', and 'fatal'
-components    Comma-separated list of component names to restrict the search scope
-phone         Phone number scope for the search command. Must be an exact match
-uri           URI scope for the search command. Must be an exact match
-callid        Call ID scope for the search command. Must be an exact match
-ip            IP address scope for the search command. Must be an exact match
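Two of the search options are easy to misread: the log level is a threshold ("least severe level to include"), and matchany / matchall decide how several exact-match scopes combine. The Python sketch below illustrates both, assuming log entries are plain dictionaries. The field names and sample values are hypothetical, and the functions model the described semantics, not the logging service itself.

```python
# Severity order as listed for the log level option, most severe first.
LEVELS = ["fatal", "error", "warn", "info", "verbose", "noise"]


def levels_included(least_severe):
    """Levels searched when `least_severe` is given as the log level."""
    return LEVELS[: LEVELS.index(least_severe) + 1]


def matches(entry, criteria, match_all=False):
    """Exact-match test of one log entry (a dict) against scoped criteria.

    match_all=False mirrors the documented default (match any criterion).
    An empty criteria dict matches every entry.
    """
    hits = [entry.get(field) == wanted for field, wanted in criteria.items()]
    if not hits:
        return True
    return all(hits) if match_all else any(hits)


if __name__ == "__main__":
    entry = {"uri": "sip:alice@example.com", "ip": "10.0.0.5", "level": "error"}
    scope = {"uri": "sip:alice@example.com", "ip": "10.0.0.9"}
    print(levels_included("warn"))               # ['fatal', 'error', 'warn']
    print(matches(entry, scope))                 # True: the URI matches
    print(matches(entry, scope, match_all=True)) # False: the IP differs
```

Because every scope is an exact match, a URI that differs only by a prefix or a trailing character would not match under this model.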
Lync Media on Wi-Fi
Lync Call Generators
Sign-in and authentication
Public Key Infrastructure (PKI) / TLS Certificates
Signaling and media establishment
High availability / disaster recovery (HA / DR)
Lync address book
Sign-in and Authentication
Lync clients have different requirements because they are limited by their platform capabilities.
Changes from the legacy client platform have necessitated a “fallback” approach to client DNS lookup.
Secure connectivity is required for passing authentication.
Certificate-based authentication requires obtaining a certificate via the web services.
You will seldom see two deployments with identical network/infrastructure requirements.
DNS Complexity
Network Infrastructure
Securing External Access
ABS / PIM
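The "fallback" DNS lookup mentioned above can be pictured as an ordered list of records that the client tries until one resolves. The Python sketch below uses the sequence commonly documented for the Lync 2013 desktop client. That order is an assumption here, since the slides do not list it, and other clients (mobile, Windows Store app) use fewer records. The function names and the resolver callback are hypothetical.

```python
def discovery_candidates(sip_domain):
    """Ordered (record_type, name) pairs a client might try; first success wins.

    The order is assumed from commonly published Lync 2013 desktop client
    behavior, not taken from the slides.
    """
    return [
        ("A",   f"lyncdiscoverinternal.{sip_domain}"),
        ("A",   f"lyncdiscover.{sip_domain}"),
        ("SRV", f"_sipinternaltls._tcp.{sip_domain}"),
        ("SRV", f"_sip._tls.{sip_domain}"),
        ("A",   f"sipinternal.{sip_domain}"),
        ("A",   f"sip.{sip_domain}"),
        ("A",   f"sipexternal.{sip_domain}"),
    ]


def first_resolvable(sip_domain, resolves):
    """Return the first candidate for which resolves(record_type, name) is true.

    `resolves` is a caller-supplied callback, so the sketch runs without
    touching real DNS. Returns None when nothing resolves.
    """
    for record_type, name in discovery_candidates(sip_domain):
        if resolves(record_type, name):
            return record_type, name
    return None


if __name__ == "__main__":
    published = {("SRV", "_sip._tls.example.com"), ("A", "sip.example.com")}
    print(first_resolvable("example.com", lambda t, n: (t, n) in published))
```

Walking the list by hand with nslookup in the same order is a quick way to see which record a failing client would have landed on.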
Public Key Infrastructure (PKI) / TLS Certificates
PKI is everywhere in the product.
Correct use of certificates: internal certificates for internal roles; public certs from well-known CAs for external users, PIC, federation, Office 365, mobility, and reverse proxy.
Certificates used for audio/video (A/V) encryption and authentication are NON-public.
Internal namespaces on external-facing certificates are increasingly under scrutiny because of new PKI standards.
OAuth is a new way to simplify server-to-server communication between roles; it prevents trust issues between Lync and other trusted roles.
All connections in Lync use TLS or MTLS, with the exception of audio/video (A/V) media.
Avoid wildcards in certificate names
Wildcards are supported as a Subject Alternative Name (SAN) on Web Services (RP)
Many public CAs won’t allow a direct import of a certificate request; names are often added or certs recycled from other modalities because of the cost factor.
Only external services need public CA-issued certs.
No internal namespace on public certificates.
DNS must succeed for proper trust. Edge DNS points to the internal split-domain namespace.
Scaled Edge servers share identical certificates (private
Transport Layer Security (TLS) is used not only to secure traffic but also to establish a trusted relationship between SIP proxies.
Secure Real-time Transport Protocol over User Datagram Protocol (SRTP-UDP) cannot use TLS with certificates. However, it can still encrypt the packet payload.
OAuth provides a framework for authorizing components to interoperate and reduces trust model management through certificate replication.
Use wizards for certificate requests
Primary SIP domain = public namespace
No wildcard certificates
Use internal CAs for internal roles and access points
Avoid all-in-one certificates
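The certificate guidance above (no wildcards, no internal namespace on public certificates) lends itself to a simple pre-request check. The Python sketch below flags names that conflict with it. The function name, the caller-supplied list of public DNS suffixes, and the sample names are hypothetical, and the check reads a plain list of names, not a real certificate.

```python
def audit_certificate_names(names, public_suffixes, issued_by_public_ca):
    """Flag subject/SAN names that conflict with the guidance above.

    names:               DNS names planned for one certificate
    public_suffixes:     suffixes the caller treats as public namespace
                         (an assumption supplied by the caller)
    issued_by_public_ca: True when the certificate comes from a public CA
    """
    findings = []
    for name in names:
        lowered = name.lower()
        if lowered.startswith("*."):
            findings.append((name, "wildcard name"))
        is_public = any(
            lowered == suffix or lowered.endswith("." + suffix)
            for suffix in public_suffixes
        )
        if issued_by_public_ca and not is_public:
            findings.append((name, "internal namespace on public certificate"))
    return findings


if __name__ == "__main__":
    planned = ["sip.example.com", "*.example.com", "fe01.corp.local"]
    for name, issue in audit_certificate_names(planned, ["example.com"], True):
        print(f"{name}: {issue}")
```

The same list run with issued_by_public_ca=False reports only the wildcard, which matches the advice to use internal CAs for internal roles.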
Signaling and Media Establishment
Media Relay Authentication Service (MRAS), Interactive Connectivity Establishment (ICE), Session Description Protocol (SDP) candidates
Edge server as a functional firewall device
Media bypass, hairpinning, mediation
Bandwidth management / Call Admission Control (CAC) / Quality of Service (QoS)
Monitoring / Quality of Experience (QoE)
External registrar SIP proxy users and federation
External conference proxy (SIP signaling still traverses Access)
All audio, video, and media sharing use Real-time Transport Protocol (RTP)
Uses ICE (Session Traversal Utilities for NAT [STUN] / Traversal Using Relays around NAT [TURN]), secured using MRAS (not TLS)
No user services (that’s the reverse proxy role)
HTTPS connection for mobility clients, ABS, Meeting Lobby, etc.
Media Relay Authentication Service (MRAS) - (5062) Internal via SIP proxy
Allocate (3478) and ‘Are you there ping’ to ensure connectivity?
Open ports on NAT host | Reflexive | Relay
Deep packet inspection – XOR
UDP and TCP open port ranges are largely overrated as a security threat
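The host, reflexive, and relay candidates listed above are ranked during ICE negotiation. As a rough illustration, the Python sketch below applies the candidate priority formula and recommended type preferences from the standard ICE specification (RFC 5245). Whether Lync's own ICE implementation uses these exact values is an assumption, and the function name is hypothetical.

```python
# Recommended type preferences from RFC 5245; assumed here, since Lync's
# ICE implementation may weight candidates differently.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(candidate_type, local_preference=65535, component_id=1):
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)."""
    return (
        (TYPE_PREFERENCE[candidate_type] << 24)
        + (local_preference << 8)
        + (256 - component_id)
    )


if __name__ == "__main__":
    for kind in ("host", "srflx", "relay"):
        print(kind, candidate_priority(kind))
```

With equal local preferences, a direct host path outranks a server-reflexive path, which outranks a relayed path through the Edge server. That is why relayed media normally appears only when the more direct paths fail connectivity checks.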
DNS Load Balancing vs. Hardware Load Balancers
Certificates
TLS everywhere but media exchange.
Internal / external namespace depends on DNS pointing the right direction.
Networks
No logical sub-netting to prevent physical isolation.
Routing to the Internet and to internal networks should never overlap and will, in most cases, require manual management of the networks
High Availability and Disaster Recovery (HA/DR)
Don’t confuse High Availability and Disaster Recovery
Scenarios
No limited functionality
Pool pairing
RPO/RTO - Recovery point objective / Recovery time objective
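RPO and RTO measure different things: RPO bounds how much data may be lost (the time since the last successful replication), while RTO bounds how long service may be down. The Python sketch below works one example with made-up timestamps and targets. The function name and all values are hypothetical and do not represent Lync's published objectives.

```python
from datetime import datetime, timedelta


def assess_recovery(last_replicated, failure, service_restored, rpo, rto):
    """Compare one outage against RPO (data loss) and RTO (downtime) targets."""
    data_loss_window = failure - last_replicated  # data at risk of being lost
    downtime = service_restored - failure         # time users were without service
    return {
        "data_loss_window": data_loss_window,
        "downtime": downtime,
        "rpo_met": data_loss_window <= rpo,
        "rto_met": downtime <= rto,
    }


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 12, 0)  # illustrative timestamp only
    result = assess_recovery(
        last_replicated=t0,
        failure=t0 + timedelta(minutes=10),
        service_restored=t0 + timedelta(minutes=40),
        rpo=timedelta(minutes=15),
        rto=timedelta(minutes=20),
    )
    print(result["rpo_met"], result["rto_met"])  # True False
```

In this example the data loss window (10 minutes) is inside the RPO, but the 30-minute outage misses the RTO, so the two objectives have to be planned and tested separately.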
Windows Server 2012 with Lync 2013 - known issues with Windows Fabric
All servers hung in “starting” state
Reset-CsPoolRegistrarState -ResetType QuorumLossRecovery -PoolFQDN <FQDN>
Reset-CsPoolRegistrarState -ResetType FullReset -PoolFQDN <FQDN>
Lync Address Book
Changes in Active Directory Properties
Pushed to the Lync Back End servers every 60 seconds
Default setting for Address Book Service = WebSearchAndFileDownload
Get-CsClientPolicy … -AddressBookAvailability
FileDownload in Lync has all the same caveats as in R2: delay in updating, differential files, 24-hour updates, and so on.
Personal Information Manager (PIM)
Relies on Exchange Web Services (EWS) to obtain Outlook contacts and to synchronize Outlook calendar entries with presence state in the database; this is a client-side process
Unified contact store (UCS)
Introduces a host of potential caveats with contact loss, but relies on an FE process to proxy contact storage to the user's mailbox. This is not PIM, but it gets access to Exchange using the same process.
Subscribe to presence
HA/DR real-time presence across all Front End servers and backup registrars
Privacy relationship
Trust with Office 365
Troubleshooting Tools
Questions?
http://www.microsoft.com/en-us/download/details.aspx?id=39968
http://www.microsoft.com/en-us/download/details.aspx?id=36821
http://www.microsoft.com/en-us/download/details.aspx?id=35455
http://www.microsoft.com/en-us/download/details.aspx?id=35453
http://blogs.technet.com/b/nexthop/archive/2012/04/16/troubleshooting-lync-server-2010-withsnooper-part-1.aspx