ITIL Overview

BMC PATROL Express
Presentation to the Ottawa Area PATROL User Group (OAPUG)
Pierre Vanier, KOAN-IT Corp.
May 5th, 2004
1
About KOAN-IT
• KOAN-IT’s mission:
– To deliver Service Management solutions for IT operations
– Visit us at: www.koan-it.com
• About the presenter:
Pierre Vanier, Senior IT Consultant, KOAN-IT Corp.
– 15 years of experience in the IT industry
– 10 years of experience in Enterprise Management solutions
2
KOAN-IT and PATROL Express
• PATROL Express used at KOAN-IT to monitor:
– IT Infrastructure
– Corporate web site (i.e. www.koan-it.com)
• Powerful monitoring of both PATROL and non-PATROL environments
• KOAN-IT is now reselling PATROL Express to commercial clients
• Available on a subscription basis (monthly fee)
• For more information, please contact:
KOAN-IT Corp.
email: [email protected]
phone: (613) 591-9131
3
Agenda
• PATROL Express Overview & Architecture
• PATROL Express Features & Demo
• Q&A
4
PATROL Express Overview
What is PATROL Express?
PATROL Express is an infrastructure monitoring solution. It provides
monitoring, notification of outages, and reporting for servers, network
devices and applications.
PATROL Express also monitors the performance and availability of
web transactions. It measures both transactions and infrastructure
against user-defined service level objectives.
PATROL Express uses an agentless technology that enables it to be
deployed rapidly with minimal impact. Its interfaces are web-based;
all management tasks can be performed using a standard web
browser.
5
Sliding Scale of Management
[Figure: sliding scale of management capabilities, with PATROL Express and PATROL marked as points of entry. Labels shown: Infrastructure Monitoring; Up/Down Detection; Remote Service Level Monitoring; Infrastructure Management; Extensibility; Recovery Actions; Root Cause Analysis; Service Management; Service Level Management; Enterprise Management.]
6
Architecture Overview
7
Key Concepts
Service Integration Portal (SIP)
– Web-based application
– Resides at customer’s or service provider’s data center
– Used by end-users to:
• Configure elements
• Organize elements into services
• View reports
• Set up notifications
8
Key Concepts (cont’d)
Remote Service Monitor (RSM)
– Collects performance data and relays it to the SIP
– Remotely monitors elements that are configured at the SIP
– Agentless technology: uses industry-accepted protocols
– Downloaded from the SIP
– Installed and deployed on the customer’s network
– RSM clustering: provides failover protection
Monitoring Overview
– Supports a number of remote monitoring protocols
– Each RSM can monitor hundreds of elements
– Typically retrieves parameters from elements once a minute (the monitoring interval)
9
RSM Location
– The RSM resides in the customer’s environment
– Must have IP addressability to the elements that it
monitors
– Must also be able to resolve the PATROL Express SIP IP
address
– The RSM runs as a service
– RSM manager application resides in the Microsoft
Windows system tray
10
RSM Manager (system tray)
11
RSM – SIP Communications
• The RSM communicates with the SIP using HTTPS
• The RSM initiates all communication
• The data is compressed and encrypted prior to being sent
• RSM-SIP communications fall into one of the three following
categories:
– Verifying RSM-to-SIP communication
– Forwarding alarm or warning exceptions to the SIP for processing
– Forwarding parameter data to the SIP for processing service reports
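The three categories above could be modelled as message kinds carried on a single RSM-initiated HTTPS POST. The sketch below is illustrative only: the message kind names, the JSON layout and the `/rsm/upload` path are assumptions, not the actual PATROL Express wire format. It shows a payload being compressed before it is handed to HTTPS, which supplies the encryption.

```python
import json
import zlib
from urllib.request import Request

# Hypothetical message kinds mirroring the three categories on the slide.
KINDS = {"heartbeat", "exception", "parameter_data"}


def build_upload(sip_host: str, kind: str, body: dict) -> Request:
    """Build (but do not send) an RSM-to-SIP upload request."""
    if kind not in KINDS:
        raise ValueError(f"unknown message kind: {kind}")
    raw = json.dumps({"kind": kind, "body": body}).encode("utf-8")
    compressed = zlib.compress(raw)  # compress before transport
    # The URL path and headers are assumptions made for illustration.
    return Request(
        url=f"https://{sip_host}/rsm/upload",
        data=compressed,
        headers={
            "Content-Type": "application/json",
            "Content-Encoding": "deflate",
        },
        method="POST",
    )


req = build_upload(
    "sip.example.com",
    "exception",
    {"element": "localhost", "state": "alarm"},
)
```

Because the RSM opens every connection, a design like this needs only outbound HTTPS from the customer's network to the SIP.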
12
PATROL Express System Requirements
SIP (Service Integration Portal)
Minimum system requirements for installing all PATROL Express components (web, application and database servers) on one system are as follows:
Resource | Minimum Requirement | Comments
Platform | Intel Pentium III or equivalent | Minimum of 1 GB memory dedicated to the SIP components; minimum 733 MHz processing speed; minimum 30 GB hard drive capacity
Operating System | Windows 2000 Server (SP2 or later) | The OS must be Windows 2000 Server with all latest Critical Updates and patches applied
RSM (Remote Service Monitor)
Resource | Minimum Requirement | Comments
Platform | Intel Pentium III | Minimum of 128 MB memory dedicated to the RSM; minimum 600 MHz processing speed
Operating System | Windows NT 4.0 (SP6a or later), Windows 2000 (SP2 or later), or Windows XP |
Browser | Internet Explorer 5.5 and later (with latest patches) | The RSM uses the Internet Explorer WinInet.dll
13
PATROL Express
PATROL Express Features
14
Features Overview
– Remotely monitors infrastructure and Web transactions
> Infrastructure
- Operating systems
- Databases
- Applications
- Network/Storage Devices
> Web-based transactions
- HTTP/HTTPS
– Measures against user-defined service level objectives
– Provides ‘business service’ performance measurements
– Sends real-time problem notifications
– Provides centralized access to reports
– Service enabled for end user access
15
PATROL Express Monitoring
• What it monitors
– Windows (NT, 2000, XP, 2003)
– Unix (AIX, HP-UX, Linux, Solaris)
– Databases (Oracle, MS-SQL, Sybase)
– Web servers (IIS)
– Web transactions
– Email (Exchange, POP, IMAP, SMTP)
– Network devices
– Storage devices
– Port monitoring
– Process monitoring
– Windows event log monitoring
– Text log monitoring
– Custom parameter sets
• How it monitors
– PerfMon
– WMI
– rstatd
– Secure Shell (SSH)
– SNMP
– HTTP/HTTPS
– SQL*NET
– Ping
– DNS
– PATROL protocol
For more information, please refer to the PATROL Express Parameter Set Guide.
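To make the agentless idea concrete, here is a minimal sketch of a port check in the spirit of the "Port monitoring" item above. It assumes that a port counts as up when a TCP connection can be opened within a timeout; the slides do not say how PATROL Express implements the check, and the function name is hypothetical.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A monitor could call `port_is_open("www.koan-it.com", 443)` once per monitoring interval and raise an alarm after a failed check. Nothing needs to be installed on the monitored host.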
16
Monitoring Capabilities
Operating Systems
PATROL Express monitors basic operating system parameters, including
memory, CPU and disk utilization. PATROL Express can be configured to
monitor specific processes as well as how much memory and CPU a process is
using. It can also monitor both Windows Event logs and text log files for user-defined messages.
Databases
PATROL Express takes a snapshot of database performance: it verifies that
the database is up and running and monitors parameters such as number of
transactions, lock usage and active SQL statements.
Network Devices
PATROL Express monitors each interface (port) on the device to ensure it is
running and reports how much data has been transmitted – and at what speed.
In addition, it monitors the status of network devices, checking on availability
and reporting the system description of each device.
17
Monitoring Capabilities (cont’d)
Storage Devices
When monitoring storage devices, PATROL Express focuses primarily on
availability, such as up/down status and environmental systems, including fans,
power supplies and temperature; asset information, such as vendor, model,
serial number and firmware; and configuration, such as device capacity and
number of ports.
Web Transactions
PATROL Express monitors the performance and availability of Web transactions
using HTTP and HTTPS. PATROL Express supports all major dynamic HTML
techniques, such as JSP, ASP and CGI, and popular content types, such as
Microsoft Word, PDF and plain text.
18
Service / Element Monitoring
- Cumulative “current status” at various levels:
• Company
• Service
• Element
• Parameter
- Parameter performance graphs
- PATROL Express API can be used to retrieve
current and historical performance data, e.g.:
https://patrolexpress.bmc.com/gethistdata.do?user=t1&password=secret&format=csv&element=localhost&paramset=Windows_2000&parametername=TotalCPUUtilization&startdate=06-18-2003-06-00PM&version=3.0
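The request above can be assembled programmatically. The sketch below only builds the URL from the query parameters shown on the slide and does not contact any server. The helper name `hist_data_url` is hypothetical, and whether the endpoint accepts other parameters is not covered by the slides.

```python
from urllib.parse import urlencode


def hist_data_url(base: str, **params: str) -> str:
    """Build a gethistdata.do URL from keyword query parameters."""
    return f"{base}/gethistdata.do?{urlencode(params)}"


url = hist_data_url(
    "https://patrolexpress.bmc.com",
    user="t1",
    password="secret",
    format="csv",
    element="localhost",
    paramset="Windows_2000",
    parametername="TotalCPUUtilization",
    startdate="06-18-2003-06-00PM",
    version="3.0",
)
```

Using `urlencode` keeps the parameters in the order given and escapes any characters that are not safe in a query string.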
19
20
21
Parameter Performance Graph
22
Service Measure Reports
• Charts at the account and service levels
– Availability reports
• Availability – percent of time with no critical alarms
• Availability vs. Goals – percent of time availability goals were met
– MTTR reports
• Mean Time To Repair Critical Alarms
• Mean Time To Repair vs. Goals – percent of time MTTR goals were
met
MTTR is the average length of time it takes to fix a problem that
caused an alarm state on a monitored element.
• Charts at the element level
– Availability
– Mean Time To Repair (MTTR)
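The two measures defined above can be worked through with a small example. This is a sketch under stated assumptions: critical alarms are given as (raised, cleared) pairs that do not overlap and fall inside the reporting period. The function names are illustrative and not part of PATROL Express.

```python
from datetime import datetime, timedelta


def availability_pct(start, end, critical_alarms):
    """Percent of the period [start, end) with no critical alarm open."""
    total = (end - start).total_seconds()
    down = sum(
        (cleared - raised).total_seconds() for raised, cleared in critical_alarms
    )
    return 100.0 * (total - down) / total


def mean_time_to_repair(critical_alarms):
    """Average time from an alarm being raised to it being cleared."""
    if not critical_alarms:
        return timedelta(0)
    repair = sum(
        (cleared - raised for raised, cleared in critical_alarms), timedelta(0)
    )
    return repair / len(critical_alarms)


day = datetime(2004, 5, 5)
alarms = [
    (day + timedelta(hours=2), day + timedelta(hours=2, minutes=30)),
    (day + timedelta(hours=14), day + timedelta(hours=14, minutes=42)),
]
availability = availability_pct(day, day + timedelta(days=1), alarms)
repair_time = mean_time_to_repair(alarms)
```

With 72 minutes of critical alarms in a 24-hour day, availability comes to 95 percent and MTTR to 36 minutes.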
23
24
Web Transaction Reports
• Reports specific to Web transactions
– Service Measures reports
• Path Time vs. Goals (for Web pages)
• Page Time vs. Goals (for Web pages)
– Diagnostic charts
• Total path time
• Slowest five steps, fastest five steps
• Page time
• Page component (DNS, first byte, resources)
• Shown as averages or for specific locations (RSMs)
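The path-time figures behind these charts can be illustrated with a short sketch. It assumes that a path is a sequence of named steps with measured times in seconds, and that "vs. Goals" means the share of samples at or under the goal. Both are interpretations rather than documented PATROL Express behaviour, and the step names are invented.

```python
def total_path_time(step_times: dict) -> float:
    """Total path time is the sum of the individual step times."""
    return sum(step_times.values())


def slowest_steps(step_times: dict, n: int = 5) -> list:
    """Return the n slowest (step, seconds) pairs, slowest first."""
    return sorted(step_times.items(), key=lambda kv: kv[1], reverse=True)[:n]


def pct_goal_met(samples: list, goal: float) -> float:
    """Percent of samples at or under the goal."""
    return 100.0 * sum(1 for s in samples if s <= goal) / len(samples)


steps = {"login": 1.2, "search": 0.8, "add_to_cart": 0.5, "checkout": 2.0}
path_time = total_path_time(steps)
top_two = slowest_steps(steps, n=2)
met = pct_goal_met([4.5, 3.9, 5.2, 4.1], goal=5.0)
```

Here the path takes 4.5 seconds in total, "checkout" and "login" are the two slowest steps, and three of the four samples meet the 5-second goal.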
25
Path Time Charts
Total Path Time with Goal
Path Time by Step
26
Page Time Charts
Total Page Time with Goal
Page Time Breakdown
27
Worldwide Web Site Monitoring
28
Alerts
– Historical log of all alerts sent out
– Alerts may be sent via email or SNMP
– Alert log entry includes:
• Element service
• Element name and network name or IP address
• When it was detected
• State: critical alarm, alarm, warning or normal
• Who was notified
– Built-in blackout periods
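A blackout period is commonly a window, such as nightly maintenance, during which alerts are suppressed. The slides do not describe how PATROL Express defines one, so the sketch below assumes a daily time-of-day window that may wrap past midnight. The function name is hypothetical.

```python
from datetime import datetime, time


def in_blackout(ts: datetime, start: time, end: time) -> bool:
    """True if ts falls inside the daily window [start, end).

    A window whose end is earlier than its start is treated as
    wrapping past midnight (for example 22:00 to 02:00).
    """
    t = ts.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end


late_night = in_blackout(datetime(2004, 5, 5, 23, 30), time(22, 0), time(2, 0))
midday = in_blackout(datetime(2004, 5, 5, 12, 0), time(22, 0), time(2, 0))
```

An alerting loop would skip sending while `in_blackout` returns True. Whether such an alert should still be written to the historical log is a separate policy choice.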
29
Notification Examples
• When to notify (escalation policies)
– Notify Joe immediately
– Notify Jane if the alarm is not fixed in 10 minutes
• What types of alerts to send (critical alarm, alarm, warning or
all alerts)
– Notify Joe on critical alarms
– Notify Jane on critical alarms and alarms
– Notify Bill on critical alarms, alarms and warnings
• Notify different recipients for different services or elements
– Notify only Joe for Service X alerts
– Notify only Jane for Service Y alerts
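The three kinds of rule above (when, what and for which service) can be combined in one small data model. This is a sketch rather than the PATROL Express configuration format: the `Rule` fields and the `recipients` function are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    recipient: str
    severities: frozenset          # alert states this recipient receives
    service: Optional[str] = None  # None means any service
    delay_minutes: int = 0         # notify only if still open after this long


def recipients(rules, service, severity, minutes_open):
    """Return who should be notified for an alert in its current state."""
    return [
        r.recipient
        for r in rules
        if severity in r.severities
        and (r.service is None or r.service == service)
        and minutes_open >= r.delay_minutes
    ]


rules = [
    Rule("Joe", frozenset({"critical alarm"}), service="Service X"),
    Rule("Jane", frozenset({"critical alarm", "alarm"}), delay_minutes=10),
    Rule("Bill", frozenset({"critical alarm", "alarm", "warning"})),
]

first = recipients(rules, "Service X", "critical alarm", minutes_open=0)
later = recipients(rules, "Service Y", "alarm", minutes_open=12)
```

In this example Joe and Bill are notified immediately for a critical alarm on Service X, while Jane and Bill are notified for an alarm on Service Y that has been open for 12 minutes.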
30
E-Mail Notification Example
31
Customizing PATROL Express
– The Parameter Set Editor (PSE) is an interface for
adding custom parameter sets
– Primary way to customize PATROL Express
– Custom parameter sets are added via PSE using the
following protocols:
• PerfMon
• SNMP
• PATROL
– Accessible by administrators only
– Administrators may hide existing parameter sets
(e.g., do not show Solaris)
32
PATROL Express Security
• PATROL Express was designed with security in mind.
• All traffic between the SIP and the RSM, including logon
credentials, is compressed and encrypted using HTTPS.
• PATROL Express supports SNMP v3, with encrypted
credentials and data transmission.
In addition, the following precautions have been taken:
– The SIP uses Secure Sockets Layer (SSL) IDs with strong
encryption
– Users are required to authenticate using their user IDs and
passwords to access the SIP
– Portal and element credentials are stored encrypted in the
database of the SIP
33
Summary
PATROL Express improves quality of
service (QoS)
– Minimizes deployment and configuration
– Notifies customers of outages
– Provides a system-wide monitoring tool
– Measures against user-defined service level
objectives
– Provides Web-based Service Measures
reports
34
Summary (cont’d)
PATROL Express provides great functionality "out-of-the-box". Other products
often require several loosely integrated components to accomplish the same
level of functionality:
– secure web interface
– secure communications between RSM and SIP
– secure storage of credentials
– easy clustering of RSMs
– email notifications
– pager-ready messages & notifications
– blackout period support built-in
– automatic report generation/emailing
– easy administration & management
35
Why PATROL Express?
– Deploy quickly, without an agent on a box?
– Reactive or NO monitoring in less strategic
areas?
– Improve service levels (QoS) to customers?
– Remotely monitor performance and
availability globally?
– Other tools too expensive to expand
business segments?
– Are you being asked to do more with less?
36
PATROL Express in the PATROL Management Architecture
37
Questions?
For more information, please contact KOAN-IT:
Email: [email protected]
Phone: (613) 591-9131
38