Availability Manager V3.0-2 Overview
Barry Kierstein
Hewlett-Packard
© 2009 Hewlett-Packard Development Company, L.P.
The information contained herein is subject to change without notice
Overview of This Session:
• Availability Manager Overview
• Availability Manager Components
• Availability Manager Installation
• Availability Manager Configuration
• Availability Manager New Features for V3.0-2
• Availability Manager Gotcha Items
• Availability Manager Unsupported Features
• Availability Manager Data Collection Considerations
• Availability Manager Live Demonstration
Availability Manager Overview – Let’s get started!
Availability Manager Overview
• Real-time display of system(s) being monitored; similar to MONITOR but with additional capabilities
• Error and Information display: issues warnings when resources are low
• Can be used to “fix” various problems
• Display portion is easy to learn (point and click)
• Default setup is a good place to start finding performance bottlenecks and resource contentions
• With some customization, the Availability Manager helps pinpoint problems specific to the systems being monitored
Availability Manager Overview
• Collects data on one or more nodes (systems), analyzes the data, displays it, and issues warnings
• For Alpha and VAX, requires only an OpenVMS license to collect data
• For I64, requires the EOE platform, or Avail_Man PAK to collect data
• Separate installation kit:
  − Included on the CD-ROM distribution kit
  − Download is available from the OpenVMS homepage
• Manuals included in hardcopy documentation kit, distribution kit, and on-line documentation kit
Availability Manager Groups
• Systems can be grouped together for analysis
• All members of an OpenVMS Cluster must have the same group name for correct clusterwide data collections
• Unclustered systems can be put into the same group
• Availability Manager can be configured to display information only from specified groups, reducing the number of systems being monitored
• Also known as AMDS groups
Availability Manager Components
• Three parts:
  − Data Collector gathers data on system(s) being monitored
  − Data Analyzer collects data from the Data Collectors and displays the data
  − Data Server allows Data Analyzers to collect data over an IP-based wide area network (WAN)
Availability Manager Components
• Data Analyzer:
  − A Java-based application
  − Runs on OpenVMS Alpha platform from V7.3-2 and later with Motif or X-Windows
  − Runs on OpenVMS I64 platform from V8.2-1 and later with Motif or X-Windows
  − Runs on Intel platforms under Windows 2000 and Windows XP Professional
Availability Manager Components
System Overview window
Availability Manager Components
• Data Collector:
  − Consists of an OpenVMS device driver, configuration and startup files
    • Device driver file is RMDRIVER.EXE on VAX, SYS$RMDRIVER.EXE on Alpha and I64
    • Device shows up with $ SHOW DEVICE RMA0: command
  − Runs on Itanium, Alpha and VAX platforms
  − Runs on OpenVMS V6.2 and later
  − Sends out a Hello multicast message to announce an OpenVMS system to the Data Analyzer
  − Only collects data when a Data Analyzer sends a data collection request, uses little CPU time
Availability Manager Components
• Data Server
  − The Availability Manager uses its own protocol (AMDS protocol) for communication between the Data Analyzer and each Data Collector
    • Connection not dependent on network software to work (IP, DECnet, LAT, etc.)
    • Data collection and fixes often work even when the network on the system is not functioning or the system is hung
Availability Manager Components
• Data Server
  − Data Server allows a Data Analyzer to collect data over an IP-based network
    • Data Server resides on the same extended LAN as the OpenVMS systems so it can communicate to the Data Collectors using the AMDS protocol
    • Data Analyzer connects to the Data Server using an IP-based secure socket connection over a WAN or LAN
  − A Data Server can accept connections from several Data Analyzers
  − A Data Analyzer can connect to several Data Servers
    • For redundancy, one could have two Data Servers on the same LAN
Availability Manager Installation
• Availability Manager kits
  − OpenVMS Data Collector kit and manifest for secure delivery
    • Contains files for each OpenVMS version and platform
    • Can use SYSMAN> DO command to install on a cluster
    • If updating the Data Collector, a system reboot is necessary to remove the old Data Collector
  − OpenVMS Data Analyzer/Server kit and manifest
    • Contains files for both the Data Analyzer and Data Server
    • System reboot is not necessary with this kit
  − Windows 2000/XP kit
    • Normal Windows installation, requires a reboot to install a driver
    • Install using Administrator account or equivalent
Availability Manager Configuration
• Data Collector configuration
  − Data Collector password
    • In file SYS$MANAGER:AMDS$DRIVER_ACCESS.DAT
    • Authentication between Data Analyzer and Data Collector
    • A Data Collector can have several passwords allowing for various access rights and scopes
    • Considerations for passwords
      − Access rights – Read, Write and Control
      − Scope for a particular password
        • OpenVMS – password for all OpenVMS systems
        • AMDS group – common for clusters
        • Individual node
Availability Manager Configuration
• Data Collector configuration
  − Data Collector settings
    • In file SYS$MANAGER:AMDS$LOGICALS.COM
    • AMDS$GROUP_NAME – set as desired, one per cluster
    • AMDS$DEVICE – Network adapter used for communications using the AMDS protocol
      − Data Analyzer connections to Data Collectors
      − Data Server connections to Data Collectors
      − Note: Data Analyzer to Data Server connections use the IP protocol. The network adapters used are controlled by the IP stack on the particular system.
Availability Manager Configuration
• Data Collector configuration
  − Data Collector settings
    • Hello multicast message settings
      − AMDS$RM_DEFAULT_INTERVAL – Broadcast interval in seconds for Hello multicast messages when the system is not being monitored
      − AMDS$RM_SECONDARY_INTERVAL – Broadcast interval in seconds for Hello multicast messages when the system is being monitored
      − The interval determines how quickly the Data Analyzer discovers all the systems. For instance, if the secondary interval is 20 seconds for each system on a LAN, then it will take up to 20 seconds for all the systems on the LAN to be discovered.
      − Each message is one packet of around 200 bytes, so it contributes little to the network traffic
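The interval arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the example figure of 50 systems are assumptions, and the model assumes each system sends exactly one Hello packet of about 200 bytes per interval.

```python
HELLO_PACKET_BYTES = 200  # approximate packet size quoted on the slide


def worst_case_discovery_seconds(hello_interval_s: float) -> float:
    """Longest wait before every system on the LAN has sent one Hello message.

    Each system sends a Hello once per interval, so a full interval is the
    upper bound for hearing from all of them.
    """
    return hello_interval_s


def hello_traffic_bytes_per_second(system_count: int, hello_interval_s: float) -> float:
    """Aggregate Hello traffic, assuming one packet per system per interval."""
    return system_count * HELLO_PACKET_BYTES / hello_interval_s


if __name__ == "__main__":
    # Hypothetical LAN of 50 systems with a 20-second Hello interval.
    print(worst_case_discovery_seconds(20))
    print(hello_traffic_bytes_per_second(50, 20))
```

Shortening the interval speeds up discovery and raises the Hello traffic in proportion, which is the trade-off the two logical names control.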
Availability Manager Configuration
• Data Collector configuration
  − Data Collector startup
    • SYS$STARTUP:AMDS$STARTUP is used to start and stop the Data Collector; P1 is the function
      − START – Loads the configuration data and passwords, and starts the Data Collector. Put this command in SYS$MANAGER:SYSTARTUP_VMS.COM after the network stacks have been started so the MAC addresses of the network adapters have their final value
      − STOP – Stops the Data Collector
      − RESTART – Stops and restarts the Data Collector. This is useful if you change the configuration data or passwords, and want the changes loaded into the Data Collector
      − STATUS – Current status of the Data Collector
      − HELP – List of possible functions
Availability Manager Configuration
• Data Server configuration
  − Authentication between the Data Analyzer and the Data Server is by Kerberos public and private keys
  − Create key pair on Data Server system
    • Start Data Analyzer, create keys, export public key
  − Create key pair on Data Analyzer system
    • Start Data Analyzer, create keys, copy to Data Server system
  − Covered in Chapter 2 of the Availability Manager User’s Guide
Availability Manager Configuration
• Data Analyzer configuration
  − Import any Data Server public keys
    • Start Data Analyzer
    • Import keys in Network Connection dialog box
  − Input Data Collector passwords
    • Use the Security tab in the Customization dialog box
    • Passwords can be entered at the appropriate level
      − OpenVMS level – Customize in System Overview
      − AMDS group level – Right-click on group in System Overview
      − Node level – Customize in Node pane or right-click on a node
Availability Manager Startup
• The first window to appear is the System Overview Window
• Event data also goes to the event log file AnalyzerEvents.LOG
• On OpenVMS, you can set the location of the event log file with logical names
System Overview Window
• Initially the System Overview window is empty. Systems are displayed as the Hello multicast message is received from each Data Collector
• Shows all the systems being monitored in one window
• Information includes the Name, Utilization, O.S. and Hardware versions
• Allows customizations at the application and operating system levels
• Shows the connection used to gather the data (network adapter, connections to Data Servers)
Availability Manager Network Connection dialog box
Availability Manager System Overview Window
Availability Manager Data Server Statistics
Availability Manager Read from Data Server Statistics
Availability Manager New Features
• Data Collection over IP
  − Data Server to tunnel AMDS protocol over IP
  − Avail_Man_Ana kit renamed to Avail_Man_Ana_Srvr to show that the Data Analyzer and Data Server reside in the same kit – must remove old kit due to name change
• Java 5.0 JVM used by Availability Manager
  − Increased performance on OpenVMS
  − Requires ODS-5 disk – use /DESTINATION qualifier when installing the Data Analyzer/Server kit to direct the installation to an ODS-5 disk
  − AMDS$AM_DISABLE_OFFSCREEN_PIXMAP_SUPPORT logical can help remote X-window display performance
Availability Manager New Features
• System Overview window changes
  − New and revised columns
    • PFLTS shows total and hard page fault rates
    • PFW/COM shows number of processes in PFW and COM states
    • DC shows Data Collector capability version and Managed Object registration state
    • CPU Qs revamped to have more consequential states
      − PFW and COM removed, leaving COMO, MWAIT, COLPG & FPW
      − If total is non-zero, show all counts as n/n/n/n (see tooltip)
      − Events have changed to reflect PFW and COM removal
  − Memory tooltip shows memory and alignment fault info
    • Added HIALNR event for high alignment fault rate
Availability Manager New Features
• Data Collection for Logical Disks (LDcn:) devices
• Event Log enhancements
  − Each data connection has its own log file
  − Status column – shows when a threshold event begins, ends, is cancelled or expires
  − EventKey – unique key for an event on a node
    • For instance, all HICOMQ events for node APPLE
    • Can use $ SEARCH to easily find all occurrences of an event
  − EventID – unique key for a single event
    • Easily find the BEGIN and END/CANCELED/EXPIRED record for an event
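To illustrate how the two keys might be used together, here is a minimal Python sketch. The record layout is hypothetical: the dictionary fields `event_id`, `event_key` and `status`, and the sample key value `APPLE:HICOMQ`, are assumptions for illustration and do not describe the actual log file format.

```python
# Statuses that close an event, as listed for the Status column.
CLOSING_STATUSES = {"END", "CANCELED", "EXPIRED"}


def pair_events(records):
    """Return {event_id: (begin_record, closing_record)}.

    closing_record stays None while an event is still open.
    Each record is a dict with the hypothetical keys
    "event_id", "event_key" and "status".
    """
    pairs = {}
    for rec in records:
        begin, closing = pairs.get(rec["event_id"], (None, None))
        if rec["status"] == "BEGIN":
            begin = rec
        elif rec["status"] in CLOSING_STATUSES:
            closing = rec
        pairs[rec["event_id"]] = (begin, closing)
    return pairs


def records_for_key(records, event_key):
    """All records for one kind of event on one node (same EventKey)."""
    return [rec for rec in records if rec["event_key"] == event_key]


if __name__ == "__main__":
    sample = [
        {"event_id": 1, "event_key": "APPLE:HICOMQ", "status": "BEGIN"},
        {"event_id": 2, "event_key": "APPLE:HICOMQ", "status": "BEGIN"},
        {"event_id": 1, "event_key": "APPLE:HICOMQ", "status": "END"},
    ]
    for event_id, (begin, closing) in pair_events(sample).items():
        state = closing["status"] if closing else "still open"
        print(event_id, state)
```

Filtering on the EventKey gives the history of one event type on one node, while grouping on the EventID ties each BEGIN record to the record that closed it.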
Availability Manager New Features
• Fixes
  − Force a disk volume out of Mount Verify state
  − Force a shadow set member out of a shadow set that is in Mount Verify state
• Data Analyzer supports MAC address changes
  − CFGDON and PTHLST show MAC address used
  − CHGMAC and NEWMAC events show address changes
• SYS$STARTUP:AMDS$STARTUP.COM
  − STATUS parameter shows RMA0: status
  − START and RESTART have LOG qualifier to output configuration data sent to RMA0:
Availability Manager Gotcha Items
• Make sure the most recent AMNDIS50.SYS file on Windows systems is installed
  − Correct date is Nov 28, 2006
  − Driver from earlier date can crash system when a second Data Analyzer is started
• OpenVMS Data Analyzer/Server V3.0-2 kit requires Data Collector V3.0-2A kit
  − A check for required logical names is done when the Data Analyzer or Data Server is started. The logical names are defined in AMDS$STARTUP.COM, which is in the Data Collector kit.
Availability Manager Unsupported Features
• Installation on Windows Vista
  − Right-click -> Properties to install under compatibility mode
  − More testing and compatibility knowledge needed to put on the supported list
• Running the Data Analyzer on other OSes
  − Work done to allow the Data Analyzer to connect to Data Servers by using the JVM only, tested on Linux
  − Install JVM on system
  − Copy *.JAR and *.ZIP files into a subdirectory
  − Create script with JAVA command line from AMDS$AM_RUN.COM
Availability Manager Data Collection Considerations
• Data collections on a local LAN typically finish quickly, in less than a second or two for the largest data collections with many continuations
• Using a Data Server slows down the round trip time for data collections with many continuations
  − Affects systems with many processes, disks or a large resource hash table
  − DCSLOW events are signaled when the data collection takes longer than the data collection interval
  − DCCOLT events document how long the data collection actually took in seconds
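The effect of round trip time can be sketched with a simple model. This Python sketch assumes that a collection needs one initial request plus one sequential round trip per continuation. That model, the function names, and the example latencies are all assumptions for illustration, not measured values or the Data Analyzer's actual logic.

```python
def estimated_collection_time_s(continuations: int, round_trip_s: float) -> float:
    """Estimated collection time: the initial request plus one
    sequential round trip per continuation (an assumed model)."""
    return (1 + continuations) * round_trip_s


def is_dcslow(collection_time_s: float, collection_interval_s: float) -> bool:
    """A DCSLOW condition: the collection took longer than its interval."""
    return collection_time_s > collection_interval_s


if __name__ == "__main__":
    interval_s = 5.0      # hypothetical data collection interval
    continuations = 200   # hypothetical large collection

    lan = estimated_collection_time_s(continuations, 0.002)  # assumed 2 ms LAN round trip
    wan = estimated_collection_time_s(continuations, 0.050)  # assumed 50 ms WAN round trip

    print("LAN:", round(lan, 3), "s, DCSLOW:", is_dcslow(lan, interval_s))
    print("WAN:", round(wan, 3), "s, DCSLOW:", is_dcslow(wan, interval_s))
```

Under these assumptions the same collection finishes well inside the interval on the LAN but overruns it across the WAN, which is the situation a DCSLOW event reports.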
Availability Manager Data Collection Considerations
• Lock contention data in particular can take many continuations to finish
  − Previously, 1K resource hash table entries were scanned per collection, so large tables resulted in hundreds of continuations
  − Since the data collection time is returned in the AMDS packet, the number of hash table entries scanned is now limited by a 1 ms time limit. This was the maximum collection time seen in scanning 1K hash table entries on a DEC 3000/400. On larger Alphas, this time limit results in scanning about 3K hash table entries.
  − This limit can be changed if necessary, but take care as the IOLOCK8 spinlock is held during the data collection
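A short Python sketch shows why the number of entries scanned per pass matters. The table size of 256K entries is a hypothetical example, "1K" and "3K" are taken to mean 1024 and 3072, and the count is treated as the number of scanning passes needed to cover the table.

```python
def passes_needed(hash_table_entries: int, entries_per_pass: int) -> int:
    """Number of scanning passes needed to cover the whole table
    (integer ceiling division)."""
    return (hash_table_entries + entries_per_pass - 1) // entries_per_pass


if __name__ == "__main__":
    table_entries = 256 * 1024  # hypothetical resource hash table size

    # Fixed count of 1K entries per pass.
    print(passes_needed(table_entries, 1024))

    # About 3K entries per pass, as the 1 ms limit allows on larger Alphas.
    print(passes_needed(table_entries, 3 * 1024))
```

Tripling the entries scanned per pass cuts the number of passes to roughly a third, at the cost of holding the spinlock for the full time limit on each pass.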
Availability Manager Live Demonstration
• Initial key configuration for Data Analyzer and Data Server
• Overview of new features
Availability Manager Contact Information
• Barry Kierstein
  − [email protected], [email protected]
• Shubhabrata Bose
  − [email protected]
• Karthigeyan Kasthuriregan
  − [email protected]
• Srividhya Subramanian
  − [email protected]