Low-Cost, High-Functionality Systems Management: OpenManage IT Assistant and Windows 2000 Server
By Shelley Palmer-Fettig (Issue 4 2000)
Dell's OpenManage IT Assistant 6.0 in conjunction with Microsoft Windows 2000 Advanced Server management tools allows technical staff to monitor and take action on hardware, OS, and application events. One can monitor any perfmon threshold event, monitor critical services and the event log, and translate those events into SNMP traps to be sent to the IT Assistant.
Dell's OpenManage IT Assistant (ITA) 6.0 is a browser-based management console that monitors and manages Dell servers, desktops, and portables. It has features such as system discovery, event management, status polling, asset reporting, remote system configuration, paging, and e-mail event notification that are common to the leading hardware management consoles available today (see Figure 1 and the sidebar "Dell IT Assistant Highlights").
Figure 1. Dell OpenManage IT Assistant Console
To manage Dell devices, ITA requires the installation of industry-standard Desktop Management Interface (DMI), Simple Network Management Protocol (SNMP), or Common Information Model (CIM) instrumentation. The OpenManage Server Agent delivers this instrumentation for PowerEdge servers, and OpenManage Client Instrumentation (OMCI) delivers it for desktops, workstations, and portables.
This article goes beyond Dell's IT Assistant to discuss ways to exploit the management tools and new management functionality intrinsic to Windows 2000. A great complement to Dell's hardware agents and IT Assistant is the Windows 2000 Advanced Server "Event to Trap Translator" for OS and application monitoring. Configuring this tool to send Windows event log events as SNMP traps to IT Assistant can be broken down into a five-part process:
- Configure Windows services so that an attempt is made to restart any service that goes down unintentionally, and identify the associated entry written to the event log.
- Configure thresholds and alerts for specific perfmon objects. When a threshold is broken, Windows allows for an "action" to be taken; in this case, that action will be to write to the Windows event log.
- List critical events currently being written to the event log, which act as proactive notice to technical staff when there is a problem with applications or security on a server.
- Configure the Event to Trap Translator (evntwin.exe from the Windows 2000 Advanced Server install) to translate the events defined in steps 1-3 into SNMP traps and send them to IT Assistant.
- Configure IT Assistant to acknowledge the Windows 2000 SNMP alerts and to take action (such as e-mail and paging alerts) based on the needs of the IT department.
A word of caution here: Start slowly with a systems management initiative. Initially create a small list of critical alerts, then fine-tune and add to those events over time. Nothing kills an effort to manage systems proactively faster than bombarding technical staff with a barrage of pages. A good plan is to run a pilot program of 10 servers, tune the events and actions continually for a month, and only then roll the final management configuration out to all of the production servers.
A Case Study Using IT Assistant to Monitor Hardware, Applications, and OS Events
With ITA as an SNMP reception platform, the power of the Windows 2000 Server's native management functionality can be harnessed to provide a company with a broad spectrum of event management at no extra cost.
To illustrate how ITA might be used with functions intrinsic to Windows 2000, we will follow Tom, a hypothetical system administrator who works at Gnufirm.com in Boston, as he installs ITA 6.0 and the Windows management tools. Tom's objective is to manage and monitor 40 file, print, and application servers remotely.
Configuring the Management Server
First, Tom dedicates a Dell 2450 Server with 128 MB of RAM to the role of management server and configures it with the following steps:
- Install Windows 2000 Advanced Server
- Load Microsoft Simple Mail Transfer Protocol (SMTP) Server
- Install and configure SNMP to send and receive traps with a community string other than the default
- Assign a static IP address to the server and enter it into Domain Name System (DNS) if applicable
- Install ITA from the OpenManage Applications CD or download from www.dell.com
- Configure ITA to discover only SNMP-enabled devices on the subnets to be monitored
Configuring Servers to be Monitored
As Tom builds the Windows 2000 production servers, he adds the Dell management and monitoring capabilities and configures all as follows:
- Enable SNMP with the "Trap" function set to the IP address of the ITA Server
- Set the SNMP community name to be the same as the management server
- Install the OpenManage Server Agent, Array Manager, and Dell Remote Assistant Card (DRAC) instrumentation (if applicable) from the Dell OpenManage Applications CD on each "managed" server
With this configuration in place, Tom can remotely configure and monitor the RAID controller via Array Manager. He also can remotely control servers located in Hartford, Connecticut, and New York City via Remote Assistant. This configuration enables Tom to receive hardware events in ITA from any hardware components on the managed servers.
OS and Application Monitoring
Now that Tom has the Dell hardware management configured, he proceeds to implement some core application, Perfmon, and security monitoring. He also wants to configure some critical services to be as self-healing as possible. Tom chooses a "pilot server" to initially create these events. Once the events are functioning well on the test server, he will push the event configuration to other servers. The following sections describe the steps that Tom follows.
Service Management and Monitoring
The first step involves configuring the service management and monitoring functions within Windows 2000 Server.
- From the Start Menu, choose Administrative Tools > Services.
- Double-click on critical application services that have a greater probability of going down.
- From the Recovery Screen, select Restart the Service for the first and second failure and leave the subsequent failures at take no action. Tom's theory is that if a restart has failed twice on these critical services, he needs to know about it.
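The recovery settings Tom chooses can be summarized as a small policy table. The following is a minimal sketch in Python, not a Windows API: the class name `RecoveryPolicy` and the action strings are hypothetical and exist only to show which action applies to which failure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryPolicy:
    """Hypothetical model of the three settings on the Recovery screen."""
    first_failure: str = "restart"
    second_failure: str = "restart"
    subsequent_failures: str = "none"

    def action_for(self, failure_count: int) -> str:
        """Return the action applied to the Nth failure (1-based)."""
        if failure_count < 1:
            raise ValueError("failure_count must be at least 1")
        if failure_count == 1:
            return self.first_failure
        if failure_count == 2:
            return self.second_failure
        return self.subsequent_failures


policy = RecoveryPolicy()
for n in (1, 2, 3):
    print(f"failure {n}: {policy.action_for(n)}")
```

Under this policy the third failure is left alone, which is the point at which Tom wants a person to be notified rather than another automatic restart.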
Performance Monitor—Create a Base of Managed Objects
Tom creates a base of managed Perfmon objects that will write to the event log and eventually send important proactive events to the ITA console.
- From the Start Menu of one managed server, choose Administrative Tools > Performance.
- From the left side of the Performance Console, right-click on Alerts and select New Alert Settings; type BasicTraps in the name box.
- From the General screen, select Add.
- Select the counters from list to receive alerts (choose objects from which information is needed on any server).
- Continue adding Perfmon alerts to the BasicTraps group as long as the polling time will be the same. For a Perfmon object to be polled at a different rate, another group must be created.
- Close the Select Counters screen after all objects are chosen.
- In the configuration screen for the list of objects, select each counter and set the alert parameters in the middle of the screen.
- After the last object on the list has been configured, set the sample data interval at the bottom of the screen. (Tom keeps the polling at 15 minutes to ensure that he will not keep the server busy with monitoring tasks.)
- Select the Action tab and check only Log an action in the application event log.
- Select Schedule and configure it if necessary. Select OK when this is complete (Tom wants all monitors to continue running, so this screen remains the same).
- From the Performance screen, right-click on the name BasicTraps and select save settings as, enter the file name BasicTraps, and place that in a shared directory to be used later when creating events on the remaining servers.
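The alert logic configured in these steps amounts to comparing each sampled counter value against an over or under limit and logging a message when the limit is crossed. The sketch below illustrates that comparison in Python under stated assumptions: the counter paths, limits, and sample values are illustrative and not taken from Tom's configuration, and `AlertRule` and `check_sample` are hypothetical names rather than part of any Windows tool.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AlertRule:
    counter: str   # counter path as it might appear in the Select Counters screen
    over: bool     # True: alert when the value is over the limit; False: under
    limit: float

    def triggered(self, value: float) -> bool:
        return value > self.limit if self.over else value < self.limit


def check_sample(rules: List[AlertRule], sample: Dict[str, float]) -> List[str]:
    """Return one message per rule whose limit is crossed by this sample."""
    messages = []
    for rule in rules:
        value = sample.get(rule.counter)
        if value is not None and rule.triggered(value):
            direction = "over" if rule.over else "under"
            messages.append(
                f"{rule.counter} = {value} is {direction} limit {rule.limit}"
            )
    return messages


# Illustrative rules only; real limits depend on the server's role.
BASIC_TRAPS = [
    AlertRule(r"\Memory\Available MBytes", over=False, limit=16),
    AlertRule(r"\LogicalDisk(C:)\% Free Space", over=False, limit=10),
    AlertRule(r"\Processor(_Total)\% Processor Time", over=True, limit=90),
]

sample = {
    r"\Memory\Available MBytes": 12.0,
    r"\LogicalDisk(C:)\% Free Space": 35.0,
    r"\Processor(_Total)\% Processor Time": 97.0,
}
for line in check_sample(BASIC_TRAPS, sample):
    print(line)
```

Each message stands in for the application event log entry that the alert writes once per sample interval while the limit remains crossed.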
Tom is careful to choose the "critical few" alerts during this process. He selects memory, drive, and login type events rather than more application-oriented events. After configuring these core events on the servers, he adds Web and Internet Information Services-specific events on the Web servers, FTP events on the FTP server, DHCP events on the DHCP-enabled servers, and so on. Figure 2 shows the steps for installing the "core" events on additional servers.
Figure 2. Steps for Distribution of Event Management Configuration
Configure Events to Be Sent as SNMP Traps
In the two previous steps, Tom set up the objects that he wanted to monitor so they would write to the Windows 2000 event log. Now he will pull it together and send the events to ITA as SNMP traps.
- From the command line on the managed server, type Evntwin server_name.
- When the GUI appears and shows Configuration Type, select CUSTOM, then select Edit from the right-hand column of buttons. Event Sources shows all possible event groups; on the right side, select the specific events for trapping.
To configure traps on the events created in steps one and two, choose the following:
- For the Perfmon objects, select source Application, then Sysmon Log, and choose event number 2031 and select Add.
- To monitor critical service failures, select System, Alerter and choose event number 2184 and select Add. Highlight this event in the event list and select Properties. At Generate Trap, choose if event count reaches, enter 2, and check the within time interval and enter 200 seconds. See Figure 3 for "Event to Trap" console.
Figure 3. Windows 2000 Server: Event to Trap Translator (Evntwin.exe)
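The "if event count reaches 2 within 200 seconds" setting behaves like a counter over a sliding time window. The following Python sketch shows one way such a rule can work; it is not the translator's actual implementation. In particular, clearing the count after a trap is generated is an assumption, and the class name `CountWithinInterval` is hypothetical.

```python
from collections import deque


class CountWithinInterval:
    """Signal a trap once `count` events fall within `interval_s` seconds."""

    def __init__(self, count: int = 2, interval_s: float = 200.0):
        self.count = count
        self.interval_s = interval_s
        self._times = deque()

    def record(self, timestamp: float) -> bool:
        """Record one event time (in seconds); return True if a trap is due."""
        self._times.append(timestamp)
        # Discard events that are older than the interval, measured
        # from the newest event.
        while timestamp - self._times[0] > self.interval_s:
            self._times.popleft()
        if len(self._times) >= self.count:
            self._times.clear()  # assumption: counting starts over after a trap
            return True
        return False


rule = CountWithinInterval(count=2, interval_s=200.0)
for t in (0.0, 100.0, 150.0, 400.0, 450.0):
    print(t, rule.record(t))
```

With these settings a single isolated service failure does not produce a trap, while two failures in quick succession do, which matches Tom's wish to hear about a service only after a restart has not held.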
Now add any existing log events (such as standard security or application events) that you wish to monitor. After all events have been added, follow these steps:
- Highlight the event and choose Settings. Here it is possible to limit the trap length and apply a trap throttle (limiting the number of traps sent in a given time period).
- Click on properties of each event to note the OID (SNMP object ID) of the trap. This will be used to configure the event in ITA. The Sysmon events will be OID 188.8.131.52.184.108.40.206.220.127.116.11.103.1073743855.
- From the Event to Trap screen, choose Export to save to a file that can be used on other servers.
- Apply and Exit.
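The final number in the Sysmon OID above, 1073743855, appears to be the full 32-bit Windows event identifier for event 2031 rather than an arbitrary value. The sketch below decodes it using the standard Windows event identifier bit layout (severity in the top two bits, event code in the low 16 bits). Treat the link to the trap OID as an interpretation; the function name `decode_event_id` is hypothetical.

```python
SEVERITY_NAMES = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def decode_event_id(full_id: int) -> dict:
    """Split a 32-bit Windows event identifier into its fields."""
    if not 0 <= full_id <= 0xFFFFFFFF:
        raise ValueError("event identifier must fit in 32 bits")
    return {
        "severity": SEVERITY_NAMES[(full_id >> 30) & 0x3],  # bits 31-30
        "customer": bool((full_id >> 29) & 0x1),            # bit 29
        "facility": (full_id >> 16) & 0xFFF,                # bits 27-16
        "code": full_id & 0xFFFF,                           # bits 15-0
    }


fields = decode_event_id(1073743855)
print(fields)
```

Decoding the value gives an informational severity and an event code of 2031, the same event number selected for the Sysmon Log source earlier.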
To ensure that everything works properly before pushing these event configurations to other servers, Tom lowers the threshold point of several Perfmon objects on the managed (pilot) server and verifies that ITA receives the traps.
Further Configuration of IT Assistant
The online documentation supplied with ITA will provide direction for configuring reception of the Windows 2000 events for proper behavior when they reach the management server. See Dell OpenManage IT Assistant User's Guide: Event Management, which includes sample event management scenarios.
To configure reception of the Windows 2000 OS events in ITA, Tom chooses to create a new category named "OSBasics" under Configuration > Event Categories on the left side of the ITA console. In the Event Source Configuration screen, Tom adds the SNMP OIDs that he noted while creating these Windows 2000 events on the managed server. (See the ITA online documentation that describes how to add new events.)
Since Tom's group is also responsible for monitoring some of the local Cisco® switches, he creates a device group for them in ITA and configures the SNMP agents on the switches to send traps to the ITA server. To configure the Cisco events in ITA, Tom creates a category named "Cisco" and adds the SNMP enterprise OID for Cisco (1.3.6.1.4.1.9). Tom sets up the actions for these events so that all events with this OID will page the person in his group with the most experience on the Cisco switches. Tom is careful to configure the SNMP agents on the switches to send only critical events, but he may add more events in the future as the need arises.
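Routing "all events with this OID" to one person is a prefix match on the OID tree: any trap whose OID sits under Cisco's enterprise branch, 1.3.6.1.4.1.9, belongs to the category. The Python sketch below shows the comparison done on numeric components rather than on raw strings, so that a neighboring branch such as 1.3.6.1.4.1.90 is not matched by mistake. The function names and the sample trap OIDs are illustrative, and this is not how ITA itself is implemented.

```python
from typing import Tuple

CISCO_ENTERPRISE = "1.3.6.1.4.1.9"  # Cisco's registered enterprise OID


def parse_oid(text: str) -> Tuple[int, ...]:
    """Convert dotted OID text into a tuple of integers."""
    return tuple(int(part) for part in text.strip(".").split("."))


def is_under(oid: str, prefix: str) -> bool:
    """Return True if `oid` equals `prefix` or lies beneath it in the tree."""
    oid_parts = parse_oid(oid)
    prefix_parts = parse_oid(prefix)
    return oid_parts[: len(prefix_parts)] == prefix_parts


# Sample OIDs chosen for illustration only.
for trap_oid in ("1.3.6.1.4.1.9.9.41.2", "1.3.6.1.4.1.90.1", "1.3.6.1.4.1.311.1.13"):
    print(trap_oid, is_under(trap_oid, CISCO_ENTERPRISE))
```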
At this point, Tom reviews all of the events listed in ITA and configures the ones that he wants to take action on. He has some events write to a separate log for historic purposes. Other events result in an e-mail to on-call personnel (if it is after 5:00 p.m.), so that they will see the event when they arrive the following morning. Tom configures paging actions for all events that he deems critical and in need of immediate attention.
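Tom's action rules can be written down as a small decision function. This is a sketch under stated assumptions: the category names ("critical", "historic", "routine") and the fallback of leaving daytime routine events in the console are hypothetical, since the article does not say what happens to non-critical events before 5:00 p.m.

```python
from datetime import time

AFTER_HOURS_START = time(17, 0)  # 5:00 p.m., as in Tom's e-mail rule


def choose_action(category: str, when: time) -> str:
    """Pick an action for an event based on its category and time of day."""
    if category == "critical":
        return "page"      # needs immediate attention at any hour
    if category == "historic":
        return "log"       # written to a separate log for later review
    if when >= AFTER_HOURS_START:
        return "email"     # on-call staff see it the following morning
    return "console"       # assumption: daytime events are simply shown in ITA


print(choose_action("critical", time(3, 15)))
print(choose_action("routine", time(18, 30)))
print(choose_action("routine", time(10, 0)))
```

Keeping the paging branch first reflects the caution given earlier in the article: only events Tom deems critical should ever reach a pager.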
In creating the actions for these events, Tom configures ITA to send pages via the Exchange server over the LAN rather than using a dial-out process. He enters names equivalent to the paging process in the ITA Configure E-mail Action screen. (Pages can also be sent directly over the WAN to the server used by the company's paging service.)
Beyond IT Assistant
Tom installs a publicly available tool named Multi-Router Traffic Grapher (MRTG) to produce Web-based graphs of collected SNMP data, which can be useful for viewing historic traffic and usage data. MRTG issues SNMP GET requests to any agent and displays the returned data as Web-based graphs. This enables Tom to track everything from network traffic on a router to CPU usage on a Dell server.
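Traffic graphs of this kind are built from SNMP counters that only ever increase, so a rate has to be derived from the difference between two successive polls, allowing for the counter wrapping back to zero. The Python sketch below shows that arithmetic for a 32-bit octet counter such as ifInOctets. It illustrates the general technique rather than MRTG's own code, and the sample values and the 300-second polling interval are assumptions.

```python
def counter_rate(previous: int, current: int, seconds: float, bits: int = 32) -> float:
    """Return units per second between two polls of a wrapping counter."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    modulus = 1 << bits
    # Modular subtraction yields the right delta even if the counter wrapped
    # once between polls (it cannot detect more than one wrap).
    delta = (current - previous) % modulus
    return delta / seconds


# Two polls 300 seconds apart; the second sample has wrapped past 2**32.
octets_per_second = counter_rate(4294900000, 232704, 300)
bits_per_second = octets_per_second * 8
print(octets_per_second, bits_per_second)
```

A longer polling interval on a fast link raises the chance of more than one wrap between polls, which is one reason 64-bit counters are preferred where the agent supports them.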
Tom also adds the Microsoft Network Monitor from Windows 2000 Advanced Server on servers in Hartford and New York to allow him to perform packet captures for remote troubleshooting, if needed.
Over time, Tom plans to add more management and monitoring tools to the management server. When a problem arises, he wants any useful resource at his fingertips or accessible from the Web. His objective is to minimize troubleshooting time through easy access to precollected data and events.
Tom's choice of inexpensive management tools will serve him well in the day-to-day monitoring and management of the servers. These tools offer low-cost, high-functionality systems management for remotely managing multiple servers in numerous locations.
Shelley Palmer-Fettig (firstname.lastname@example.org) is a senior consultant for custom solutions in Dell Technical Marketing. She has worked as a system administrator and systems management specialist during her eight years at Dell, and has experience with a wide range of technologies, operating systems, and management systems. Shelley has a certification in Electronics Technology from California Technology Academy.