Windows Server Health Checkup


CPU

Occasional CPU spikes are OK as long as you know which process is causing them. A server should not sustain 80% CPU utilization for an extended period of time; if it does, it may be time to upgrade. It’s a good idea to keep Task Manager open while you troubleshoot so you can see trends.

Check CPU Usage

Open Task Manager
Check the Processes tab and ensure no process is consuming excessive CPU
Check the Performance tab and ensure no single CPU shows excessive usage
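
The 80% guideline is about sustained load, not single spikes. Here is a minimal Python sketch of that distinction. It assumes you have already collected CPU readings at a fixed interval (from Task Manager, a performance counter export, or anything else); the function name and the five-sample window are hypothetical choices, not part of any Windows tool.

```python
from typing import Sequence


def sustained_high_cpu(samples: Sequence[float],
                       threshold: float = 80.0,
                       min_consecutive: int = 5) -> bool:
    """Return True if at least `min_consecutive` readings in a row
    are at or above `threshold` percent."""
    run = 0
    for pct in samples:
        # Extend the current run of high readings, or reset it on a low one.
        run = run + 1 if pct >= threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Isolated spikes do not trip the check; a long plateau does.
spiky = [10, 95, 20, 99, 15]
plateau = [85, 90, 88, 92, 81]
print(sustained_high_cpu(spiky), sustained_high_cpu(plateau))
```

Pick the window to match how often you sample: at one reading per minute, five consecutive readings means five minutes above the threshold.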

Check CPU HW

Open Device Manager (right-click Computer –> Manage)
Ensure that no CPUs show a red X or yellow ! under Processors

Processes

In-Depth Check
SysInternals:

Copy Process Monitor locally, then launch it.

Analyze each process and watch which operations it performs on registry keys, files, etc.

Copy Process Explorer locally, then launch it.

Analyze each process based on its number of threads, handles, loaded DLLs, etc.

Memory

A general rule of thumb is that overall memory utilization should not exceed 80% within a given period of time.

Check Memory Availability

Open Task Manager
Select the Performance tab
Look at the Physical Memory box and multiply the total memory by 0.2
If the available memory is less than this number, the box is currently using more than 80 percent of its memory.
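
The arithmetic in the steps above can be sketched in a few lines of Python. The function name is hypothetical, and the two inputs are simply the Total and Available figures you read off the Performance tab, in whatever unit it shows (both must use the same unit).

```python
def memory_over_threshold(total: float, available: float,
                          min_free_fraction: float = 0.2) -> bool:
    """Return True if less than `min_free_fraction` of memory is available,
    i.e. the box is using more than 80% with the default of 0.2."""
    minimum_available = total * min_free_fraction
    return available < minimum_available


# Example: 8192 MB total means at least 1638.4 MB should be available.
print(memory_over_threshold(total=8192, available=1000))  # over the limit
print(memory_over_threshold(total=8192, available=4000))  # healthy
```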

Current utilization by process

Select the Processes tab
Check the ‘show processes from all users’ box in the bottom left corner
Click the ‘Mem Usage’ column header to sort the processes by memory utilization, highest to lowest. This shows which processes are currently using the memory on the box and helps you narrow your search for memory-intensive processes.
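
If you prefer the command line, the same sorted view can be approximated from the output of tasklist /fo csv /nh. The sketch below is an illustration, not a replacement for Task Manager. It assumes the usual five-column CSV layout (image name, PID, session name, session number, memory usage such as "45,120 K"), and the function name is hypothetical.

```python
import csv
import io


def parse_tasklist_csv(text: str):
    """Parse `tasklist /fo csv` output into (name, pid, mem_kb) tuples,
    sorted by memory usage, highest first."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) < 5:
            continue  # blank or malformed line
        name, pid, mem = fields[0], fields[1], fields[4]
        # Keep only the digits of values like "1,204,332 K".
        digits = "".join(ch for ch in mem if ch.isdigit())
        if not pid.isdigit() or not digits:
            continue  # header row or unexpected content
        rows.append((name, int(pid), int(digits)))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Sample text in the assumed layout; on a real server you would capture it with
# subprocess.run(["tasklist", "/fo", "csv", "/nh"], capture_output=True, text=True).stdout
SAMPLE = (
    '"System Idle Process","0","Services","0","24 K"\n'
    '"sqlservr.exe","1824","Services","0","1,204,332 K"\n'
    '"explorer.exe","3120","Console","1","45,120 K"\n'
)
for name, pid, mem_kb in parse_tasklist_csv(SAMPLE):
    print(f"{name:25} {pid:6} {mem_kb:>12} K")
```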

Network

Check NIC HW

Verify both ends of the network cable are securely seated in the port
On the back of the server, verify you have a green blinking link light on the NIC port
Verify the NIC HW is working properly in Device Manager and ensure the active NICs show no errors
Verify gateway, IP, subnet mask, DNS, DNS suffixes, etc. are properly configured.
If everything is properly configured and HW is working, you should be able to get a ping response from the gateway.

Check Network Connections
Here are some other checks you should perform to ensure proper network connectivity:

ipconfig /all displays all your TCP/IP settings, including your MAC address
ipconfig /flushdns flushes your DNS resolver cache
ipconfig /displaydns displays what is in your DNS resolver cache
netstat -an shows all the connections and ports on a machine
nbtstat shows NetBIOS over TCP/IP connection statistics
tracert <IP or DNS Name> shows the path the packet takes, the routers along it, and the response time for each hop
pathping <IP or DNS Name> combines ping and tracert. By default it pings each hop 100 times, which makes it great for testing WAN connectivity
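
The output of netstat -an can be long on a busy server, so a quick summary by connection state is often easier to read. The following Python sketch counts TCP states from captured output. It assumes the usual four-column layout (Proto, Local Address, Foreign Address, State); the function name and the sample addresses are made up for illustration.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state (LISTENING, ESTABLISHED, ...)."""
    states = Counter()
    for line in netstat_output.splitlines():
        parts = line.split()
        # TCP rows have four columns; UDP rows carry no state column.
        if len(parts) >= 4 and parts[0].upper() == "TCP":
            states[parts[3]] += 1
    return states


SAMPLE = """
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  TCP    10.0.0.5:3389          10.0.0.20:51512        ESTABLISHED
  TCP    10.0.0.5:445           10.0.0.21:49822        ESTABLISHED
  UDP    0.0.0.0:123            *:*
"""
for state, count in count_tcp_states(SAMPLE).most_common():
    print(state, count)
```

An unusually high count of connections stuck in one state is a hint to look closer, not a diagnosis on its own.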

Disk Space

All kinds of bad stuff can happen when a disk fills up. The best way to stay ahead of this is to write a script that notifies you when free space crosses a certain threshold. In a future post I’ll share a method for doing just that. However, if there is a problem and you need to perform a health check now, here is how to check the space the old-fashioned way.

To check disk space manually:

Right-click My Computer
Select Manage
Select Disk Management
Validate that each disk has more than 10 percent free space
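
The 10 percent check is easy to express in code as well. This is only a rough Python sketch of the idea, not the notification script promised above. It uses shutil.disk_usage from the standard library; the function names and the example drive letters are assumptions.

```python
import shutil


def percent_free(total_bytes: int, free_bytes: int) -> float:
    """Free space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * free_bytes / total_bytes


def low_space_volumes(paths, min_free_percent: float = 10.0):
    """Return (path, percent_free) for every volume below the threshold."""
    low = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        pct = percent_free(usage.total, usage.free)
        if pct < min_free_percent:
            low.append((path, round(pct, 1)))
    return low


# On a server you might call: low_space_volumes(["C:\\", "D:\\"])
print(percent_free(total_bytes=500, free_bytes=50))  # exactly at the 10% line
```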

Event Logs

Event logs give a more historical perspective on what is going on with the system and applications. When troubleshooting, query either the System or the Application log and look for events with a timestamp near the time of the issue you are investigating.

Events fall into 3 categories in Event Viewer:

Information: Noted with a white icon and the letter ‘i’. Successful operations are logged as informational. Usually not used in troubleshooting problems or failures.
Warning: Noted with a yellow icon and exclamation point. These are worth looking up because they often predict future failures, such as disk space running low, DHCP IP address lease renewal failures, etc.
Error: Noted with a red circle icon and ‘x’. These indicate that something has failed outright and are a good starting point for troubleshooting.

When looking at event logs, use the information to determine the following:

Is the incident tied to a particular time or outage incident?
Is this a one-off, or has this particular error occurred multiple times in the past?
Does this error appear on other systems or is it unique to the system that has failed?
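
The first two questions can be sketched in Python once the events have been exported from Event Viewer. The record layout here (a dict with time, level and event_id keys) is a hypothetical one chosen for the example, as are the function names and the 15-minute window.

```python
from collections import Counter
from datetime import datetime, timedelta


def events_near(events, incident_time, window_minutes=15):
    """Warnings and errors logged within `window_minutes` of the incident."""
    window = timedelta(minutes=window_minutes)
    return [e for e in events
            if e["level"] in ("Warning", "Error")
            and abs(e["time"] - incident_time) <= window]


def recurring_error_ids(events):
    """How many times each error event ID appears: one-off or repeat?"""
    return Counter(e["event_id"] for e in events if e["level"] == "Error")


# Made-up sample records for illustration only.
events = [
    {"time": datetime(2012, 3, 20, 6, 50), "level": "Error", "event_id": 7031},
    {"time": datetime(2012, 3, 20, 6, 55), "level": "Information", "event_id": 7036},
    {"time": datetime(2012, 3, 19, 2, 10), "level": "Error", "event_id": 7031},
    {"time": datetime(2012, 3, 20, 7, 5), "level": "Warning", "event_id": 2013},
]
incident = datetime(2012, 3, 20, 7, 0)
print(len(events_near(events, incident)))
print(recurring_error_ids(events))
```

Running the same two functions against an export from a healthy peer server helps answer the third question.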

Services

Troubleshooting services should be limited to the specific services affected by the problem being troubleshot. Each server runs specific services depending on the applications it hosts. You should document how your servers’ services are configured and compare that baseline to the server in question to see whether anything is misconfigured.
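
Comparing a documented configuration against the current one is a simple dictionary diff. Here is a minimal Python sketch, assuming you keep the baseline as a mapping of service name to (startup type, state); that layout, the function name and the sample service entries are all assumptions for the example.

```python
def service_drift(baseline: dict, actual: dict) -> dict:
    """Return {service: (expected, found)} for every baseline service whose
    current configuration differs or which is missing entirely."""
    drift = {}
    for name, expected in baseline.items():
        found = actual.get(name, "MISSING")
        if found != expected:
            drift[name] = (expected, found)
    return drift


baseline = {
    "W3SVC": ("Automatic", "Running"),
    "MSSQLSERVER": ("Automatic", "Running"),
    "Spooler": ("Disabled", "Stopped"),
}
actual = {
    "W3SVC": ("Automatic", "Stopped"),   # should be running
    "Spooler": ("Disabled", "Stopped"),  # matches the baseline
}
for name, (expected, found) in service_drift(baseline, actual).items():
    print(f"{name}: expected {expected}, found {found}")
```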

Cluster

Servers that host applications and services that require high availability should be clustered so that if one node fails the other can pick up the workload. Clustered servers need the same health checks as stand-alone systems, and you will also want to check the health of the cluster itself.

Check Cluster Resource Status

Open Cluster Administrator: Log onto server, select Start –> Run –> cluadmin
Check the Resources and ensure all are Online
If Cluster Administrator does not open, ensure that the Cluster Service is running on the node.
Cluster resource status can also be checked from a remote server. From a command prompt, type: cluster <cluster name> res

Client Side Health

Right click on My Computer, select Manage
Open Device Manager
Drill down to SCSI and RAID Controllers, verify that the HBA HW is visible and does not show any errors
If it does not show up in Device Manager, you may need to re-scan for the HW, re-seat the fiber card, or re-install the driver.
If the HBA is showing healthy in Device Manager, open the tool that you use to view configuration and settings for the fiber card and verify there aren’t any transmit/receive errors on link statistics or counters

Switch Health

Make sure fiber is properly connected to each switch
Make sure switch has no errors
If you’re using zoning, verify it is properly configured

Check Fiber and SAN Connectivity

Log onto the SAN appliance and verify that the SAN is in good general health and that no major errors are present for the controllers, loops, switches, or ports.
Ensure that the LUNs are presented to the servers in the cluster

NLBS

Some applications will require you to spread the load across multiple servers. Web servers are a very popular choice to network load balance. As with clusters we will need to check the status of the load balancing.

Check NLBS Status CMD Line

From a command prompt on the local system, run ‘wlbs query’. This gives you the convergence status of the local node with the NLBS cluster.
Other useful NLBS commands: wlbs stop (stops NLBS), wlbs start (starts NLBS), wlbs drainstop (drains the node)

Check NLBS Configurations

Open up the network properties –> Network Load Balancing, right click & select Properties
On the Cluster Parameters tab, verify that the IP address is configured for the shared NLBS IP and that the subnet mask, domain, and operation mode are configured correctly.
On the Host Parameters tab, make sure each node of the cluster has a unique host identifier. Also verify the IP and subnet mask are configured for the local values.
Also make sure that your switch has a static ARP entry if you are using multicast NLBS. The entry should be the virtual MAC of the cluster. To get the virtual MAC of the cluster, run the following command: WLBS IP2MAC <virtual IP address>
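
If you need to prepare the static ARP entry before you can reach a node, the virtual MAC can also be derived from the cluster IP. The Python sketch below follows the addressing scheme Microsoft documents for NLB (02-BF plus the four IP octets for unicast, 03-BF plus the four octets for multicast, 01-00-5E-7F plus the last two octets for IGMP multicast). The function name is hypothetical, and the output of WLBS IP2MAC on the server remains the authoritative value.

```python
def nlb_virtual_mac(cluster_ip: str, mode: str = "multicast") -> str:
    """Derive the NLB cluster MAC address from the virtual IP address."""
    octets = [int(part) for part in cluster_ip.split(".")]
    if len(octets) != 4 or any(o < 0 or o > 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {cluster_ip!r}")
    if mode == "unicast":
        mac = [0x02, 0xBF] + octets
    elif mode == "multicast":
        mac = [0x03, 0xBF] + octets
    elif mode == "igmp":
        mac = [0x01, 0x00, 0x5E, 0x7F] + octets[2:]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return "-".join(f"{byte:02X}" for byte in mac)


# 192.168.1.10 is a made-up cluster IP for the example.
print(nlb_virtual_mac("192.168.1.10", "multicast"))  # 03-BF-C0-A8-01-0A
```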

Name Resolution

To health-check name resolution, open a command prompt and enter the following:

nslookup <servername>

Verify that the servername is correctly entered in DNS

If a record does not show up in the DNS query, or maps to a different name, perform a reverse lookup by IP address to see what name is associated with the IP address:

nslookup <IP address>

If no name shows up associated with the IP address, log into the domain controller and check the DNS records for this particular name/ip address

From a Domain Controller go to start–>run–>dnsmgmt.msc
Expand the Forward Lookup Zones
Expand the primary zone that holds the records for the system(s) you are troubleshooting

Validate that the record exists. If it does not, manually add the record by right-clicking this same zone:

Select New Host (A)
Enter the name and IP address
Check the box next to Create associated pointer (PTR) record
Click Add Host

Additionally, log back into the node whose record you manually entered and ensure that it is registering in DNS:

Right-click the My Network Places icon on the desktop and select Properties
Double-click the primary adapter
Select Properties
Highlight Internet Protocol (TCP/IP) and select Properties
Validate the IP addresses of the DNS servers are correct
Select Advanced
Select DNS tab
Make sure the box is checked next to Register this connection’s address in DNS
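
The forward and reverse nslookup checks from the start of this section can also be scripted. This Python sketch uses socket.gethostbyname and socket.gethostbyaddr from the standard library, which go through the operating system resolver (so the hosts file and cache are consulted too, unlike nslookup, which queries the DNS server directly). The function name and the returned dict layout are hypothetical, and the resolver functions are passed as parameters only so the logic can be tried without a live DNS server.

```python
import socket


def check_name_resolution(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve a name to an IP, resolve the IP back to a name, and report
    whether the two host names agree (short names, case-insensitive)."""
    result = {"hostname": hostname, "ip": None,
              "reverse_name": None, "consistent": False}
    try:
        result["ip"] = forward(hostname)
    except OSError:          # no forward (A) record found
        return result
    try:
        result["reverse_name"] = reverse(result["ip"])[0]
    except OSError:          # no reverse (PTR) record found
        return result

    def short(name):
        return name.split(".")[0].lower()

    result["consistent"] = short(result["reverse_name"]) == short(hostname)
    return result


# Stand-in resolvers with made-up data, so the example runs anywhere.
def fake_forward(name):
    return "10.0.0.5"


def fake_reverse(ip):
    return ("web01.corp.example", [], [ip])


print(check_name_resolution("WEB01", forward=fake_forward, reverse=fake_reverse))
```

On a real server, call check_name_resolution("<servername>") with the defaults. A result with an IP but no reverse name points at a missing PTR record.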


Posted by EP. Rajesh on March 20, 2012 in Windows

