July 23rd, 2012 12:00

I need some help to interpret the symstat output

Hello All,

I have a ticket to investigate why a Sun server running an Oracle database is performing poorly. The server uses VMAX LUNs.

The disks are configured as RAID 5 (7+1).

The only tool I have to troubleshoot this issue is the symstat command.

Below is the output for one of the LUNs (the trend is the same on the other LUNs used by this DB).

                         IO/sec        KB/sec     % Hits       %Seq  Num WP
12:30:14             READ WRITE    READ WRITE    RD  WRT    RD  WRT  Tracks
146D (Not Visible )     0     0       0     0   N/A  N/A   N/A  N/A      35
146D (Not Visible )     0     0       0     0   N/A  N/A   N/A  N/A       4
146D (Not Visible )     2     2       6  1029   100  100     0    0       4
146D (Not Visible )     3     8       5   402    67  100     0    0       9
146D (Not Visible )     0     5       0    23   N/A  100   N/A    0       7
146D (Not Visible )     0    11       0    46   N/A  100   N/A    0      14
146D (Not Visible )     0     0       0     0   N/A  N/A   N/A  N/A      16
146D (Not Visible )     0     0       0     0   N/A  N/A   N/A  N/A      80
146D (Not Visible )     0     5       0   298   N/A  100   N/A    0       6
146D (Not Visible )     0     4       0  1594   N/A  100   N/A    0     178
146D (Not Visible )     0     5       0  1411   N/A  100   N/A    0       3
146D (Not Visible )     0     0       0     0   N/A  N/A   N/A  N/A      48
146D (Not Visible )     0     5       0   571   N/A  100   N/A    0      17
146D (Not Visible )     0     2       0  1582   N/A  100   N/A    0      32
146D (Not Visible )     0     5       1  1315   N/A  100   N/A    0       8
146D (Not Visible )     0    11       0   692   N/A  100   N/A    0       8
146D (Not Visible )     0    11       0  1709   N/A  100   N/A    0      20

The trend is the same for every LUN: as you can see, the write KB/sec is high while the write IO/sec is low.

Based on this info, could anybody give me a hint? I would appreciate any suggestions on what else I can do.

I understand it is not easy to reach a final conclusion this way, but any help is welcome.

Thank you

79 Posts

July 23rd, 2012 12:00

Alberto:

I am not too familiar with Sun Solaris, but looking at the symstat output, I can say that it is insufficient to analyze the issue. Try to get the %busy, average wait time, and average service time on the host. I am sure the "sar" output should be able to give you that.

Take a look at the performance of all the LUNs presented to the host and see if there is a pattern.

Also, if you have Symmetrix Performance Analyzer installed, it would help you identify the TDEVs/devices with high I/O.

Kennedy.

1.3K Posts

July 23rd, 2012 14:00

Small IOs/sec and large KB/sec means large IO size.  Write IOs larger than 64k won't increase throughput and only increase response time.  If these are log writes, consider re-striping the log devices to 64k, which should significantly decrease the write response time (I assume the complaint is write response time, but you didn't say).
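
To put rough numbers on that, here is a minimal sketch that divides write KB/sec by write IO/sec for each symstat interval posted above. The sample pairs are copied from that output; the function and variable names are only illustrative, and because symstat reports whole-number rates per interval, the results are approximate averages, not exact IO sizes.

```python
# (write IO/sec, write KB/sec) pairs copied from the symstat samples
# in the original post, skipping intervals with no writes.
WRITE_SAMPLES = [
    (2, 1029), (8, 402), (5, 23), (11, 46), (5, 298), (4, 1594),
    (5, 1411), (5, 571), (2, 1582), (5, 1315), (11, 692), (11, 1709),
]


def avg_io_size_kb(io_per_sec, kb_per_sec):
    """Average IO size in KB for one interval, or None if there was no IO."""
    if io_per_sec == 0:
        return None
    return kb_per_sec / io_per_sec


sizes = [avg_io_size_kb(io, kb) for io, kb in WRITE_SAMPLES]
large = [s for s in sizes if s > 64]  # 64 KB threshold mentioned in this reply

for (io, kb), size in zip(WRITE_SAMPLES, sizes):
    print(f"{io:>3} IO/s  {kb:>5} KB/s  -> ~{size:6.1f} KB per write")
print(f"{len(large)} of {len(sizes)} intervals average more than 64 KB per write")
```

Intervals that land well above 64 KB per write are the ones this reply is describing.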

32 Posts

July 23rd, 2012 15:00

OK, thanks for your help. I will open an SR with EMC and let you know what happens.

32 Posts

July 23rd, 2012 15:00

That is correct, Quincy: the issue is write response time. This is the Oracle recommendation:

The AvWrTm (average write time) is huge here, as high as 93480 (ms), that means 93sec to complete a write IO request. This is very unacceptable at IO layer. I can not see much from v$system_event, it does not seem recovery is happening.

ACTION PLAN

--------

Please engage your system admin to check disk IO and storage configuration, find out why IO is so slow

Could you please answer the following questions?

What would be the procedure to re-stripe the log devices to 64k? How can I see what stripe size the log files currently have?

To fix this issue, do we have to make a change like the one you suggest, or could the Oracle admin change something in the DB configuration instead?

Appreciate your help

Martin,

1.3K Posts

July 23rd, 2012 15:00

Sounds like you need to open a service request. There is something wrong.  Even with very large IOs and a large queue, you should not have multi second write times.

32 Posts

July 23rd, 2012 16:00

Hi Quincy ,

I just realized that the server also has LUNs from a CX4, and those LUNs are having the same issue. Any other suggestions?

Thanks

1.3K Posts

July 23rd, 2012 16:00

So same answer: it sounds like something is broken and needs fixing, but it could be something in the host, or it could even be a reporting error.

859 Posts

July 23rd, 2012 19:00

Hi Alberto,

As Quincy said, it could be something on the host. Make sure the HBA drivers, firmware, and required flags are set as per the EMC support matrix.

regards,

Saurabh
