
July 16th, 2015 10:00

7.2 frequent multiscans

Anyone else seeing more frequent multiscan jobs under 7.2? My primary cluster seems to start one whenever a drive is replaced, which is definitely a change of behavior from previous versions. I'm wondering if this is intentional or just a new piece of (minor) weirdness.

130 Posts

July 16th, 2015 12:00

Hello carlilek,

The policy for MultiScan does not appear to have changed recently based on the last update to our Job Engine White Paper:

https://support.emc.com/docu51125_White-Paper:-Isilon-OneFS-Job-Engine.pdf?language=en_US

According to this document, the job runs on any group change that adds a device, and a drive add meets that condition. MultiScan will also be scheduled if the cluster is out of balance by more than 5%, or if there has not been a successful MultiScan in 14 days (per the AutoBalance and Collect run policies).
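
In rough pseudo-Python, those triggers boil down to something like the following; this is only an illustration of the white paper wording, not OneFS source:

from datetime import datetime, timedelta

# Illustration of the MultiScan triggers described above -- not OneFS code.
IMBALANCE_THRESHOLD_PCT = 5.0       # "out of balance by more than 5%"
MAX_DAYS_WITHOUT_SUCCESS = 14       # "no successful MultiScan in 14 days"

def should_schedule_multiscan(device_added, imbalance_pct, last_successful_run, now):
    """Return True if any of the documented conditions is met."""
    if device_added:                                # group change that adds a drive or node
        return True
    if imbalance_pct > IMBALANCE_THRESHOLD_PCT:     # cluster out of balance
        return True
    return now - last_successful_run > timedelta(days=MAX_DAYS_WITHOUT_SUCCESS)

# Example: no device added, 2% imbalance, last success 20 days ago -> True
print(should_schedule_multiscan(False, 2.0, datetime(2015, 6, 26), datetime(2015, 7, 16)))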

So far I have not been able to find any related change in our tracking system; however, it is not out of the realm of possibility that this behavior was noticed to be working less well than it could and was fixed along with other Job Engine improvements.

205 Posts

July 19th, 2015 08:00

OK.... now it's started another one out of the blue. No drive replacements, and it's not out of balance:

dm11-29# isi storagepool list
Name          Nodes                                      Requested Protection  HDD         Total       %       SSD       Total     %
------------------------------------------------------------------------------------------------------------------------------------
perftier      1-7,10,18,26-28,31-32,35-37,39-40,45,64-69 -                     134.5374T   308.5540T   43.60%  6.6928T   43.5456T  15.37%
- S200s       10,18,31-32,35-37,64-66                    +2n                   51.2356T    118.8580T   43.11%  642.189G  3.5883T   17.48%
- S200-bigssd 26-28,39-40,45,67-69                       +2n                   37.5507T    87.5227T    42.90%  1.6810T   9.6885T   17.35%
- S210s       1-7                                        +2n                   45.7512T    102.1733T   44.78%  4.3847T   30.2688T  14.49%
nltier        8-9,11-17,19-25,29-30,34,41-42,48-55,70-74 -                     3.07385P    3.89976P    78.82%  0b        0b        0.00%
- NL400s_4tb  8-9,11-15,17,19-20,29-30,34,41-42,48,70-74 +3n                   2.07641P    2.66333P    77.96%  0b        0b        0.00%
- NL400s_3tb  16,21-25,49-55                             +2n                   1021.3753T  1.23643P    80.67%  0b        0b        0.00%
------------------------------------------------------------------------------------------------------------------------------------
Total: 7                                                                       3.20523P    4.20108P    289.42% 6.6928T   43.5456T  49.31%
dm11-29#

Any ideas?

1.2K Posts

July 19th, 2015 18:00

Have you checked the balance within each pool, on node and disk level?

isi status

isi statistics drive --node all --long

Do those frequent MultiScan jobs finish successfully, fail, or get "system cancelled"?

Cheers

-- Peter

5 Practitioner • 274.2K Posts

July 20th, 2015 03:00

You need to check at the pool level as well, as per peter_sero. This would give some insight.

Thanks,

Kiran.

205 Posts

July 20th, 2015 05:00

Hi guys,

Unfortunately, running that command on a cluster with >1800 drives is probably not going to be particularly informative... but that does give me some idea where to go.

The multiscan does finish successfully. The only other clue I have is that I'm running a media scan now, which does not happen with any sort of frequency on this cluster.

205 Posts

July 20th, 2015 07:00

Well, the drives match up with the disk pools for the S210s, at least:

Drive Type Used Inodes
1:07 SAS 43.9 5.9K
1:08 SAS 43.9 5.9K
1:09 SAS 43.9 6.1K
1:10 SAS 43.9 6.0K
1:11 SAS 44.1 6.1K
1:12 SAS 44 5.9K
1:13 SAS 44.8 6.7K
1:14 SAS 44.8 6.5K
1:15 SAS 44.7 6.6K
1:16 SAS 44.7 6.4K
1:17 SAS 44.7 6.7K
1:18 SAS 44.8 6.5K
1:19 SAS 47.9 6.5K
1:20 SAS 47.9 6.6K
1:21 SAS 47.9 6.5K
1:22 SAS 47.9 6.5K
1:23 SAS 47.9 6.5K
1:24 SAS 47.9 6.5K
2:07 SAS 44.9 6.0K
2:08 SAS 44.8 6.0K
2:09 SAS 44.8 6.0K
2:10 SAS 44.9 6.0K
2:11 SAS 44.8 5.9K
2:12 SAS 44.9 6.0K
2:13 SAS 44.7 6.4K
2:14 SAS 44.8 6.3K
2:15 SAS 44.7 6.4K
2:16 SAS 44.8 6.4K
2:17 SAS 44.7 6.3K
2:18 SAS 44.7 6.4K
2:19 SAS 48.2 6.2K
2:20 SAS 48.2 6.4K
2:21 SAS 48.2 6.2K
2:22 SAS 48.3 6.2K
2:23 SAS 48.2 6.3K
2:24 SAS 48.2 6.3K
3:07 SAS 44.1 5.7K
3:08 SAS 44 5.8K
3:09 SAS 44 5.8K
3:10 SAS 44.1 5.8K
3:11 SAS 44 5.7K
3:12 SAS 44.1 5.8K
3:13 SAS 44.8 6.4K
3:14 SAS 44.9 6.3K
3:15 SAS 44.8 6.3K
3:16 SAS 44.8 6.3K
3:17 SAS 44.9 6.3K
3:18 SAS 44.9 6.5K
3:19 SAS 47.9 6.3K
3:20 SAS 47.9 6.3K
3:21 SAS 47.8 6.2K
3:22 SAS 47.8 6.2K
3:23 SAS 48 6.3K
3:24 SAS 47.9 6.3K
4:07 SAS 44.4 5.9K
4:08 SAS 44.4 5.7K
4:09 SAS 44.4 5.7K
4:10 SAS 44.4 5.9K
4:11 SAS 44.4 5.8K
4:12 SAS 44.4 5.9K
4:13 SAS 44.8 6.4K
4:14 SAS 44.7 6.6K
4:15 SAS 44.7 6.5K
4:16 SAS 44.8 6.6K
4:17 SAS 44.7 6.7K
4:18 SAS 44.7 6.6K
4:19 SAS 48.1 6.4K
4:20 SAS 48 6.4K
4:21 SAS 48.1 6.2K
4:22 SAS 48.1 6.4K
4:23 SAS 48.1 6.1K
4:24 SAS 48 6.1K
5:07 SAS 43.7 5.9K
5:08 SAS 43.7 5.8K
5:09 SAS 43.8 5.7K
5:10 SAS 43.7 5.8K
5:11 SAS 43.8 5.7K
5:12 SAS 43.7 6.0K
5:13 SAS 44.7 6.3K
5:14 SAS 44.8 6.2K
5:15 SAS 44.7 6.2K
5:16 SAS 44.7 6.4K
5:17 SAS 44.7 6.4K
5:18 SAS 44.7 6.3K
5:19 SAS 48.1 6.4K
5:20 SAS 48.1 6.5K
5:21 SAS 48.1 6.4K
5:22 SAS 48 6.5K
5:23 SAS 48 6.6K
5:24 SAS 48.1 6.5K
6:07 SAS 44.8 5.8K
6:08 SAS 44.8 5.9K
6:09 SAS 44.9 6.1K
6:10 SAS 44.8 6.1K
6:11 SAS 44.9 5.8K
6:12 SAS 44.9 6.0K
6:13 SAS 44.8 6.6K
6:14 SAS 44.7 6.5K
6:15 SAS 44.7 6.6K
6:16 SAS 44.7 6.7K
6:17 SAS 44.7 6.6K
6:18 SAS 44.7 6.4K
6:19 SAS 48.1 6.3K
6:20 SAS 48.1 6.3K
6:21 SAS 48.1 6.2K
6:22 SAS 48.1 6.2K
6:23 SAS 48 6.4K
6:24 SAS 48.1 6.2K
7:07 SAS 44.6 5.7K
7:08 SAS 44.6 5.8K
7:09 SAS 44.7 5.8K
7:10 SAS 44.7 5.9K
7:11 SAS 44.6 5.9K
7:12 SAS 44.5 5.7K
7:13 SAS 44.7 6.3K
7:14 SAS 44.6 6.2K
7:15 SAS 44.6 6.3K
7:16 SAS 44.6 6.3K
7:17 SAS 44.6 6.4K
7:18 SAS 44.6 6.2K
7:19 SAS 48.2 6.5K
7:20 SAS 48.2 6.2K
7:21 SAS 48.3 6.4K
7:22 SAS 48.2 6.3K
7:23 SAS 48.3 6.3K
7:24 SAS 48.2 6.4K

Why it's like that, only the Shadow knows.

Ah well, I'll take it as nothing to be particularly concerned about, just one more of the weirdnesses of running a six-year-old, 60-node cluster, with none of the original nodes in it, that has been upgraded from 6.x through to 7.2.

1.2K Posts

July 20th, 2015 07:00

Could be it. But even the HDD pools in the S210 nodes strike me: how could one pool be 4% off from the others?

Really look at the individual drives, too.

S210s:152                 152 D    1-7:bay7-12          -       15T /   34T (44%  )

S210s:153                 153 D    1-7:bay13-18         -       15T /   34T (44%  )

S210s:154                 154 D    1-7:bay19-24         -       16T /   34T (48%  )

205 Posts

July 20th, 2015 07:00

;-)

I used a command that I'm theoretically not supposed to know about to see that the disk pools are almost all within balance, except one of the S210 disk pools is >5% more full than some of the S200 disk pools within the same tier. Would that account for it?

Name             Id  Type Members                                              VHS  Used / Size
--------------------------------------------------------------------------------------------------------
perftier         26  T    3,36,151                                             -    142T /  352T (40%)
S200-bigssd      36  G    35,37-39                                             1     40T /   97T (41%)
S200-bigssd:35   35  D    26-28,39-40,45,67-69:bay1-6                          -    1.7T /  9.7T (17%)
S200-bigssd:37   37  D    26-28,39-40,45,67-69:bay7-12                         -     13T /   29T (43%)
S200-bigssd:38   38  D    26-28,39-40,45,67-69:bay13-18                        -     13T /   29T (43%)
S200-bigssd:39   39  D    26-28,39-40,45,67-69:bay19-24                        -     13T /   29T (43%)
S200s            3   G    18-22                                                1     52T /  122T (43%)
S200s:18         18  D    10,18,31-32,35-37,64-66:bay1-2                       -    644G /  3.6T (18%)
S200s:19         19  D    10,18,31-32,35-37,64-66:bay3-7                       -     12T /   27T (44%)
S200s:20         20  D    10,18,31-32,35-37,64-66:bay8-12                      -     12T /   27T (44%)
S200s:21         21  D    10,18,31-32,35-37,64-66:bay13-18                     -     14T /   32T (43%)
S200s:22         22  D    10,18,31-32,35-37,64-66:bay19-24                     -     14T /   32T (43%)
S210s            151 G    150,152-154                                          1     51T /  132T (38%)
S210s:150        150 D    1-7:bay1-6                                           -    4.4T /   30T (15%)
S210s:152        152 D    1-7:bay7-12                                          -     15T /   34T (44%)
S210s:153        153 D    1-7:bay13-18                                         -     15T /   34T (44%)
S210s:154        154 D    1-7:bay19-24                                         -     16T /   34T (48%)
nltier           47  T    41,76                                                -    3.1P /  3.9P (79%)
NL400s_3tb       76  G    75,77-81                                             1    1.0P /  1.2P (81%)
NL400s_3tb:75    75  D    16,21-25,49-55:bay1-6                                -    170T /  211T (81%)
NL400s_3tb:77    77  D    16,21-25,49-55:bay7-12                               -    171T /  211T (81%)
NL400s_3tb:78    78  D    16,21-25,49-55:bay13-18                              -    170T /  211T (81%)
NL400s_3tb:79    79  D    16,21-25,49-55:bay19-24                              -    171T /  211T (81%)
NL400s_3tb:80    80  D    16,21-25,49-55:bay25-30                              -    171T /  211T (81%)
NL400s_3tb:81    81  D    16,21-25,49-55:bay31-36                              -    171T /  211T (81%)
NL400s_4tb       41  G    40,42-46                                             1    2.1P /  2.7P (78%)
NL400s_4tb:40    40  D    8-9,11-15,17,19-20,29-30,34,41-42,48,70-74:bay1-6    -    356T /  455T (78%)
NL400s_4tb:42    42  D    8-9,11-15,17,19-20,29-30,34,41-42,48,70-74:bay7-12   -    356T /  455T (78%)
NL400s_4tb:43    43  D    8-9,11-15,17,19-20,29-30,34,41-42,48,70-74:bay13-18  -    355T /  455T (78%)
NL400s_4tb:44    44  D    8-9,11-15,17,19-20,29-30,34,41-42,48,70-74:bay19-24  -    356T /  455T (78%)
NL400s_4tb:45    45  D    8-9,11-15,17,19-20,29-30,34,41-42,48,70-74:bay25-30  -    357T /  455T (78%)
NL400s_4tb:46    46  D    8-9,11-15,17,19-20,29-30,34,41-42,48,70-74:bay31-36  -    355T /  455T (78%)
--------------------------------------------------------------------------------------------------------

1.2K Posts

July 20th, 2015 07:00

Technically speaking, looking at the inode counts we see a range from 5.7K to 6.7K, which is a relative difference of about 15%!

1.2K Posts

July 20th, 2015 07:00

> Unfortunately, running that command on a cluster with >1800 drives is probably not going to be particularly informative.

Scared of big data?

You can query the drives in a specific nodepool by copy-pasting complex node ranges/lists right from the isi storagepool output:

isi statistics drive --long --node 10,18,31-32,35-37,64-66
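
If even the per-pool listing is more than you want to eyeball, a throwaway summarizer along these lines will boil it down to per-node min/max/spread. It assumes the trimmed Drive/Type/Used/Inodes layout pasted earlier in this thread (adjust the regex if your output differs), and the drives.txt filename in the comment is just an example:

import re
import sys
from collections import defaultdict

# Quick-and-dirty summarizer for a trimmed drive listing in the
# "Drive Type Used Inodes" form pasted earlier (e.g. "1:07 SAS 43.9 5.9K").
# The column layout is an assumption -- adjust the pattern if your output differs.
ROW = re.compile(r"^(\d+):(\d+)\s+\S+\s+([\d.]+)\s+\S+$")

def summarize(lines):
    used_by_node = defaultdict(list)
    for line in lines:
        m = ROW.match(line.strip())
        if m:
            node, used_pct = int(m.group(1)), float(m.group(3))
            used_by_node[node].append(used_pct)
    for node in sorted(used_by_node):
        used = used_by_node[node]
        print(f"node {node:3d}: min {min(used):5.1f}%  max {max(used):5.1f}%  "
              f"spread {max(used) - min(used):4.1f} points")

if __name__ == "__main__":
    summarize(sys.stdin)    # e.g. python summarize_used.py < drives.txt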

It is also possible to renumber nodes so that each nodepool is a simple contiguous range; see the "lnnset" subcommand of isi config.

The MediaScan job should run automatically on the first weekend of each month; please have it checked by support.

Cheers

-- Peter

205 Posts

July 20th, 2015 08:00

But does "out of balance" count inodes, LINs, or total size?

5 Practitioner • 274.2K Posts

July 20th, 2015 08:00

I think the used-capacity percentage on the pools is allowed to vary by +5% or -5%; that much difference is acceptable. It does not count the inodes or LINs.
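
Read that way, the check is roughly a spread comparison of the used-capacity percentages across disk pools. The sketch below is just my illustration of that reading, not how OneFS actually computes it:

# One way to read the +/-5% rule: look at the spread of used-capacity
# percentages across disk pools. Illustration only, not the real OneFS check.
def spread_in_points(used_pct_by_pool):
    values = list(used_pct_by_pool.values())
    return max(values) - min(values)

# S210 HDD disk pools quoted earlier: 4 points apart, inside the tolerance.
print(spread_in_points({"S210s:152": 44.0, "S210s:153": 44.0, "S210s:154": 48.0}))  # 4.0
# Including the bay1-6 pool at 15% blows well past 5 points.
print(spread_in_points({"S210s:150": 15.0, "S210s:152": 44.0, "S210s:154": 48.0}))  # 33.0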

205 Posts

July 20th, 2015 09:00

OK, that's what I thought. If another multiscan comes up, I'll poke around with the disk pool sizes and see if I can figure out why.

93 Posts

July 20th, 2015 10:00

Hi Carlilek,

Might be worth checking /var/log/messages for an unusual number of drive stalls; those cause group changes as well. If so, open an SR.
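
Something along these lines will tally them quickly; the "stall" keyword and the bay pattern are guesses at the log wording, so adjust them to whatever your messages file actually contains:

import re
import sys
from collections import Counter

# Count log lines that mention drive stalls, grouped by whatever bay/device
# identifier appears on the line. Both patterns are assumptions -- the exact
# wording varies between OneFS releases.
STALL = re.compile(r"stall", re.IGNORECASE)
DRIVE = re.compile(r"\b(bay\s*\d+|da\d+)\b", re.IGNORECASE)

def count_stalls(lines):
    counts = Counter()
    for line in lines:
        if STALL.search(line):
            m = DRIVE.search(line)
            counts[m.group(1) if m else "unidentified"] += 1
    return counts

if __name__ == "__main__":
    for drive, n in count_stalls(sys.stdin).most_common():
        print(f"{n:6d}  {drive}")
    # e.g. python count_stalls.py < /var/log/messages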

Cheers,
Matt


5 Practitioner • 274.2K Posts

July 20th, 2015 10:00

S210s:150        150 D    1-7:bay1-6                       -    4.4T /   30T (15%)
S200s:18         18  D    10,18,31-32,35-37,64-66:bay1-2   -    644G /  3.6T (18%)
S200-bigssd:35   35  D    26-28,39-40,45,67-69:bay1-6      -    1.7T /  9.7T (17%)

These three disk pools, one in each node pool, have a much lower used capacity than the rest (15-18% versus roughly 43-48% elsewhere). That is probably why the MultiScan job keeps running: the difference in used capacity between those pools and the others is >5%.
