May 7th, 2015 10:00
AutoBalance job takes a long time
Hi everyone,
I have a cluster with 39 IQ5400s and 5 S200 nodes.
My free space ranges from about 1% to 5%.
The cluster runs OneFS 7.1.0.3.
I have an AutoBalance job running:
Running jobs:
Job                        Impact Pri Policy     Phase Run Time
-------------------------- ------ --- ---------- ----- ----------
FSAnalyze[3444]            Low    6   LOW        1/2   6:25:36
AutoBalance[3420]          Low    4   LOW        2/5   20d 20:21
If I run the report, I see this:
Protele-27# isi job reports view --id=3420
AutoBalance[3420] phase 1 (2015-04-16T15:11:46)
-----------------------------------------------
Elapsed time: 7149 seconds
Errors: 0
Drives: 588
LINs: 81352318
Size: 652005240885749
Eccs: 0
CPU usage: max 56% (dev 18), min 0% (dev 2), avg 10%
Virtual memory size: max 106628K (dev 6), min 105476K (dev 2), avg 105624K
Resident memory size: max 11312K (dev 47), min 10308K (dev 13), avg 10665K
Read: 11049621 ops, 166869344768 bytes (159139.0M)
Write: 22044495 ops, 180588503040 bytes (172222.6M)
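For readers who want the raw byte counts in the report above in more readable units, here is a minimal Python sketch. It assumes the `M` suffix in the report means MiB (2^20 bytes), which matches the figures shown, and that `Size` is a byte count (an assumption, since the report does not state the unit). The variable and function names are illustrative only.

```python
# Values copied from the AutoBalance[3420] phase 1 report in this thread.
ELAPSED_SECONDS = 7149
SIZE_BYTES = 652005240885749   # assumed to be bytes; the report gives no unit
READ_BYTES = 166869344768
WRITE_BYTES = 180588503040

MIB = 2 ** 20
TIB = 2 ** 40


def to_mib(byte_count):
    """Convert a byte count to MiB."""
    return byte_count / MIB


size_tib = SIZE_BYTES / TIB
read_mib_per_s = to_mib(READ_BYTES) / ELAPSED_SECONDS
write_mib_per_s = to_mib(WRITE_BYTES) / ELAPSED_SECONDS

print(f"Size:  {size_tib:.1f} TiB")
print(f"Read:  {to_mib(READ_BYTES):.1f} MiB total, {read_mib_per_s:.2f} MiB/s average")
print(f"Write: {to_mib(WRITE_BYTES):.1f} MiB total, {write_mib_per_s:.2f} MiB/s average")
```

These are averages over phase 1 only; the report says nothing about the phase that is still running.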
If I run the same command again, I see the same number of LINs.
I paused the job and then resumed it, and I still see the same number of LINs.
Is my AutoBalance job running normally?
How many days will the job take to finish?
Do you recommend waiting up to 30 days for the job to finish?
Or do you recommend stopping the job and starting a new one afterwards?
Thanks



swi1
May 7th, 2015 15:00
The AutoBalance job can take a very long time. Basically, the run time depends on the number of files on the Isilon cluster. Give the job more time to run.
mattashton1
May 7th, 2015 15:00
...and up the priority if you can, especially during off hours.
Cheers,
Matt
Peter_Sero
May 8th, 2015 00:00
isi job list -v
gives you the current progress, unlike isi report..., which shows info only for the finished job phases.
Have you considered upgrading to 7.1.1.2 for improved overall performance?
hth
-- Peter
Paconet1
May 8th, 2015 12:00
Hi Peter,
Thanks for your answer.
Let me check that command. I thought that isi job reports view --id=3420
would show the progress of the job.
Paconet1
May 8th, 2015 12:00
Thanks for your answer
Is free space important for the AutoBalance job to complete its work?
Thanks
Paconet1
May 8th, 2015 12:00
Thanks for your answer
I can't raise the priority, because we need the same performance all day.
swi1
May 9th, 2015 23:00
Since you have only 5% free space left, free space does play a role. The job may take more time than expected. Do you plan to buy new nodes or delete files on the cluster?
Paconet1
May 11th, 2015 10:00
Hi,
Not yet; this situation is temporary.
Thanks for your interest.
Paconet1
May 11th, 2015 14:00
Hi
I am monitoring the job. Here is the progress from May 8 to May 11.
May 8
Protele-6# while :; do date; isi job list -v; isi job list -v; isi job list -v ; sleep 300;done
Fri May 8 18:40:38 CDT 2015
ID: 3420
Type: AutoBalance
State: Running
Impact: Low
Policy: LOW
Priority: 4
Phase: 2/5
Start Time: 2015-04-16T13:12:37
Running Time: 22d 1h 50m
Participants: 2, 3, 4, 5, 6, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 21, 22, 23, 26, 27, 28, 29, 31, 32, 33, 34, 35, 36, 37, 38, 39, 43, 44, 45, 46, 47, 48, 49, 50, 51, 55, 56, 57
Progress: Processed 25015300 lins; 0 zombies and 0 errors
Waiting on job ID: -
Description:
May 11
Mon May 11 12:00:15 CDT 2015
ID: 3420
Type: AutoBalance
State: Running
Impact: Low
Policy: LOW
Priority: 4
Phase: 2/5
Start Time: 2015-04-16T13:12:37
Running Time: 24d 12h 2m
Participants: 2, 3, 4, 5, 6, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 21, 22, 23, 26, 27, 28, 29, 31, 32, 33, 34, 35, 36, 37, 38, 39, 43, 44, 45, 46, 47, 48, 49, 50, 51, 55, 56, 57
Progress: Processed 25203039 lins; 0 zombies and 0 errors
Waiting on job ID: -
Description:
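The two samples above are enough to work out an average processing rate. The following is a minimal Python sketch, assuming the `date` line and the `Progress:` line look exactly as pasted in this thread. The helper names (`parse_sample`, `lin_rate`) are hypothetical and are not part of OneFS; each sample is trimmed to the lines the sketch needs.

```python
import re
from datetime import datetime

MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()

# Timestamp and progress lines copied from the two samples in this thread.
SAMPLE_MAY_8 = """\
Fri May 8 18:40:38 CDT 2015
ID: 3420
Phase: 2/5
Progress: Processed 25015300 lins; 0 zombies and 0 errors
"""

SAMPLE_MAY_11 = """\
Mon May 11 12:00:15 CDT 2015
ID: 3420
Phase: 2/5
Progress: Processed 25203039 lins; 0 zombies and 0 errors
"""


def parse_sample(text):
    """Return (timestamp, processed LINs) for one date + job listing sample."""
    lines = text.strip().splitlines()
    # `date` output: weekday, month, day, HH:MM:SS, timezone, year.
    # The timezone token is ignored (assumption: all samples share one zone).
    _, month, day, clock, _, year = lines[0].split()
    hour, minute, second = (int(part) for part in clock.split(":"))
    stamp = datetime(int(year), MONTHS.index(month) + 1, int(day),
                     hour, minute, second)
    match = re.search(r"Progress: Processed (\d+) lins", text)
    if match is None:
        raise ValueError("no 'Progress:' line found in sample")
    return stamp, int(match.group(1))


def lin_rate(earlier, later):
    """Average LINs processed per second between two samples."""
    t0, n0 = parse_sample(earlier)
    t1, n1 = parse_sample(later)
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    return (n1 - n0) / elapsed


rate = lin_rate(SAMPLE_MAY_8, SAMPLE_MAY_11)
print(f"{rate:.2f} LINs/s, about {rate * 86400:.0f} LINs/day")
```

A rate measured this way is only a rough guide: it may change as the job moves through its phases or as the load on the cluster changes, so a straight-line extrapolation to a finish date should be treated with caution.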
Peter_Sero
May 14th, 2015 21:00
That's awfully slow progress...
We don't know the overall load and health condition of your cluster.
I found that the new 7.1 job engine does a very good job in
not slowing down the front-end NAS traffic, so giving a job MEDIUM
if not HIGH impact is mostly a safe thing to do... we have NL nodes, 3TB drives...
HOWEVER you should really be concerned about the free space on your cluster,
as you have suspected already.
Referring to the Isilon performance talk at the recent EMC World,
overall performance degradation can start when the cluster is 90% full and
can become severe from 95%!
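The two thresholds quoted above can be turned into a simple check. This is a minimal Python sketch that takes the 90% and 95% figures at face value as rules of thumb from the talk; the function name and the labels it returns are hypothetical.

```python
def fullness_level(percent_free):
    """Classify a cluster by free space, using the 90% / 95% full thresholds."""
    if not 0 <= percent_free <= 100:
        raise ValueError("percent_free must be between 0 and 100")
    percent_full = 100 - percent_free
    if percent_full >= 95:
        return "severe degradation possible"
    if percent_full >= 90:
        return "degradation possible"
    return "ok"


# 1% and 5% are the free-space figures reported at the start of this thread;
# 10% and 20% are added only for comparison.
for free in (1, 5, 10, 20):
    print(f"{free}% free: {fullness_level(free)}")
```

With only 1% to 5% free, the cluster in this thread sits at or above the 95% mark.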
Cheers
-- Peter
Paconet1
June 22nd, 2015 16:00
Hi, here are the results showing when the job succeeded.
Thanks for your answers.
Recent job results:
Time            Job                        Event
--------------- -------------------------- ------------------------------
06/09 01:21:55  FSAnalyze[3484]            Succeeded (LOW)
06/08 05:07:08  AutoBalance[3420]          Succeeded (LOW)