Ask the Expert: The What, Why and How of Isilon Job Engine
Welcome to the EMC Support Community Ask the Expert conversation. This time we are covering the EMC Isilon Job Engine. Among other topics, our experts will answer your questions about what the Isilon Job Engine is, why it is important to understand, how to tune it, and any issues you have run into with it.
Meet Your Experts:
Technical Account Manager - EMC. Rash provides solutions and support to Isilon customers through proactive account management. His customers break storage boundaries every day in capacity and performance, and that comes with challenges. Before Isilon, Rash worked on many different products in a variety of roles: team lead designing new services; solution architect and installer for EMC block and file storage (VNX, CLARiiON), virtualization platforms (VMware, hypervisors), and Fibre Channel and network protocols; and system administrator for Sun Microsystems and VERITAS products (Cluster, Volume Manager), Linux, and UNIX flavors, including writing shell scripts.
This discussion takes place from May 18th - June 5th. Get ready by bookmarking this page or signing up for e-mail notifications.
Share this event on Twitter or LinkedIn:
>> Join the Ask the Expert: The What, Why and How of Isilon Job Engine http://bit.ly/1EIHONl #EMCATE<<
Peter_Sero
1.2K Posts
0
June 3rd, 2015 08:00
Thank you! (in hope of MultiScanLin...)
Go.Y
287 Posts
0
June 4th, 2015 23:00
Rash,
When will SSDs be balanced within SSD-only disk pools?
As far as I understand, AutoBalance currently balances HDDs and SSDs together, by node total.
rash_vyas1
17 Posts
0
June 6th, 2015 04:00
Hi go.y,
There were some challenges in the 6.5.x code with balancing metadata and inodes on SSDs. In 7.x, Isilon changed the architecture and introduced disk pools, with SSDs forming a separate disk pool. The idea was that this would fix the issue found in 6.5.x, and in most cases it does. However, in a few instances SSDs were found not to be balanced on 7.0.2.4. Some changes were made in the 7.1.1.x code, and I have not seen SSD imbalance on 7.1.1.x.
I would first recommend upgrading to 7.1.1.2 (the latest recommended OneFS code). Alternatively, if you can wait a month or two, you could go with OneFS 7.2.0.2, which is our GA release and a candidate to become target code.
If you still see SSD imbalance, let me know which OneFS version you're running and whether there is an SR open. I will follow up with the engineering team and update this thread.
Rash
EMC | Isilon
Technical Account Manager
RobertoAraujo1
718 Posts
0
June 8th, 2015 11:00
This Ask the Expert event has officially ended, but don't let that stop you from asking more questions. Our SMEs are still welcome to answer and continue the discussion, though they are not required to. This is where we ask our community members to chime in and assist other users if they are able to provide information.
Many thanks to our SMEs who selflessly made themselves available to answer questions. We also appreciate our users for taking part in the discussion and asking so many interesting questions.
ATE events are made for your benefit as members of ECN. If you would like to pitch a topic or suggest Subject Matter Experts, we would like to hear from you. To learn more about what it takes to start an event, please visit our Ask the Expert Program Space on ECN.
Yan_Faubert
117 Posts
0
July 6th, 2015 06:00
I'd like to share my recent experience with balancing usage on SSDs that are used for metadata acceleration.
The cluster is running OneFS version 7.1.0.5. We had to smartfail and re-add some SSDs (for a separate reason), which naturally caused an imbalance in usage. We started an AutoBalanceLin job, and it did rebalance the usage on the SSDs.
Before starting the job we had 5 nodes with this usage:
After letting the job run for about 5 days, we had a much better balance:
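One rough way to put a number on "balance" across nodes is the spread between the least-used and most-used SSD tiers. The following is a minimal sketch, assuming per-node usage percentages are already known. The values are hypothetical placeholders, not measurements from this cluster, and `usage_spread` is an illustrative helper, not an OneFS tool.

```python
def usage_spread(percent_used):
    """Return (lowest, highest, spread) for a list of per-node usage percentages."""
    if not percent_used:
        raise ValueError("need at least one node")
    lowest, highest = min(percent_used), max(percent_used)
    return lowest, highest, highest - lowest


# Hypothetical per-node SSD usage (percent), for illustration only.
before = [12.0, 55.0, 58.0, 61.0, 14.0]
after = [39.0, 41.0, 40.0, 42.0, 38.0]

for label, values in (("before", before), ("after", after)):
    lowest, highest, spread = usage_spread(values)
    print(f"{label}: min={lowest}% max={highest}% spread={spread}%")
```

A smaller spread after the job than before it would indicate that the rebalance had the intended effect.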
Paconet1
68 Posts
0
December 29th, 2016 15:00
What are the main causes of all jobs being paused at the same time, including FlexProtect?
And if you launch one more job, will it automatically be paused as well, including FlexProtect?
Why did the system pause all jobs?
What must I do to get out of that state (all jobs paused)?
How can you run a job in degraded mode?
Thanks
RobertoAraujo1
718 Posts
0
January 3rd, 2017 09:00
Hi Francisco, ideally Rash, if he's still around, may be able to address your questions, but if not, I moved this thread to the Isilon Product community in case there's a community member who can help.
Happy New Year!
rash_vyas1
17 Posts
3
January 3rd, 2017 09:00
What are the main causes of all jobs being paused at the same time, including FlexProtect?
FlexProtect pauses all other jobs unless the job engine has been tweaked.
If the FlexProtect job is also paused, then something is wrong with the job engine: isi_job_d may not be running, one of the nodes may be in read-only mode or down, or the cluster may be unable to reach one of the nodes over the back end (IB). At this stage I would ask you to log a support case and have support work on it. I could write up troubleshooting steps, but I don't know the user's experience level, so it is best for support to fix it.
And if you launch one more job, will it automatically be paused as well, including FlexProtect?
The Isilon job engine is written to give top priority to data integrity, so when a drive or a node is in Smartfail status, OneFS runs FlexProtect to reprotect the data. You could pause the FlexProtect job and run another job by taking the job engine out of "Degraded" mode, but again, at this stage I would ask you to check with support, because you need to know the protection level on the cluster, what is in Smartfail status, and the reason for pausing FlexProtect.
Why did the system pause all jobs?
See the answer to the 1st question.
What must I do to get out of that state (all jobs paused)?
See the answer to the 1st question.
How can you run a job in degraded mode?
You need to take the job engine out of degraded mode. Again, I can't share these commands on a public forum, as changes to the job engine made without proper knowledge could cause other issues. Please log a support case; if there is a good reason, support will make those changes.
Hope this helps!
Paconet1
68 Posts
0
April 28th, 2017 10:00
Thanks for your help, Rash and Nestor
I have another question: I started the FlexProtect, MultiScan, and IntegrityScan jobs. Why do the jobs fail and then start again and again?
Thanks
Paconet1
68 Posts
0
December 30th, 2017 18:00
Thanks, Roberto and Rash, for your answers.
The issue was that I had 3 nodes down while my protection was N+2, so I couldn't run the FlexProtect job.
I needed to put the cluster in degraded mode to run the FlexProtect job; you were right.
Thanks
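As a rough illustration of why three nodes down is a problem at N+2: that protection level is generally described as tolerating up to two simultaneous drive or node failures, so a third failure exceeds it. The sketch below is a deliberately simplified model of that arithmetic, not the logic OneFS or the job engine actually uses, and `exceeds_protection` is a hypothetical helper name.

```python
def exceeds_protection(failed_nodes: int, tolerated_failures: int) -> bool:
    """Return True when more nodes are down than the protection level tolerates.

    tolerated_failures is the M in N+M (assumed here to be 2 for N+2).
    """
    if failed_nodes < 0 or tolerated_failures < 0:
        raise ValueError("counts must be non-negative")
    return failed_nodes > tolerated_failures


# The situation described in the post: 3 nodes down with N+2 protection.
print(exceeds_protection(failed_nodes=3, tolerated_failures=2))
```

In this simplified model, the case from the post falls outside what N+2 covers, which is consistent with the advice earlier in the thread to involve support before changing job engine behavior.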