PowerScale: How to Run the On-Cluster Analysis Tool
Summary: This knowledge base article provides instructions for using the Isilon On-Cluster Analysis (IOCA) tool and for interpreting its results.
Instructions
The Isilon On-Cluster Analysis (IOCA) tool analyzes the health of a running PowerScale cluster and assists* with upgrade planning.
* IOCA is not intended as a replacement for reviewing upgrade-related documentation when preparing for upgrade activities. For more information, see the OneFS upgrade planning and process guide: PowerScale OneFS Upgrade Planning and Process Guide
The latest version of the IOCA tool is available for download from Lightning.
The download is the latest version even if the date shown on the download is older.
After you download the archive and transfer it to your cluster, you must unpack IOCA and IOCA.sha256 with the following command:
# tar -xvf IOCA.tar IOCA IOCA.sha256
To verify the integrity of IOCA, you can use the sha256 file from the tar package above, or you can use a standalone IOCA.sha256 file, which is hosted here for download.
After downloading it, transfer it to the same location as the IOCA script (make sure that it replaces the existing IOCA.sha256).
If you have sha256sum:
# sha256sum -c /home/nyhanj1/IOCA.sha256
./IOCA: OK
Confirm that the result reads OK, as in the output above.
If you do not have sha256sum:
# cat IOCA.sha256
a55c9efcea29776317d3b3ed36c504dcab08d1f945161f6ac6c8bbb315f31bb0 ./IOCA
# sha256 IOCA
SHA256 (IOCA) = a55c9efcea29776317d3b3ed36c504dcab08d1f945161f6ac6c8bbb315f31bb0
Manually verify that both checksums match.
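The manual comparison can also be scripted. The following is a minimal sketch in Python, assuming that IOCA.sha256 uses the `<hash> <path>` layout shown above. The function names `read_expected_hash`, `compute_sha256`, and `verify` are hypothetical and are not part of IOCA.

```python
import hashlib
from pathlib import Path


def read_expected_hash(sha_file):
    """Return the hex digest stored as the first field of a '<hash> <path>' file."""
    return Path(sha_file).read_text().split()[0].lower()


def compute_sha256(path, chunk_size=1 << 20):
    """Hash the file in chunks so it is never loaded into memory all at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(script_path, sha_file):
    """Return True only when the computed digest equals the published one."""
    return compute_sha256(script_path) == read_expected_hash(sha_file)
```

For example, `verify("IOCA", "IOCA.sha256")` should return True when both files sit in the current directory and the script is intact. Any other result means that the download should be repeated before the script is run.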
To run IOCA on a cluster:
- Make sure that you are connected to the correct cluster. Run the following command to display the serial numbers in the cluster, and verify that the serial number from the service request appears in the list:
isi_for_array cat /etc/isilon_serial_number
- Create the /ifs/data/Isilon_Support/ directory if it does not exist, and change to it, by running the following commands:
mkdir -pv /ifs/data/Isilon_Support
cd /ifs/data/Isilon_Support
- Place the latest version of IOCA on the cluster in the /ifs/data/Isilon_Support/ directory:
- If an earlier version is already present, check whether it is the latest by running the following command and comparing the result with the version that is available for download. The version is listed at the top of the health check script.
perl IOCA -v
- Run the IOCA tool, adding any additional arguments. In the following example, the pre-upgrade check includes checks for an upgrade to 9.5.1.0 and, after the health check results, displays a recommended upgrade plan:
perl IOCA -u 9.5.1.0
- Capture the output and record it in the service request.
- Review the health checks for any FAIL or WARN messages that point to potential issues.
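For long reports, the review in the last step can be narrowed with a small filter. The sketch below is in Python and assumes that detail lines are prefixed with `FAIL:`, `WARN:`, or `INFO:`, as in the sample output in this article. The `WARN:` prefix is inferred by analogy and is not shown in the sample, and `find_problems` is a hypothetical helper, not an IOCA feature.

```python
import re

# Assumed line shape: optional indentation, a status word, a colon, then the message.
PROBLEM_LINE = re.compile(r"^\s*(FAIL|WARN):\s*(.*)$")


def find_problems(report_text):
    """Return (status, message) pairs for every FAIL or WARN detail line."""
    problems = []
    for line in report_text.splitlines():
        match = PROBLEM_LINE.match(line)
        if match:
            problems.append((match.group(1), match.group(2).strip()))
    return problems


# Sample text copied from the BMC/CMC example in this article.
sample = """BMC/CMC Hardware Monitoring FAIL
FAIL: Hardware monitoring issues detected on nodes: 2
INFO: 3 nodes have out of date CMC firmware versions: 1-3
INFO: Refer to KB489050 (https://support.emc.com/kb/489050) for details.
"""

for status, message in find_problems(sample):
    print(f"{status}: {message}")
```

The check title line is deliberately skipped because it carries no colon after the status word. Only the detail lines that explain the problem are returned.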
Interpreting the IOCA output:
Each health check item for which an issue was identified usually has an associated knowledge base (KB) article. Because this tool was originally internal-only, some of the articles it references are not yet externally accessible. Work is under way to make these referenced articles available to an external audience.
The following is sample output for a failure in the hardware monitoring of the Baseboard Management Controller (BMC) or Chassis Management Controller (CMC) on a cluster:
BMC/CMC Hardware Monitoring FAIL
FAIL: Hardware monitoring issues detected on nodes: 2
INFO: 3 nodes have out of date CMC firmware versions: 1-3
INFO: Refer to KB489050 (https://support.emc.com/kb/489050) for details.
This example shows a FAIL for node 2, which has hardware monitoring issues. It also includes an INFO message indicating that the CMC firmware versions on nodes 1 through 3 are out of date. Finally, it references a knowledge base article with resolution steps.
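When many checks report at once, it can help to pull the affected nodes and the article references out of the text. The following Python sketch assumes that node lists appear after the final colon as numbers or ranges such as `1-3`, as in the example above. Comma-separated lists are an additional assumption, and `expand_nodes` and `kb_references` are hypothetical helpers.

```python
import re


def expand_nodes(spec):
    """Expand a node list such as '1-3,5' into [1, 2, 3, 5]."""
    nodes = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            nodes.extend(range(int(start), int(end) + 1))
        else:
            nodes.append(int(part))
    return nodes


def kb_references(text):
    """Return the article numbers written as 'KB<digits>' or 'KB <digits>'."""
    return re.findall(r"\bKB\s?(\d+)", text)


line = "INFO: 3 nodes have out of date CMC firmware versions: 1-3"
# Everything after the last colon is treated as the node list.
print(expand_nodes(line.rsplit(":", 1)[1]))
print(kb_references("INFO: Refer to KB489050 (https://support.emc.com/kb/489050) for details."))
```

The pattern is case-sensitive, so the lowercase `kb` inside the URL is not counted a second time.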
Compliance mode
IOCA runs on clusters with compliance mode enabled in the same way as on clusters without compliance mode. It is not necessary to run it with sudo. However, the compadmin user must be the owner of the IOCA script in order to run it.
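The ownership requirement can be checked before the script is run. The following is a minimal Python sketch for a Unix-like system. It assumes that the script was placed at a path such as /ifs/data/Isilon_Support/IOCA, and `owner_uid` and `owned_by` are hypothetical helper names.

```python
import os
import pwd


def owner_uid(path):
    """Return the numeric user ID that owns the file."""
    return os.stat(path).st_uid


def owned_by(path, user="compadmin"):
    """Return True when `user` exists on this system and owns `path`."""
    try:
        expected_uid = pwd.getpwnam(user).pw_uid
    except KeyError:
        # The account does not exist here, so it cannot be the owner.
        return False
    return owner_uid(path) == expected_uid
```

For example, `owned_by("/ifs/data/Isilon_Support/IOCA")` returning False on a compliance mode cluster would suggest that the file owner needs to be changed before the script can be run.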
The following shows the usage for IOCA and the available arguments or filters:
Usage: IOCA [options] [destination OneFS version]
  -d, --debug             Display debugging information
  -e, --extra             Displays extra details as part of each check
  -j, --json              Displays output in JSON format
  -v, --version           Displays current script version
  -h, --help              Displays this help screen
  -r <checkName>, --run=<checkName>
                          Executes only the specified check, can be included multiple times
  -u, --upgradeplan       Includes an upgrade plan after health checks
      --rolling           Provide rolling reboot plans
      --parallel          Provide parallel reboot plans [where supported]
      --simultaneous      Provide simultaneous reboot plans [excludes node firmware]
  -o, --onefs             Supports the following comma separated options [ex. 8.1.2,simultaneous]:
      <version>           Uses the provided destination OneFS version
      simultaneous        Simultaneous OneFS upgrade
      parallel            Parallel OneFS upgrade [requires 8.2.2+]
      rolling             Rolling OneFS upgrade
      exclude-nf          Upgrade plans will combine OneFS + node firmware by default [9.2 feature], this option disables that
  -p, --patches           Supports the following comma separated options [ex. none,simultaneous]:
      none                Opt out of patch recommendations
      simultaneous        Simultaneous patch installs
      parallel            Parallel patch installs [requires 9.1+]
      rolling             Rolling patch installs
  -nf, --node-firmware    Supports the following comma separated options [ex. 10.3.3,parallel]:
      <version>           Uses the provided version for node firmware checks
      none                Opt out of node firmware recommendations
      simultaneous        Simultaneous node firmware updates [requires 8.2+]
      parallel            Parallel node firmware updates [requires 8.2+]
      rolling             Rolling node firmware updates
  -df, --drive-firmware   Supports the following comma separated options [ex. 1.32]:
      <version>           Uses the provided version for drive firmware checks
      none                Opt out of drive firmware recommendations
  -vf, --verify-files     Runs checks on files within specified location [ex. /ifs/data/] of certain file type [ex. .isi, .tgz, .tar.gz, .tar]
      <file location>     default location is /ifs/data/ - specify the location where the upgrade files were placed
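When IOCA is called from a script, the argument list can be assembled programmatically. The sketch below is in Python and uses only options that appear in the usage text: `-j`, `-e`, `--run=<checkName>`, and `-u` followed by the destination version, as in the `perl IOCA -u 9.5.1.0` example earlier in this article. The function `build_ioca_command` and its parameter names are hypothetical.

```python
def build_ioca_command(target_version=None, upgrade_plan=False,
                       checks=(), json_output=False, extra=False):
    """Assemble an argument list for IOCA from options in the usage text."""
    cmd = ["perl", "IOCA"]
    if json_output:
        cmd.append("-j")
    if extra:
        cmd.append("-e")
    # --run may be given several times, once per check name.
    for name in checks:
        cmd.append(f"--run={name}")
    if upgrade_plan:
        cmd.append("-u")
    if target_version:
        # The destination OneFS version is the trailing positional argument.
        cmd.append(target_version)
    return cmd


print(build_ioca_command(target_version="9.5.1.0", upgrade_plan=True))
print(build_ioca_command(checks=["checkHealth", "checkUptime"], extra=True))
```

The returned list can be passed to `subprocess.run` from the directory that holds the script. Building a list rather than a single string avoids shell quoting problems.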
Additional information
Table of the individual checks and the name to use when running the command with:
--run=CHECK
The IOCA script is updated frequently. To view a current, complete list of checks, update to the latest IOCA version on the Isilon cluster and run the following command:
onefs94-a-1# perl IOCA --run=CHECK
Isilon On-Cluster Analysis 0.1541
Requested checks, check, was not recognized. Available checks are:
checkA100Root    Checks if A100 node root mirrors need to be resized to 2GB
checkAPIAuth    Checks for API auth set to basic when upgrading from 8.1.2.0 to 9.2 or higher
checkAccessZones    Checks if multiple access zones are configured for upgrades across 7.1.1. Checks for nested or overlapping SMB shares. Adds warnings when any non-System access zone pools are found, only applies for upgrades across OneFS 7.1.1
checkAggregationMode    Checks that the aggregation mode is not Legacy FEC mode for upgrades to OneFS 8+
checkAspera    Checks if any Aspera services are enabled. If performing OneFS upgrades, it must be reinstalled post-upgrade
checkAuthStatus    Checks authentication status on each node. Warns if any authentication providers are not online or active. Checks for RFC2307 and Auto-locate of GID/UIDs and points to KB 000028577
checkBBUDegCap    Checks for the level of BBU degradation on Gen6 nodes and flags any nodes with excessive degradation which would increase the risk of the node going into an RO state.
checkBMCandCMC    Checks for BMC/CMC related issues
checkBXENodes    Checks for existence of nodes with BXE interfaces, also checks for KB 000048172 & KB 000064027 known issues
checkBootDisks    Checks boot disk wear life remaining, firmware level, and historic error count
checkCM6FWBug    Checks drive firmware versions for the criteria of FCO F022318EE
checkCMOSTimeCentury    Check if the century configured in CMOS time matches the current century
checkCapacity    Verifies cluster capacity based on documented numbers the Upgrade Planning and Progress Guide. Warns if approaching
checkCloudPools    Checks for CloudPools related issues
checkConfCmtSyntax    Checks sysctl.conf for comments without leading # symbol that would pose an issue in parsing the conf file.
checkContact    Displays contact information configured in CELog when run with the --extra argument
checkCoreDumps    Checks for recent unexpected process restarts reported in /var/log/messages
checkDTA000194434    Checks for criteria of KB 000194434
checkDestinationOneFS    Checks destination OneFS version
checkDiskpools    Checks diskpools and class equivalence for OneFS upgrades going across 7.0
checkDriveFirmware    Checks for out-of-date Drive Firmware and calls other related drive firmware checks
checkDriveLoad    Checks the current load on the drives
checkDriveStallTimeout    Checks current Drive Stall Timeout setting, recommended value is 3.5 seconds (3500000 microseconds) or higher
checkDriveSupportPackage    Checks for drive firmware updates available in the Drive Support Package
checkDrivesHealth    Checks health of drives and the drive stall timeout setting in sysctl
checkET004252    Checks for criteria of ET004252
checkETAs    Checks for Technical Advisories
checkEmailSettings    Displays E-mail settings configured in CELog when run with the --extra argument
checkEncoding    Checks exports and cluster configuration for if utf-8/default encoding
checkEvents    Checks events on all the nodes, failure if any critical events exist
checkFCOF022318EE    Checks drive firmware versions for the criteria of FCO F022318EE
checkFCOF031617FC    Checks drive firmware versions for the criteria of KB 000024620
checkFCOF042415EE    Checks the cluster to see if it meets criteria for FCO F042415EE/KB 000051631
checkFileSharing    Checks if Atime is enabled
checkFilepoolPolicies    Checks GNA requirements and checks filepools for final match being set and names starting with a number
checkFirmwarePackages    In OneFS 9.1 and later, confirms firmware packages are available
checkFlush    Checks for running flush processes / active pre_flush screen sessions on clusters
checkGatewayPriority    Checks for subnets with duplicate gateway priorities
checkGroups    Checks nodes for all enabled protocols. Fails if group info is reporting that an enabled protocol is not functioning on any node
checkHDFS    Display HDFS details, only useful when run with --extra
checkHardening    Checks if FIPS is enabled on node in the cluster, this needs to be disabled prior to upgrades to 9.5 or higher and re enabled after to avoid assessment failures
checkHardwareStatus    Checks battery health, power supplies, and gathers hardware details for use elsewhere
checkHardwareUpgrade    Checks if there is an in progress hardware upgrade
checkHealth    Verifies cluster health status and node health status
checkIBInterfaces    Checks for ib0/1 as being active, checks for ETA180317 IB switch firmware versions, and checks for overlapping IB networks
checkIBPCIeSlot    Checks if the InfiniBand card is installed in the wrong slot which may lead node start up issues during an upgrade to OneFS 9 and later releases
checkIDI    Checks for IDI errors in the past 90 days
checkISCSI    Checks for iSCSI LUNs being configured in /ifs/.ifsvar/iscsi/iscsi.conf (OneFS prior to 8.x only)
checkIndexSnapshotCurrent    Checks for current snapshots that are over 2 weeks old and may contirbute to capacity issues
checkInternalPing    Checks internal network by performing network ping operations
checkJobHistory    Checks job history for issues, currently just MediaScan issues
checkJobStatus    Checks for running jobs that would impact an upgrade
checkJobs    Checks jobs
checkKB000066019    Checks size of reports.db and flags if over 100MB which may lead to issues outlined on KB 000066019
checkKB000081658    Checks for criteria of KB 000081658
checkKB000181818    Checks for criteria of KB 000181818
checkKB000192800    Checks for critera of KB 000192800
checkKB000196175    Checks for criteria of KB 000196175
checkKB000196762    Checks for criteria of KB 000196762
checkKB000197850    Checks for issues with IB queue pairs that would lead to node reboot issues if IB queue pairs are in a degraded state
checkKB000212387    Checks Authentication providers msDS-SupportedEncryptionTypes attribute to ensure a value is set and assigned, if it is not, there is potential for DU after an upgrade to 9.5 or above.
checkKB000213188    Checks for SED hardware where current version is less than 9.2 and destination version is 9.5 or higher.
checkKB201488    Checks if any node meets criteria for KB 000201488
checkKB201666    Checks if it is necessary to perform the proactive workaround from KB 000201666 for a patch installation and whether the pre-requisites are met
checkKB201933    Checks for criteria of KB 000201933
checkKB203381    Checks for criteria of KB 203381
checkKB220014    Checks for criteria of KB 220014
checkKB462202    Checks BootOrder in bios_settings.ini on Generation 5 nodes to determine if at risk for KB 000025523
checkKB489473    Checks if any node meets criteria for KB 000061983
checkKB490849    Checks if at risk for KB 000052089
checkKB496582    Checks for auth rules issues detailed in KB 000160596
checkKB496993    Checks if the cluster is at risk for KB 000061504
checkKB501267    Checks for the criteria of KB 000026510
checkKB507031    Checks for criteria outlined in KB 000035398
checkKB516613    Checks if any node meets criteria for KB 000057267
checkKB519119    Checks if nodes may be impacted by KB 519119
checkKB519388    Pre-upgrade check for issues outlined in KB 000162270
checkKB519423    Checks if the cluster config files are in a mixed mode
checkKB519890    Checks for a known issue with LAGG interfaces in LACP mode when running OneFS 8.0.0.6, 8.0.1.2, 8.1.0.2, and 8.1.1.1
checkKB521778    Checks for criteria outlined in KB 000031948
checkKB521890    Checks for criteria outlined in KB 000167681
checkKB524082    Checks if the cluster is enabled for HTTP clients and flags a compatibility issue caused by a change in Apache versions
checkKB527312    Check for criteria of KB 000166965
checkKB530050    Check for criteria of KB 000040987
checkKB533516    Checks if cluster uses an IP for AWS CloudPool accounts putting it at risk for DTA 533516
checkKB535582    Checks if at risk for KB 000060471
checkKB537785    Check for criteria of KB 000168829
checkKB540000    Checks for criteria of KB 000058599
checkKB540071    Checks for existence of files under /var/fw/fwpkg when no IsiFw package is installed
checkKB540513    Checks for criteria of KB 000174074
checkKB540872    Checks if the cluster may encounter KB 000170982 during an upgrade from OneFS 8.2 releases
checkKB540901    Checks boot disk partitions for any mismatches in uuids which may lead to boot failures
checkKB544401    Check for critera of KB 000173157
checkKB544854    Check for criteria of KB 000173432
checkKB546604    Checks for criteria of KB 000180866
checkKerberos8000    Checks for an issue with the Kerberos configuration file when upgrading to OneFS 8.0.0.0
checkLACPSFP    Checks for LACP on cxgb interfaces for KB 000174095
checkLWIODLog    Checks /var/log/lwiod.log for known errors occuring in the last 30 days
checkLastZoneID8000    Checks for gaps in access zones that may cause major issues when upgrading to OneFS 8.0.0.0
checkLeakFreeBlocks    Checks for nodes with efs\.lbm\.leak_freed_blocks enabled.
checkLegacyLDAP    Checks for Legacy LDAP being enabled in OneFS 6 to OneFS 7 upgrades
checkLicense    Checks licenses and gives direction based on licensed features. InsigntIQ and vCenter licenses, gives information from compatibility guide. iSCSI, instructs to only perform simultaneous OneFS upgrades and that it is not supported in 8.0
checkLinMasterPadding    Checks the LIN master padding to be all zeros
checkListenQueue    Checks for listen queue overflows to be less than 50,000 per node
checkLogLevel    Checks LWSM log levels for NFS, SMB, HDFS, and Authentication
checkLogs    Checks Log file presence and flags if any log file specified in list is not present
checkMaintenanceMode    Checks if the cluster is currently in maintenance mode
checkMemory    Checks each DIMM to meet criteria outlined in KB 000041666 and if the expected (per product info line) matches closely the reported RAM
checkMessagesLog    Checks /var/log/messages.log for known errors occuring in the last 30 days
checkMirrors    Checks the boot mirror health
checkNDMP    Checks for running NDMP sessions
checkNDMP16GB    Checks for LNN changes that have occurred since the isi_ndmp_d processes started which can cause issues during the HookDataMigrationUpgrade phase of an OneFS upgrade
checkNDMPUpgradeTimeout    Checks for LNN changes that have occurred since the isi_ndmp_d processes started which can cause issues during the HookDataMigrationUpgrade phase of an OneFS upgrade
checkNFS    Uses nfsstat to identify RPC errors
checkNetBIOS    Checks if the Isilon NetBIOS Name Service (nbns) is enabled when updating to OneFS 8.0.1 and later
checkNetstat    Checks connections counts for specific protocols via netstat
checkNetworkParallelUpgrade    Checks for the risk of inaccessible network pools during a parallel upgrade
checkNetworkPoolIFaces    Checks each network pool and their assigned interfaces, if only 1 interface is configured for any pool and IP Ranges are set, this will cause a failure for pre upgrade mandatory checks
checkNodeCompatibility    Checks node compatibility for OneFS upgrades by comparing it against known supported versions
checkNodeFirmware    Checks node firmware for updates
checkNodesInstalled    Checks for nodes installed to display in an agregated list for visibility
checkOneFSVersions    Checks running version and target version for any issues. Failure: Any version mismatch across all nodes
checkOpenFiles    Checks each node for the number of open files [sysctl kern.openfiles] and compares that against the number of max open files [sysctl kern.maxfiles]. The check will warn at over 80% of the max and fail at over 90% of the max
checkPSCALE136276    Check for criteria of PSCALE-136276
checkPartitions    Checks system partition space
checkPatches    Checks for highly recommended patches, on the current version if no target version, or for the target version
checkPerformance    Checks cluster performance
checkProcesses    Checks for OpenSM master, MCP, isi_mca_dump, and isi_upgrade_d process related issues
checkProtectionLevel    Check storage pools protection levels
checkRealACL    Checks if Real ACLs are configured / set on /ifs/.ifsvar or ifs/.ifsvar/patch. These should not be set and if they are, it would cause an issue for upgrades / installs
checkRemoteSupport    Checks is Restricted Shell is enabled and isi_supportassist, if both are enabled and you upgrade to 9.7 then it will cause an issue for support assist services to restart.
checkRoutingTables    Displays routing table for each node
checkSBR8000    Checks if SBR is enabled pre-upgrade for OneFS upgrades targeting 8.0.0.0/1
checkSNMPDConfig    Checks both SNMPD.config and also isilon_serial_number to make sure they are not 0 bytes
checkSPNs    Displays a list of SPNs, only useful when run with --extra
checkSRS    Checks the remote connectivity configuration for issues
checkSSHDConfig    Checks the /etc/mcp/templates/sshd_config files for known issues
checkSWIFTAccounts    checks for SWIFT accounts which is then used to set prioirity of flag when SWIFT is licensed and an upgrade to 9.5 or higher is being performed
checkServices    Checks common services to make sure they are in the expected states
checkServicesMonitoring    Checks that enabled services are being monitored
checkSmartConnect    Check that the SmartConnect Service IPs are all assigned and not being used for client connectivity
checkSnapshot    Checks if the Snapshot count is approaching cluster limit of 20,000, whether Autodelete is set to yes, and checks snapshot logs. Checks snapshot logs for EIN/EIO/EDEADLK/Failed to create snapshot
checkStaticRouteConflict    Checks for conflicting static routes
checkStoragePools    Check storage pools for health/capacity/unprovisioned drive issues
checkSupportability    Checks cluster hardware and software supportability
checkSwitchCompatibility    Checks backend Dell switches to confirm they are at least version 10.5.0.6
checkSymLink    Checks is /var/patch/catalog or /var/patch/tmp are Sym Linked or if catalog is a file instead of directory.
checkSyncIQ    Gathers source and target SyncIQ information and reports partners of SyncIQ. Checks for an issue where having a large number of SyncIQ report files causes a tar process to delay other upgrade processes potentially leaving the cluster in a temporary DU scenario for an extended period of time
checkSystemFlag    checks for diskpools with system flag set
checkTimeDrift    Checks time drift between nodes
checkTimeSync    Checks if the cluster is enabled to sync to an external server
checkTimeZone    Checks for missing time zone in target OneFS code level
checkUIDGID    Checks for UID/GID values greater than 262143 for files located in / and /var
checkUpgrade    Checks for in progress upgrade related issues. Warns if the isi_upgrade_d service is enabled. Fails if not in a committed state. Fails if an upgrade activity is already in progress. Checks the fs_fmt_version, an odd or zero fs_fmt_version is problematic
checkUpgradeAgentPort    Checks the port used by the isi_upgrade_agent_d daemon to ensure it is not in use by other processes
checkUpgradePath    Checks where multiple jumps are required and provides specifics required
checkUptime    Checks node uptime, warning over 200 days, flags uptime ETAs
checkVaultCard    Checks for the presence of the M.2 vault card in Generation 6 nodes and confirms SMART status threshold has not been exceeded on that device
checkZoneLocalAuth    For upgrades to OneFS 8.2 and later, checks if local providers are associated with other access zones