Pegasus2 R8 very slow on MacOS Monterey / MacPro Late 2013 Trash Can

  • Last Post 23 January 2024
Francesco Rizzo posted this 17 January 2024

I need some advice and recommendations, please, on how to further troubleshoot and regain I/O performance.

I've been having really poor performance on my Pegasus2 R8 array ever since upgrading to macOS Monterey (12.7.2) on my Mac Pro (Late 2013, "trash can"). We're talking 322 KB/s slow, measured over 1 GB of sequential 1 MB writes.

It's so bad that updates to my Apple Photos library (stored on the array) via iCloud sync, plus the resulting disk I/O from the Sophos AV scanner, block the system badly enough that the macOS watchdog process reboots it and files a crash report with Apple. The system is completely stable and performs just fine without the array connected.

Test to MacPro SSD: (1,323 MB/s)

$ dd if=/dev/zero of=~/ddtestfile.out bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 0.773917 secs (1387412118 bytes/sec)

Test to Pegasus2 R8: (0.3151 MB/s)

$ dd if=/dev/zero of=/Volumes/FRP\ Master/ddtestfile.out bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 3249.976362 secs (330385 bytes/sec)
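
For reference, the MB/s figures quoted with these tests are the dd bytes/sec values divided by 1,048,576. A minimal sketch of that conversion follows; the function name `to_mib_per_sec` is made up for illustration and is not part of any tool mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: convert the bytes/sec figure that dd reports
# into MiB/s (1 MiB = 1048576 bytes), printed with four decimals.
to_mib_per_sec() {
    awk -v bps="$1" 'BEGIN { printf "%.4f\n", bps / 1048576 }'
}

# Examples using the two dd results above:
# to_mib_per_sec 1387412118   # internal SSD
# to_mib_per_sec 330385       # Pegasus2 R8
```
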

Blackmagic Disk Speed Test runs also end up failing or timing out.

The array is at 97% capacity. I am wondering if this is the issue?

Otherwise, the array and drives all report operational without any errors. The logical volume is RAID6 across all 8 drives.

Running the latest/last firmware and the latest Pegasus32  Promise Utility install.

  • Promise Utility version: 4.06.0000.04 Build Date: Mar 28, 2022
  • FirmwareVersion: 5.04.0000.64  FirmwareBuildDate: Dec  5, 2019

I DO see a Promise system extension installed... not sure if this is part of the above software or a left-over from years of OS upgrades...  The support page does say that the system extension is included with MacOS 11.x...

$ kextstat | grep promise
Executing: /usr/bin/kmutil showloaded
No variant specified, falling back to release
   86    0 0xffffff800348a000 0xb000     0xb000     com.promise.driver.stex (6.2.13) AE253556-66D8-38D2-B9D1-179B78F29153 <85 16 7 6 3>
$ promiseutil

cliib> ctrl -v
-------------------------------------------------------------------------------
CtrlId: 1
Alias:
OperationalStatus: OK
PowerOnTime: 24 hours 37 minutes
LUNAffinity: N/A
LunmappingMethod: WWN Based
CacheUsagePercentage: 0%
DirtyCachePercentage: 0%
Vendor: PROMISE
Model: Pegasus2 R8
PartNo: F29DS8722000000
SerialNo: N/A
HWRev: B3
WWN: 2000-0001-5558-2057
CmdProtocol: SCSI-3
CBSN: M92H14310800199
MemType: DDR2 SDRAM
MemSize: 512MB
FlashType: Flash Memory
FlashSize: 8MB
NVRAMType: FRAM
NVRAMSize: 128KB
BootLoaderVersion: 5.04.0000.64
BootLoaderBuildDate: Dec 5, 2019
FirmwareVersion: 5.04.0000.64
FirmwareBuildDate: Dec 5, 2019
SoftwareVersion: 5.04.0000.64
SoftwareBuildDate: Dec 5, 2019
BIOSVersion: 5.04.0000.64
BIOSBuildDate: Dec 5, 2019
SingleImageVersion: 5.04.0000.64
SingleImageBuildDate: Dec 5, 2019
CtrlCPLDVersion: 6.08
ChipType: LCMXO2-640HC
DiskArrayPresent: 1
OverallRAIDStatus: OK
LogDrvPresent: 1
LogDrvOnline: 1
LogDrvOffline: 0
LogDrvCritical: 0
PhyDrvPresent: 8
PhyDrvOnline: 8
PhyDrvOffline: 0
PhyDrvPFA: 0
GlobalSparePresent: 0
DedicatedSparePresent: 0
RevertibleGlobalSparePresent: 0
RevertibleDedicatedSparePresent: 0
RevertibleGlobalSpareUsed: 0
RevertibleDedicatedSpareUsed: 0
WriteThroughMode: No
MaxSectorSize: 4KB
PreferredCacheLineSize: 64 KB
CacheLineSize: 64 KB
Coercion: Enabled
CoercionMethod: GBTruncate
SMART: Enabled
SMARTPollingInterval: 10 minutes
MigrationStorage: DDF
CacheFlushInterval: 3 second(s)
PollInterval: 15 second(s)
AdaptiveWBCache: Enabled
ForcedReadAhead: Enabled
PowerSavingLevel: 1
SpindownType: All drives
IdleTimeToParkRwHeads: 30 minutes
IdleTimeToLowerRotationSpeed: Never
IdleTimeToSpinDown: Never
SGPIOBackPlane: Default/Generic (0)
SASReadyLED: Off
SpinUpDelay: 0 millisecond
PowerManagement: Enable

cliib> logdrv -v
-------------------------------------------------------------------------------
LdId: 0
ArrayId: 0
SYNCed: Yes
OperationalStatus: OK
Alias: FRP Master
TRIMSupport: Not Support TRIM
SerialNo: redacted
WWN: 22a5-0001-5510-612c
PreferredCtrlId: N/A
RAIDLevel: RAID6
StripeSize: 1MB
Capacity: 48.01TB
PhysicalCapacity: 64.01TB
ReadPolicy: ReadAhead
WritePolicy: WriteBack
CurrentWritePolicy: WriteBack
NumOfUsedPD: 8
NumOfAxles: 1
SectorSize: 512Bytes
RAID5&6Algorithm: right asymmetric (4)
TolerableNumOfDeadDrivesPerAxle: 2
ParityPace: wide pace (2)
CodecScheme: Q+Q

cliib> stats
-------------------------------------------------------------------------------
Controller Statistics
-------------------------------------------------------------------------------
ControllerId: 1
DataTransferred: 1.51TB
ReadDataTransferred: 1.46TB
WriteDataTransferred: 51.44GB
Errors: 0
NonRWErrors: 0
ReadErrors: 0
WriteErrors: 0
IORequest: 48,129,226
NonRWRequest: 25,069
ReadIORequest: 44,886,622
WriteIORequest: 3,217,535
StatsStartTime: Jan 16, 2024 17:06:33
StatsCollectionTime: Jan 17, 2024 17:27:02

R P posted this 17 January 2024

Hi Francesco,

There's little here of use for debugging other than the fact that the filesystem is almost full. But this may well be the issue.

This article may be of some help...

External Hard Drives running very slow if they get too full - Any fixes?

The recommendation, if HFS+, is to defragment. You may need to free up some space to do so.

Some other suggestions

1. Check the SMART data to see if any of the drives are failing

2. Check the read speeds as well. Read speeds should be only slightly slower than normal if the disk is highly fragmented. If read speeds are also slow, there may be some other issue.

 

Francesco Rizzo posted this 18 January 2024

Thanks RP. 

Agreed, nothing seems "wrong" other than I/O performance. I did try swapping cables and each of the three Thunderbolt controllers, to no avail.

Read testing with dd on a Mac is a bit limited: it doesn't seem to support the 'direct' value for 'iflag' or 'oflag' to circumvent the cache.

Testing Read: (7118.5 MB/s)

$ dd if=/Volumes/FRP\ Master/ddtestfile.out of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 0.143849 secs (7464367663 bytes/sec)
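
Note that 7,118 MB/s is more than a Thunderbolt 2 link can carry, so this read was almost certainly served from the macOS file cache rather than from the array. One possible way to get a more honest number is to ask macOS to empty its disk cache with `purge` (which ships with macOS and needs root) just before reading. The wrapper below is only a sketch under that assumption; the function name `uncached_read` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: time a sequential read after asking macOS to purge
# its disk cache. On systems without `purge` the read still runs, but the
# result may reflect cached data.
uncached_read() {
    file="$1"
    if command -v purge >/dev/null 2>&1; then
        sudo purge    # force the disk cache to be flushed and emptied
    else
        echo "purge not found; result may be served from cache" >&2
    fi
    # Block size is given in bytes so that both BSD and GNU dd accept it.
    dd if="$file" of=/dev/null bs=1048576
}

# Example (path taken from the test above):
# uncached_read "/Volumes/FRP Master/ddtestfile.out"
```
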

Even trying to select the Target Drive in BlackMagic Disk Speed Test ends up beachballing the system for over 90 seconds and then aborts the selection dialog...

Is there any way to trace the blocking I/O to see why everything's hanging?

At this point performance is so bad that there's no practical way to back up the 42 TB on the array, or even to move off the 4-5 TB that would presumably free enough space for the array to respond efficiently.

What's the guidance on maximum utilization for a 48TB array with a 44TB APFS container volume?

$ diskutil info /dev/disk3s1
   Device Identifier:         disk3s1
   Device Node:               /dev/disk3s1
   Whole:                     No
   Part of Whole:             disk3

   Volume Name:               FRP Master
   Mounted:                   Yes
   Mount Point:               /Volumes/FRP Master

   Partition Type:            41504653-0000-11AA-AA11-00306543ECAC
   File System Personality:   APFS
   Type (Bundle):             apfs
   Name (User Visible):       APFS
   Owners:                    Enabled

   OS Can Be Installed:       Yes
   Media Type:                Generic
   Protocol:                  SAS
   SMART Status:              Not Supported
   Volume UUID:               5972A92F-D57B-4BC7-901A-E40102C06578
   Disk / Partition UUID:     5972A92F-D57B-4BC7-901A-E40102C06578

   Disk Size:                 48.0 TB (48005776367616 Bytes) (exactly 93761281968 512-Byte-Units)
   Device Block Size:         4096 Bytes

   Container Total Space:     48.0 TB (48005776367616 Bytes) (exactly 93761281968 512-Byte-Units)
   Container Free Space:      1.5 TB (1480083927040 Bytes) (exactly 2890788920 512-Byte-Units)
   Allocation Block Size:     4096 Bytes

   Media OS Use Only:         No
   Media Read-Only:           No
   Volume Read-Only:          No

   Device Location:           External
   Removable Media:           Fixed

   Solid State:               Info not available
   Hardware AES Support:      No

   This disk is an APFS Volume.  APFS Information:
   APFS Container:            disk3
   APFS Physical Store:       disk2s2
   Fusion Drive:              No
   Encrypted:                 No
   FileVault:                 No
   Sealed:                    No
   Locked:                    No
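
The container figures above work out to roughly the 97% utilization mentioned earlier. A minimal sketch of that arithmetic follows, with the byte counts copied from the "Container Total Space" and "Container Free Space" lines; the function name `pct_used` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: percentage of a container in use, given its total
# and free sizes in bytes (as printed by `diskutil info`).
pct_used() {
    awk -v total="$1" -v free="$2" \
        'BEGIN { printf "%.1f\n", (total - free) / total * 100 }'
}

# Example with the values reported above:
# pct_used 48005776367616 1480083927040
```
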

 

Francesco Rizzo posted this 22 January 2024

RP,

I have removed most of the data from the array (backups of NAS data, FWIW), and performance is better.

Is there an acknowledged threshold for how much of the RAID volume's capacity cannot be used without degrading performance?

Retest of Array write (330 MB/s)

$ dd if=/dev/zero of=/Volumes/FRP\ Master/ddtestfile.out bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 3.092444 secs (347214638 bytes/sec)

 

BlackMagic Design Disk Speed Test clocks in now at 517 MB/s read & 430 MB/s write.

This is still much slower than when the volume was newly created.

 

Are there any steps I can take to restore the performance, short of completely recreating the array config from scratch and reformatting with a fresh filesystem?

 

 

$ df -h /Volumes/FRP\ Master/
Filesystem     Size   Used  Avail Capacity iused        ifree %iused  Mounted on
/dev/disk5s1   44Ti   10Ti   33Ti    24% 2902553 357351767320    0%   /Volumes/FRP Master

 

The array's logical volume is formatted and mounted as APFS, not HFS+.

/dev/disk4 (external, physical):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *48.0 TB    disk4
   1:                        EFI EFI                     209.7 MB   disk4s1
   2:                 Apple_APFS Container disk5         48.0 TB    disk4s2

/dev/disk5 (synthesized):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      APFS Container Scheme -                      +48.0 TB    disk5
                                 Physical Store disk4s2
   1:                APFS Volume FRP Master              11.4 TB    disk5s1

 

 

R P posted this 23 January 2024

Hi,

Is there an acknowledged threshold for how much of the RAID volume's capacity cannot be used without degrading performance?

I've generally heard that keeping the filesystem no more than 60-70% full is best for performance.
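
If it helps, that rule of thumb can be turned into a simple check. The sketch below treats 70% as the limit purely as an assumption based on the range above, not as a Promise or Apple specification; the function names are made up for illustration, and the `df` parsing assumes a filesystem name without spaces.

```shell
#!/bin/sh
# Hypothetical helper: print the integer "Capacity" percentage that
# POSIX-format df reports for the filesystem holding MOUNTPOINT.
capacity_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeed (exit status 0) when PCT exceeds LIMIT.
over_threshold() {
    [ "$1" -gt "$2" ]
}

# Example: warn when the array volume is more than 70% full.
# pct=$(capacity_pct "/Volumes/FRP Master")
# over_threshold "$pct" 70 && echo "Volume is ${pct}% full"
```
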
