Error - Disks too slow reported but not the case...??
Hi Ben / knowledgeable operators
In the last couple of releases of the software, I have started to get a glitch. I get the error window up and a massive list of these errors pouring in (in batches). Sometimes nothing for a few days, other times non-stop for a day. The thing is, the hardware hasn't changed, and running a speed test on the 6Gb SAS 12TB formatted RAID6 array (at the same time as the 5 cameras are recording to it) I am getting a consistent throughput of 570 MB/s write and 630 MB/s read. This was over many tests using BlackMagicDesign Disk Speed Test. All drives' SMART status reports OK, array integrity is 100% and all other services on the server are working normally. Manually copying files onto the array and back is very fast.
There is about 6% free space on the array (as managed by SecuritySpy). The system drive is a 6Gb SAS SSD with 60% free, and there is 24GB RAM with plenty free. There are 2 x quad-core Xeon CPUs humming along at about 80% idle for the most part. Everything is CAT6 gigabit linked (some cameras are only 100Mb though), all routed through a Cisco 3750G-48.
The Xserve is running 10.11.6 with Svr5.1.7 and has been for a couple of years now (10.9 for a few years prior to that, then 10.6 prior to that). This issue only started about 3-4 months ago. I am running SS 4.2.9. I have run SS for the last 6 or so years on the same core hardware, upgrading components and OS over time. My hardware/network has not changed in the last 3+ years, bar the router/firewall about a year ago.
The Error is:
Error performing continuous capture for the camera "xx", continuous capture mode has been disarmed. Failed to record video frame 1556,98002. The disk is too slow and cannot keep up with writing data - you may need to reformat or replace the disk.
The odd thing is it is the same frame# and will list all the different cameras in the 'xx' section.
Any thoughts on this? Would it be helpful to send over any files/logs?
Many thanks
Oli
Comments
This message does indicate a real problem. SecuritySpy has a large memory buffer of data that is used to buffer writes to the drive and smooth out temporary drive slow-downs, but when this gets full because the drive can't keep up with the amount of data that is being attempted to be written to it, there is nothing SecuritySpy can do but to stop recording and throw this error. This happens for all cameras at the same time, presumably because they are writing to the same drive.
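To illustrate why a drive that benchmarks well can still trigger this, here is a minimal sketch of a bounded write buffer. It is not SecuritySpy's actual implementation: the function `simulate`, the 100 MB buffer size, the 10 MB/s camera data rate and the 15-second stall are all made-up numbers chosen for illustration.

```python
def simulate(buffer_mb, incoming_mb_s, disk_mb_s_per_second):
    """Return the first second at which the buffer overflows, or None.

    buffer_mb            -- hypothetical size of the in-memory write buffer
    incoming_mb_s        -- hypothetical combined data rate from all cameras
    disk_mb_s_per_second -- what the disk can absorb during each second
    """
    level = 0.0
    for second, disk_rate in enumerate(disk_mb_s_per_second):
        level += incoming_mb_s            # data arriving from the cameras
        level -= min(level, disk_rate)    # data the disk manages to drain
        if level > buffer_mb:
            return second                 # buffer full: recording must stop
    return None


# Illustrative numbers only (assumptions, not measured values).
healthy = [300.0] * 60                                  # steady 300 MB/s
stalled = [300.0] * 20 + [0.0] * 15 + [300.0] * 25      # 15 s stall

print(simulate(100, 10, healthy))   # no overflow
print(simulate(100, 10, stalled))   # overflows part-way through the stall
print(sum(stalled) / len(stalled))  # average is still 225 MB/s
```

The point of the sketch is that the average rate of the stalled disk is still far above what the cameras need, yet a stall lasting longer than the buffer can absorb is enough to produce the error.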
You could try our own file writing speed test app that simulates the typical kind of file writing that SecuritySpy performs - it would be interesting to get the results from this test.
One possibility is that you have one drive that is slowly going bad, and it performs well most of the time until it gets to one of its bad sectors, and then slows down dramatically.
Also, check for things like Time Machine backups and Spotlight indexing, both of which can significantly slow down drives. You should add the drive to the Exclude list in the Time Machine system preference as well as the Privacy section of the Spotlight system preference.
In any case, I am very confident that this is not a bug in SecuritySpy, but does actually represent a real issue.
Is there any insight to this?
It is not a write speed issue since my write speeds are 3500mb
File Count: 16
Test complete.
Time taken: 16.54 seconds
Amount of data written: 5130 MB
Average data rate: 310.21 MB/s
The disk is a 2 TB SSD Evo connected through USB 3.
I have 9 HD cameras. I have limited the bitrate and turned the frame rate down from 15 fps to 12 fps, with the same result.
I am still getting this error on a daily basis for all 9 cameras within the same second, although there is no issue with the recordings.
This is the error:
03/02/2019 18:50:28: Error performing continuous capture for the camera "XXX", continuous capture mode has been disarmed. 4.2.9,1556,98002 Failed to record video frame. The disk is too slow and cannot keep up with writing data - you may need to reformat or replace the disk.
SS writes to an external drive connected over USB 3.0, and I have found that it happens when my Mac is doing something pretty processor-intensive, be that decompressing multiple large files or transferring across my other external drives. My interpretation is that, even though System info isn't showing full processor usage across all the cores, the iMac is still having problems doing everything at the same time, and this manifests itself as SS not being able to keep up with writing the video files. There is enough network bandwidth for it to do so, but the processor cannot do everything it is being asked to do, so it prioritises some things over others.
My interpretation is probably wrong but I have never had this issue when my Mac isn't 'busy', and can almost force it to happen. I am not sure what the Mac is doing sometimes, particularly when it happens in the night.
I reformatted the drive and have set it to keep 70 GB free before deleting files. Maybe it was an issue where it was trying to write and delete old files simultaneously.
I will report back when the drive fills up in a little over a week.
Ran the File Writing Test app for 31 cameras and only get 36MB/s. The Drobo is full of 7200 RPM drives, configured for dual-drive failure survivability. I should be getting at least double that, shouldn't I?
Liking Drobos less and less the more I use them. I think allocating some space on my Synology NAS (RS1912+) might be faster, even over ethernet.
It seems that the error occurs when files are being overwritten or deleted.
Given the speed of the drive, this should not be an issue. I’m stumped.
As a test, could you all try turning off the auto-delete options and see if this prevents the errors? You may like to first clear sufficient space from the drive so as not to run out of space.
So far no errors. I will have to wait another 5 days of recording before the auto-delete process takes place.
I have a 2 TB SSD. I had auto-delete set to start when 40 GB of space was left.
I have several cameras that record daily files of 60 GB.
Maybe that is the issue. I have now set auto-delete to start when 80 GB of space is left.
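For a rough sense of scale, the figures above can be turned into a sustained data rate and a headroom estimate. This is only back-of-the-envelope arithmetic: it assumes decimal units (1 GB = 1000 MB) and, as a worst case, that all 9 cameras record 60 GB per day, which the post does not actually say (only "several" do). The function names are made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_mb_per_s(gb_per_day):
    """Average write rate implied by a daily file size (decimal units)."""
    return gb_per_day * 1000 / SECONDS_PER_DAY


def hours_until_full(free_gb, total_gb_per_day):
    """How long the free-space margin lasts if nothing is deleted."""
    return free_gb / total_gb_per_day * 24


per_camera = sustained_mb_per_s(60)   # one 60 GB/day camera
worst_case = 9 * per_camera           # assumption: all 9 cameras at 60 GB/day

print(f"per camera: {per_camera:.2f} MB/s")
print(f"9 cameras:  {worst_case:.2f} MB/s")
print(f"40 GB margin lasts {hours_until_full(40, 9 * 60):.1f} h")
print(f"80 GB margin lasts {hours_until_full(80, 9 * 60):.1f} h")
```

Even in this worst case the average rate is a small fraction of the 300 MB/s measured, so the margin size mostly changes how much slack auto-delete has rather than how fast the disk needs to be.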
There are actually no interruptions in the recordings; the only reason I notice the errors is the daily reports.
I really don’t think the disk is too slow. I get results of over 300 MB/s when I use the speed test software you recommended.
The error is for all 9 cameras, all in the same second, yet I don’t see any frame loss in the recordings.
There is usually 1 error for each camera per day.
Have you excluded this disk for Spotlight indexing? (System Preferences -> Spotlight -> Privacy).
It’s strange that the error is so sporadic - if it were due to SecuritySpy’s auto-deletion routines it should happen more often, as this runs every few minutes.
300 MB/s is definitely fast, but there must be something that is causing a temporary short-term slowdown that is reducing this speed drastically, resulting in this error.
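An average figure like 300 MB/s hides short stalls, so it can help to look at the worst single write instead. The following is a rough sketch of such a probe, not the File Writing Test app: the function names, the 1 MB chunk size and the example volume path are all assumptions. It writes chunks to a temporary file in the given folder, forces each one out with `os.fsync`, and reports the mean and the maximum latency.

```python
import os
import tempfile
import time


def measure_write_latencies(directory, chunk_bytes=1024 * 1024, count=20):
    """Write `count` chunks into a temp file in `directory`, timing each one."""
    chunk = os.urandom(chunk_bytes)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            for _ in range(count):
                start = time.perf_counter()
                f.write(chunk)
                f.flush()
                os.fsync(f.fileno())   # ask the OS to push the data to the drive
                latencies.append(time.perf_counter() - start)
    finally:
        os.remove(path)                # clean up the temporary file
    return latencies


def summarize(latencies):
    """Mean and worst-case latency in seconds."""
    return {"mean_s": sum(latencies) / len(latencies), "max_s": max(latencies)}


if __name__ == "__main__":
    # Replace with a folder on the capture drive, e.g. "/Volumes/MyDrive" (hypothetical).
    target = tempfile.gettempdir()
    print(summarize(measure_write_latencies(target)))
```

Run repeatedly over a long period (for example, once a minute), a maximum that is occasionally orders of magnitude larger than the mean would point to the kind of temporary slowdown described above.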
When the error happens, the cameras will be temporarily disarmed, so there will be some loss of recording for the short time that the cameras are unarmed. They will then automatically re-arm based on the current schedules set for them. So while there is some loss, it should be very short.
Jlbrown, are you on High Sierra?
I will check on Spotlight and Time Machine. All cameras are on a 24/7 schedule - once marked offline, they don't seem to come back. Was there a timestamp in the alert box that pops up? I don't recall seeing one.
As for the errors, the log file contains time stamps (File menu -> Open Log).
You can also check disk pressure with the new Dashboard feature (available from the Window menu) - 100% disk pressure means that all disk writing buffers are full and this error will be generated. In normal usage with a fast disk, the pressure should remain well under 10%. What does your disk pressure graph look like?
Ben - didn't know about the Dashboard and Disk Pressure. Mine is usually 0%, occasionally goes as high as 5%. When I saw the 'disk too slow' errors yesterday, disk pressure was 1%.
File Writing Test gives 52.96 MB/s.
Also, would you mind explaining smoothing? It drastically adjusts the visualized data. For example, if I view Disk Pressure at the time of the logged error with smoothing off I see a blip for a minute or so at 40% but if I move smoothing up, that same blip then reads 1%.
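How the Dashboard actually smooths its graphs is not stated in this thread, but if it is something like a simple moving average, the effect described is what you would expect. A minimal sketch under that assumption (the sample data and window sizes are invented to match the 40% and 1% figures mentioned):

```python
def moving_average(samples, window):
    """Trailing moving average; early points average what is available so far."""
    out = []
    running = 0.0
    for i, value in enumerate(samples):
        running += value
        if i >= window:
            running -= samples[i - window]   # drop the sample leaving the window
        out.append(running / min(i + 1, window))
    return out


# One sample per minute: a single one-minute blip at 40% disk pressure.
pressure = [0.0] * 60 + [40.0] + [0.0] * 60

print(max(moving_average(pressure, 1)))    # window of 1 = no smoothing
print(max(moving_average(pressure, 40)))   # 40-minute window flattens the blip
```

With a window of 40 samples, a single 40% spike is averaged with 39 zeros, so its peak shows as 1%. The spike is still real; smoothing just spreads it out.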
Unrelated, in Dashboard when viewing packet loss, if I check my crappy wifi camera that constantly drops packets, it reports 650% packet loss. What on earth can that possibly mean? :-)
I'm happy to troubleshoot if you've got any ideas. Please let me know if I should just make this a support case instead. Thanks for everything.
I don't seem to know how to post a snapshot of the dashboard to the forum...
So what I've done is to double the size of the disk writing buffer, in the latest beta version of SecuritySpy (currently 5.0.2b8). Please can you all test this and let me know if this reduces the incidence of this error or not.
However, I'm wary about increasing the disk buffer too much, as this could potentially use a lot of RAM.
In general, for best performance, the disk should be:
- An SSD or fast HDD (NOT a fusion drive)
- Not the system drive
- Connected via a fast connection (Internal, Thunderbolt, USB 3)
- Not used for other purposes