Investigating cluster size in relation to card performance

Started by Shield, May 29, 2013, 12:51:06 AM


Shield

Quick background - in the last week I've purchased three 64GB 1000x KomputerBay cards.
First one didn't work in the camera, but was fine in the computer.
Second one is fine and works as expected.
Third one works but is far slower, and only does around 60MB/s.

In my eternal quest for fast enough cards I am going to run various tests with benchmark results, and show the effect (if any) of the cluster size of the formatted cards.  One would think a large cluster size would be fine, since the card is only going to store very large data files, even if it's not super efficient for small files.

Example -

Card B:

Formatted for 16384 cluster size:

1920x1080 = 73, 73, 91 frames before crash
1880x1058 = 73, 121, 127 frames before crash
1720x968 = ran indefinitely

I will also post screen captures of known "good" cards, like my 32GB Lexar and the sole remaining Komputerbay that works.

So, it's really a crapshoot with the Komputerbay cards if you want 1920x1080 @ 24p.

Shield

I will also make the test somewhat scientific and do the same steps each time.  For example, pull the battery between tests, same ML build (May 27) and same ML settings for each (Globaldraw ON).

If anything we might find out that a specific cluster size can get someone over the hump for their desired resolution.  Plus, I'm ticked that 2 out of 3 cards were not as advertised.
Shawn

Audionut

I've got a 16GB Lexar 1000x CF.

It was formatted with a 512kb cluster size, as I didn't want to waste space with photos.  I was getting around 70 frames @ 1920x1080 before skipping.

I reformatted with a 16384-byte cluster size and got 660 frames before frame skipping.

Note:  The 16GB Lexars are not as fast as the 32GB versions.

Reducing the aspect ratio to 1.85:1 is just enough to have continuous recording :)

Looking forward to your results Shield.


noix222

Any idea how I can format my card like that on a Mac?  Thanks guys.

Shield

I will post all the numbers here shortly.  This is boring as hell, but I'm too far into it to stop.
Keep in mind I'm running these tests on the "slow" 1000x Komputerbay card.

Here are some of the anecdotal results so far:

(3 tests per resolution; each test runs until the card skips a frame)
Formatted for a 4 KB (4096-byte) cluster size:
1920x1080 = 100, 136, 100 frames
1880x1058 = 154, 172, 163 frames
1720x968 = ran indefinitely

Formatted for an 8 KB (8192-byte) cluster size:
1920x1080 = 100, 63, 99 frames
1880x1058 = 163, 100, 172 frames
1720x968 = ran indefinitely

Formatted for a 16 KB cluster size:
1920x1080 = 45, 126, 136 frames
1880x1058 = 117, 154, 163 frames
1720x968 = ran indefinitely

Formatted for a 32 KB cluster size:
1920x1080 = 46, 136, 145 frames
1880x1058 = 172, 163, 172 frames
1720x968 = ran indefinitely

Formatted for a 64 KB cluster size:
1920x1080 = 145, 145, 109 frames
1880x1058 = 172, 172, 181 frames
1720x968 = ran indefinitely

Formatted for a 128 KB cluster size:
1920x1080 = 109, 136, 100 frames
1880x1058 = 172, 172, 172 frames
1720x968 = ran indefinitely

Formatted for a 256 KB cluster size:
1920x1080 = 136, 136, 100 frames
1880x1058 = 163, 163, 162 frames
1720x968 = ran indefinitely
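Collating the 1920x1080 numbers above, a quick Python sketch (frame counts copied verbatim from this post) makes the per-cluster-size averages easy to compare:

```python
# Frames recorded at 1920x1080 before a dropped frame, keyed by cluster
# size in bytes.  Numbers are copied from the test results in this post.
results = {
    4096:   [100, 136, 100],
    8192:   [100, 63, 99],
    16384:  [45, 126, 136],
    32768:  [46, 136, 145],
    65536:  [145, 145, 109],
    131072: [109, 136, 100],
    262144: [136, 136, 100],
}

for cluster, frames in results.items():
    avg = sum(frames) / len(frames)
    print(f"{cluster // 1024:>4} KB clusters: avg {avg:.0f} frames")
```

The spread within each cluster size is as large as the spread between them, which matches the conclusion later in the thread that cluster size makes little difference on this card.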

Shield

When these tests are done, I'm going to record perhaps the longest raw 1920x1080 5d3 movie ever done - a 63 GB one.  If the footer doesn't save I'm going to have to figure it out for the 64 GB size, but I have all the other footer sizes covered in my other thread.

http://www.magiclantern.fm/forum/index.php?topic=5732.msg41171#msg41171


Shield

Well I've saved all the screenshots and was going to get creative and do something in Excel.

But, the results are pretty much the same.  At least in Windows, with a 64GB card, use the default allocation size (128 kilobytes).

I will say performance got worse with larger cluster sizes, beginning at 2048 KB.

What a waste of an evening. :P

Shield

For a point of reference, here's the Lexar 32GB vs the "good" Komputerbay 64GB (both 1000x) :

(I realized I had done the Komputerbay test in 5x zoom mode, but I'm very sick of testing and will not redo it.)

Lexar 32 GB:

[benchmark screenshot]

Komputerbay 64GB card:

[benchmark screenshot]
Shield

Quote from: Shield on May 29, 2013, 05:57:16 AM
When these tests are done, I'm going to record perhaps the longest raw 1920x1080 5d3 movie ever done - a 63 GB one.  If the footer doesn't save I'm going to have to figure it out for the 64 GB size, but I have all the other footer sizes covered in my other thread.

http://www.magiclantern.fm/forum/index.php?topic=5732.msg41171#msg41171

Well, the file size of the most boring movie ever (of my desk and phone) has been recorded.  1920x1080, 5D3.  Filesize is 62,467,329 KB, aka 62.4 GB.
Even with USB3 this is taking about 8 minutes to transfer over to a very fast disk array.
My guess is about 12:30 of footage.  Also, it shut off by itself; I'm curious if there's now a routine that auto-saves the footer.  I noticed in today's build (May 28th P.M., Lorenco's) that there's no longer an option to "enable over 4GB".  It just worked.  Excited to see how long the DNG extraction will take on a 3.4 GHz overclocked i7 920....

Shield

Footer was AUTO saved!  17467 dng files to process.

THANK YOU MAGIC LANTERN! :)
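As a sanity check, the numbers in these two posts are self-consistent. A small Python sketch (frame count and file size taken from the posts above; the 24 fps and 14-bit raw figures are assumptions based on the rest of the thread):

```python
# Rough sanity check on the 62.4 GB recording above.
frames = 17467          # DNG count from this post
fps = 24                # assumed: thread is about 1920x1080 @ 24p
size_kb = 62_467_329    # file size from the previous post

seconds = frames / fps
print(f"duration ~ {int(seconds // 60)}:{seconds % 60:04.1f}")  # ~ 12:07.8

per_frame_kb = size_kb / frames
print(f"~ {per_frame_kb:.0f} KB per frame")
# A 1920x1080 14-bit raw frame is 1920*1080*14/8 bytes, i.e. about 3544 KB,
# so the per-frame size and the "about 12:30 of footage" guess both line up.
```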

N/A

IMAX cameras only hold 3 minutes of 70mm film and take ~20 minutes to unload, so it could be worse  ;D

Audionut

Quote from: Shield on May 29, 2013, 07:16:17 AM
For a point of reference, here's the Lexar 32GB vs the "good" Komputerbay 64GB (both 1000x) :

The speeds are slow for a 32GB Lexar.  Have you aligned the partition?

http://www.sevenforums.com/tutorials/113967-ssd-alignment.html

Shield

Quote from: Audionut on May 29, 2013, 08:52:18 AM
The speeds are slow for a 32GB Lexar.  Have you aligned the partition?

http://www.sevenforums.com/tutorials/113967-ssd-alignment.html

Nope.  It's never let me down at 1920x1080 though.  If it starts to, I'll look into it.

Any idea why one of my KomputerBay cards works in a pc but the camera can't read it?

Audionut

Quote from: Shield on May 29, 2013, 09:40:10 AM
Nope.  It's never let me down at 1920x1080 though.  If it starts to, I'll look into it.


That is strange.  My card benches slightly faster than yours, but I cannot maintain 1920x1080.



With changes to alignment and cluster size, I can do 1920x1036 though.

Quote from: Shield on May 29, 2013, 09:40:10 AM
Any idea why one of my KomputerBay cards works in a pc but the camera can't read it?

No sorry.  You could try totally deleting the partition table and re-formatting.

squig

Turn global draw off!

squig

Quote from: noix222 on May 29, 2013, 05:15:59 AM
Any idea how I can format my card like that on a Mac?  Thanks guys.

Wish I knew. Probably have to run wine AGAIN!  ::)

Audionut

Quote from: squig on May 29, 2013, 10:12:02 AM
Turn global draw off!

In my case, for this card, it made a difference of ±2 MB/s.  I always have it off for RAW recording though.

squig

My Toshiba 64gb 1066x benches a couple of MB/s lower than the Lexar but it doesn't drop any 1080p frames.

Shield

Quote from: squig on May 29, 2013, 10:12:02 AM
Turn global draw off!

Now why in the world would I do that, considering I shoot with it on (focus peaking, histogram, zebras)?  It has to pass the "real world" tests for me; I can't shoot FF at larger apertures without some focus peaking.  I had already started the tests with GD on anyway.

Audionut

I think I've nailed it down to the card settings.  I changed Record func. from "Auto switch card" to "Standard" and I can now record 1920x1080.  The buffer was still peaking hard on the third buffer indicator on the first recording, and stayed stuck on the second indicator on a second recording.

Shield

Quote from: Audionut on May 29, 2013, 09:47:13 AM

That is strange.  My card benches slightly faster than yours, but I cannot maintain 1920x1080.



With changes to alignment and cluster size, I can do 1920x1036 though.

No sorry.  You could try totally deleting the partition table and re-formatting.

Tried that; still nuttin'...

Also, I tried your align=1024 trick and my benchmarks got a bit slower.  No idea why.  I've shot hours upon hours @ 1920x1080 with the Lexar and never missed a beat; I shot a full 32GB in the hot sun yesterday with focus peaking, histogram, and global draw galore.  Perhaps the benchmarks aren't 100% accurate?  Maybe the data is being laid down on the disk in a manner that's not exactly 2048/1953 or any of the other buffer sizes?  Based on your last screenshot, there's just no way you shouldn't be able to do 1920x1080.

Here's another question for you: are you recording audio as a separate wav?  Stick a generic SD card in the other slot; maybe that's your problem.  It'll automatically record the .wav to the other card.  I cannot do 1920x1080 + audio JUST on the CF either.

Shield

Quote from: Audionut on May 29, 2013, 10:43:54 AM
I think I've nailed it down to the card settings.  I changed from Record func. Auto switch card - to Standard and I can now record 1920x1080.  The buffer was still peaking hard on the third buffer indicator on first record and stayed stuck on the second indicator on a second recording.

Still sounds odd; see my last post.  Here's what mine does: it will show ** for about 2-3 seconds at the beginning of the recording, then drop to a single * and never move.  On both the 64GB Kompu-Serve card and the Lexar.  Exact same way.

Audionut

I don't do audio.  I get the wav files, but every piece of software I own opens them up empty.

Turns out all of my problems probably came from only gauging speed on the first recording.  Subsequent recordings go much better.

Shield

Quote from: Audionut on May 29, 2013, 11:02:41 AM
I don't do audio, I get the wav files but every software I own opens them up empty.

Turns out all of my problems are probably from only gauging speed on the first record.  Following recordings go much better.

I find it funny that "Audionut" doesn't do audio.
Go into the camera menu (not ML) and you'll see that audio has probably been disabled.  Happened to me when I loaded the last ML build; now the wav files actually have content.
Shawn

KMikhail

Interesting thread, here's my 2 cents:

For simplicity, let's model a CF/SSD in a very straightforward way: a linear space of multi-kilobyte pages.  Whenever any information is read or written, the whole page is read or written.  The write operation is pretty costly, both time-wise and wear-wise.  This is why we observe:

1) When writing in small blocks, every block takes the same time to write as a full page would.  That's why write speed grows so fast as the block size doubles, until it saturates.
2) When a block to be written crosses two pages, both pages have to be re-written.  That's why the speed gains slow down before saturation; otherwise they would be 2x, 2x, 2x, 1x...

Thus, when we flush a really big buffer, we mostly mitigate the unaligned first and last blocks (the only partially written pages).  However, if the controller isn't smart enough, every write of an inner block that crosses a page boundary will result in a double write.  It is debatable how this is actually handled.

What we want is to align the clusters to page boundaries AND set the cluster size equal to the page size.  The page size is, in fact, pretty large: definitely bigger than what we are used to setting on HDDs/SSDs.  North of 16 KB.

I don't know if they should be aligned to 0, but if we're talking about guesswork, the chances of finding the proper alignment with something like 1024 bytes are small.

This is why we see a performance spread over three parameters: cluster size, cluster offset, and buffer size.  They all interplay.
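The page-boundary effect described above can be sketched in a few lines of Python (the 16 KB page size is just an illustrative guess; real page sizes vary by card and are not published):

```python
def pages_touched(offset, length, page_size=16384):
    """Number of flash pages a write of `length` bytes starting at byte
    `offset` has to touch.  Each partially covered page forces the
    controller into a costly read-modify-write of the whole page."""
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return last - first + 1

# A page-aligned, page-sized cluster touches exactly one page...
print(pages_touched(0, 16384))      # 1
# ...but shift the same write by 1 KB and it now spans two pages,
# roughly doubling the work.
print(pages_touched(1024, 16384))   # 2
```

This is the interplay KMikhail describes: for a fixed cluster size, a bad cluster offset makes every cluster write straddle a page boundary, while perfect alignment makes each one a single-page write.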

What would be even more interesting is if ML had a destructive write test that sweeps across all three of these parameters, not just the last one.

My guess is that the write speed in manufacturers' datasheets is basically (page size) / (write cycle required to re-write it).  Maybe a perfectly formatted and buffered CF can get us closer to those values?

Point #2: Tests with SSDs partitioned to only 75% of capacity show higher performance, sometimes SIGNIFICANTLY higher.  Making the partition smaller may resolve the speed issues at the beginning of a write, and when the CF is getting close to full (I've heard about this).


Audionut

Quote from: Shield on May 29, 2013, 04:47:55 PM
I find it funny that "Audionut" doesn't do audio.

Good to see there's a sense of humor here  ;)
I do more photo than video, so I haven't tried that hard to look at the issue.  Having said that, though...

Quote from: Shield on May 29, 2013, 04:47:55 PM
Go into the camera menu (not ML) and you'll see that audio has probably been disabled.  Happened to me when I loaded the last ML build; now the wav files actually have content.
Shawn

I'll go and hide in a corner now.  Because that was an easy fix :)  Thanks.

Audionut

Quote from: KMikhail on May 30, 2013, 12:26:50 AM
Point #2: Tests with SSDs partitioned to only 75% of capacity show higher performance, sometimes SIGNIFICANTLY higher.  Making the partition smaller may resolve the speed issues at the beginning of a write, and when the CF is getting close to full (I've heard about this).

I was aware of the improved longevity of SSDs with smaller partitions due to wear leveling.  And I've also seen, countless times, recommendations not to fill SSDs to capacity, as their performance drops dramatically when close to full.

But I can't recall the performance increasing from simply having the partition at 75% or so.  If that is what you were implying.

Shield

Quote from: Audionut on May 30, 2013, 01:20:24 AM
Good to see there's a sense of humor here  ;)
I do more photo then video so haven't really tried that hard to look at the issue.  Having said that though.

I'll go and hide in a corner now.  Because that was an easy fix :)  Thanks.

I can die happy now; I've actually contributed to helping someone on these boards.  The whole idea of a group of people coming together with various skill sets is what makes the ML project so interesting.  I'd like to think I take the "It ain't got no gas in it" (Karl from Slingblade) approach, but every little bit helps.  :)

noisyboy

Quote from: Shield on May 30, 2013, 03:50:12 AM
I can die happy now; I've actually contributed to helping someone on these boards.  The whole idea of a group of people coming together with various skill sets is what makes the ML project so interesting.  I'd like to think I take the "It ain't got no gas in it" (Karl from Slingblade) approach, but every little bit helps.  :)

Rest assured you helped me too!  Well... not JUST yet, but you will have once I get around to editing my last shoot and have to copy and paste those footers ;)

Keep it up bro  8)

KMikhail

Quote from: Audionut on May 30, 2013, 01:36:34 AM
I was aware of the longevity of SSD's with smaller partitions due to wear leveling.  And I've also seen countless times the recommendations of not filling SSD's to capacity, as their performance reduces dramatically when close to full.

But I can't recall the performance increasing from simply having the partition at 75% or so.  If that is what you were implying.

http://www.anandtech.com/show/6884/crucial-micron-m500-review-960gb-480gb-240gb-120gb/3

Click around tables.
Cheers.

Audionut

Quote from: KMikhail on May 30, 2013, 05:36:22 AM
Click around tables.
Cheers.

Quote
To generate the data below I took a freshly secure erased SSD and filled it with sequential data.  This ensures that all user accessible LBAs have data associated with them.  Next I kicked off a 4KB random write workload across all LBAs at a queue depth of 32 using incompressible data.  [...]  I recorded instantaneous IOPS every second for the duration of the test.  I then plotted IOPS vs. time and generated the scatter plots below.  [...]  If you want to replicate this on your own all you need to do is create a partition smaller than the total capacity of the drive and leave the remaining space unused to simulate a larger amount of spare area.

From what I can gather, he's running the tests on full SSD's, and then again with 25% unpartitioned to simulate free space.

Quote from: Audionut on May 30, 2013, 01:36:34 AM
And I've also seen countless times the recommendations of not filling SSD's to capacity, as their performance reduces dramatically when close to full.

But I can't recall the performance increasing from simply having the partition at 75% or so.

So, he's not getting a performance increase just from having a partition at 75% of capacity.  He's getting the increase in those graphs because he is otherwise filling the SSDs to capacity while leaving some unpartitioned space for wear leveling.

KMikhail

Quote from: Audionut on May 30, 2013, 06:05:32 AM
From what I can gather, he's running the tests on full SSD's, and then again with 25% unpartitioned to simulate free space.

So, he's not getting a performance increase just from having a partition at 75% of capacity.  He's getting the increase in those graphs because he is otherwise filling the SSDs to capacity while leaving some unpartitioned space for wear leveling.

...and here is the catch: as you have heard, some cards drop their performance significantly, and it's not necessarily recoverable.  TRIM is available on SSDs with advanced controllers, and I am not sure about CF cards.  He ran the tests for a fair amount of time and data read/written.  Plus, frames get dropped closer to the moment the card is full anyway.  Garbage collection takes a pretty long time, and again, that's on SSDs.

Overall it is food for thought, nothing solid, as we don't know for sure what's going on inside the CF's brains.

Audionut

Quote from: KMikhail on May 30, 2013, 06:27:53 AM
Overall it is food for thought, nothing solid, as we don't know for sure what's going on inside the CF's brains.

Indeed.  I haven't looked, but are CF cards using the same type of NAND chips?  Perhaps CF doesn't need the same amount of wear leveling and love that SSDs require for their speed and longevity.

In the couple of quick tests I've done, performance only seems to drop very close to card capacity, which sort of negates the advantage of leaving extra unpartitioned space.

The biggest factor, though, IMHO, is that the average CF card is already small.

Naito

Quote from: noix222 on May 29, 2013, 05:15:59 AM
Any idea how I can format my card like that on a Mac?  Thanks guys.

In Terminal:

sudo newfs_msdos -b CLUSTERSIZE /dev/rdisk#s#

e.g.

sudo newfs_msdos -b 32768 /dev/rdisk3s1

to format disk3 partition 1 with 32 KB clusters.

Find your disk/partition numbers with:

diskutil list

Shield

Quote from: Shield on May 29, 2013, 08:12:25 AM
Footer was AUTO saved!  17467 dng files to process.

THANK YOU MAGIC LANTERN! :)

Hmmm... I knew this happened in an earlier build.  Shot the same scenario tonight (a full 64GB "take") and the footer wasn't auto-saved.  Wonder what has changed?

xNiNELiVES

So what is the best cluster size?  I have a 32GB Komputerbay 1000x...

Also, is it really better to align the partition of the card?  If so, what percentage should be used?

1%

CF: I got the best results with the largest cluster size possible (FAT32) and 4096 alignment.  For SD cards, SD Formatter did an OK job.  For reformatting FAT32 cards to exFAT, I think 128k clusters worked well.  Too large and it slowed down; I think it's 128k from the factory for cards over 32GB.


aaphotog

I am using a 16GB SanDisk 600x CF
and a 64GB Komputerbay 1000x CF.
What is the best cluster size, and how do I format these cards exFAT with the right cluster size?
I'm on a Mac, and as of now I've just formatted exFAT with Disk Utility.