Investigating cluster size in relation to card performance

Started by Shield, May 29, 2013, 12:51:06 AM

KMikhail

Interesting thread, here's my 2 cents:

For simplicity, let's consider CF/SSD in a very straightforward way: a linear space of multi-kilobyte pages. Whenever any information has to be read or written, the whole page is read or written. The write operation is pretty costly, both time-wise and wear-wise. This is why we observe:

1) When writing in small blocks, every block actually takes about the same time to write as a full page would. That's why write speed grows so quickly as the block size doubles, until it saturates.
2) When the block to be written straddles two pages, both pages have to be rewritten. That's why the speed gains slow down before saturation; otherwise they'd be 2x, 2x, 2x, 1x...

Thus, when we flush a really big buffer, we mostly mitigate the unaligned first and last blocks (the only partly written pages). However, if the controller isn't smart enough, every inner block that crosses a page boundary will result in a double write. How it is actually handled at that point is debatable.

What we want is to align the clusters to the page boundaries AND set the cluster size to the page size. The page size is, in fact, pretty large, definitely bigger than what we are used to setting on HDDs/SSDs: north of 16 KB.

I don't know if they should be aligned to offset 0, but if we're down to guesswork, the chances of hitting the proper alignment with something like 1024 bytes are small.

This is why we see performance spread over three parameters: cluster size, cluster bias (offset), and buffer size - they all interplay.
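To make the interplay concrete, here's a quick Terminal sketch of how many pages a single write touches (the 32 KB page size and the 1 KB bias are made-up numbers for illustration):

PAGE=32768       # assumed page size in bytes
OFFSET=1024      # cluster bias: byte offset where the write starts
SIZE=32768       # buffer/cluster size being written
FIRST=$(( OFFSET / PAGE ))
LAST=$(( (OFFSET + SIZE - 1) / PAGE ))
echo "pages touched: $(( LAST - FIRST + 1 ))"
# OFFSET=0 touches 1 page; OFFSET=1024 touches 2, i.e. double the page writes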

What would be even more interesting is if ML had a destructive write test that swept across all three of these parameters, not just the last one.
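In the meantime, a rough desktop approximation is to sweep the buffer size with dd and reformat the card between runs to sweep the cluster size and alignment. A sketch, assuming the card is mounted at /Volumes/CARD (placeholder path; the test file is deleted after each pass):

# write ~256 MB per pass at each buffer size and let dd report the speed
for BS in 16384 32768 65536 131072 262144; do
  COUNT=$(( 268435456 / BS ))
  echo "buffer size: $BS bytes"
  dd if=/dev/zero of=/Volumes/CARD/testfile bs=$BS count=$COUNT 2>&1 | tail -1
  rm /Volumes/CARD/testfile
done
# note: OS write caching can inflate the numbers; a large total per pass helps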

My guess is that the write speed in manufacturers' datasheets is basically (page size) / (time of the write cycle required to rewrite it). Maybe a perfectly formatted and buffered CF can get us closer to those values?

Point #2: tests with an SSD partitioned to only 75% of capacity show higher performance - sometimes SIGNIFICANTLY higher. Making the partition smaller may resolve the speed issues at the beginning of writing and when the CF is getting close to full (I've heard about this).


Audionut

Quote from: Shield on May 29, 2013, 04:47:55 PM
I find it funny that "Audionut" doesn't do audio.

Good to see there's a sense of humor here  ;)
I do more photo than video, so I haven't really tried that hard to look at the issue.  Having said that though...

Quote from: Shield on May 29, 2013, 04:47:55 PM
Go into the camera menu (not ML) and you'll see that audio has probably been disabled.  Happened to me when I loaded the last ML build; now the wav files actually have content.
Shawn

I'll go and hide in a corner now.  Because that was an easy fix :)  Thanks.

Audionut

Quote from: KMikhail on May 30, 2013, 12:26:50 AM
Point #2: tests with an SSD partitioned to only 75% of capacity show higher performance - sometimes SIGNIFICANTLY higher. Making the partition smaller may resolve the speed issues at the beginning of writing and when the CF is getting close to full (I've heard about this).

I was aware of the improved longevity of SSDs with smaller partitions due to wear leveling.  And I've also seen countless recommendations not to fill SSDs to capacity, as their performance drops dramatically when close to full.

But I can't recall performance increasing from simply having the partition at 75% or so, if that is what you were implying.

Shield

Quote from: Audionut on May 30, 2013, 01:20:24 AM
Good to see there's a sense of humor here  ;)
I do more photo than video, so I haven't really tried that hard to look at the issue.  Having said that though...

I'll go and hide in a corner now.  Because that was an easy fix :)  Thanks.

I can die happy now; I've actually contributed to helping someone on these boards.  The whole idea of a group of people coming together with various skill sets is what makes the ML project so interesting.  I'd like to think I take the "It ain't got no gas in it" (Karl from Slingblade) approach, but every little bit helps.  :)

noisyboy

Quote from: Shield on May 30, 2013, 03:50:12 AM
I can die happy now; I've actually contributed to helping someone on these boards.  The whole idea of a group of people coming together with various skill sets is what makes the ML project so interesting.  I'd like to think I take the "It ain't got no gas in it" (Karl from Slingblade) approach, but every little bit helps.  :)

Rest assured you helped me too! Well... not JUST yet, but you will have done once I get around to editing my last shoot and copying and pasting those footers ;)

Keep it up bro  8)

KMikhail

Quote from: Audionut on May 30, 2013, 01:36:34 AM
I was aware of the improved longevity of SSDs with smaller partitions due to wear leveling.  And I've also seen countless recommendations not to fill SSDs to capacity, as their performance drops dramatically when close to full.

But I can't recall performance increasing from simply having the partition at 75% or so, if that is what you were implying.

http://www.anandtech.com/show/6884/crucial-micron-m500-review-960gb-480gb-240gb-120gb/3

Click around the tables.
Cheers.

Audionut

Quote from: KMikhail on May 30, 2013, 05:36:22 AM
Click around the tables.
Cheers.

Quote
To generate the data below I took a freshly secure erased SSD and filled it with sequential data. This ensures that all user accessible LBAs have data associated with them. Next I kicked off a 4KB random write workload across all LBAs at a queue depth of 32 using incompressible data. ... I recorded instantaneous IOPS every second for the duration of the test. I then plotted IOPS vs. time and generated the scatter plots below. ... If you want to replicate this on your own all you need to do is create a partition smaller than the total capacity of the drive and leave the remaining space unused to simulate a larger amount of spare area.

From what I can gather, he's running the tests on full SSDs, and then again with 25% left unpartitioned to simulate free space.
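For anyone who wants to replicate that methodology on their own drive, fio can generate the same workload. A sketch (Linux, and DESTRUCTIVE - the device path and run time are placeholders):

# 4KB random writes at queue depth 32 across the whole device,
# logging average IOPS once per second, as in the article's scatter plots
sudo fio --name=randwrite --filename=/dev/sdX --rw=randwrite --bs=4k \
    --iodepth=32 --ioengine=libaio --direct=1 \
    --time_based --runtime=1800 \
    --write_iops_log=iops --log_avg_msec=1000
# (the article first fills the drive sequentially; --rw=write does that)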

Quote from: Audionut on May 30, 2013, 01:36:34 AM
And I've also seen countless recommendations not to fill SSDs to capacity, as their performance drops dramatically when close to full.

But I can't recall performance increasing from simply having the partition at 75% or so.

So he's not getting the performance from simply having a partition at 75% of capacity.  He's getting the performance increase in those graphs because he is otherwise filling the SSDs to capacity, while leaving some unpartitioned space there for wear leveling.

KMikhail

Quote from: Audionut on May 30, 2013, 06:05:32 AM
From what I can gather, he's running the tests on full SSDs, and then again with 25% left unpartitioned to simulate free space.

So he's not getting the performance from simply having a partition at 75% of capacity.  He's getting the performance increase in those graphs because he is otherwise filling the SSDs to capacity, while leaving some unpartitioned space there for wear leveling.

...and here is the catch - as you have heard, some cards drop their performance significantly, and it's not necessarily recoverable. TRIM is available on SSDs with advanced controllers; I'm not sure about CF cards. He ran the tests for a fair amount of time/data read and written. Plus, frames get dropped closer to the moment the card is full anyway. Garbage collection takes a pretty long time, and again, that's on SSDs.

Overall it is food for thought, nothing solid, as we don't know for sure what's going on in the CF's brains.

Audionut

Quote from: KMikhail on May 30, 2013, 06:27:53 AM
Overall it is food for thought, nothing solid, as we don't know for sure what's going on in the CF's brains.

Indeed.  I haven't looked, but are CF cards using the same type of NAND chips?  Perhaps CF doesn't need the same amount of wear leveling and love that SSDs require for their speed and longevity.

In the couple of quick tests I've done, performance only seems to drop when very close to the card's capacity, which sort of negates the advantage of leaving an extended unpartitioned space.

The biggest factor though, IMHO, is that the average CF card is already small.

Naito

Quote from: noix222 on May 29, 2013, 05:15:59 AM
Any idea how I can format my card like that on a Mac? Thanks guys.

In Terminal (unmount the volume first with: diskutil unmountDisk /dev/disk#):
sudo newfs_msdos -b CLUSTERSIZE /dev/rdisk#s#

e.g.
sudo newfs_msdos -b 32768 /dev/rdisk3s1
to format disk3 partition 1 with 32k clusters

Find your disk/partition numbers with:
diskutil list
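For exFAT (asked about further down), macOS also ships newfs_exfat. I believe it takes a bytes-per-cluster flag the same way, but check man newfs_exfat before trusting this sketch:

diskutil unmountDisk /dev/disk3
sudo newfs_exfat -b 131072 /dev/rdisk3s1
# 128k clusters, assuming -b means bytes-per-cluster here too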

Shield

Quote from: Shield on May 29, 2013, 08:12:25 AM
Footer was AUTO saved!  17467 dng files to process.

THANK YOU MAGIC LANTERN! :)

Hmmm... I knew this happened in an earlier build.  Shot the same scenario tonight (full 64 GB take) and the footer wasn't auto-saved.  Wonder what has changed?

xNiNELiVES

So what is the best cluster size? I have a 32 GB KomputerBay 1000x...

Also, is it really better to align the partition of the card? If so, what percentage should be used?

1%

CF - I got the best results with the largest cluster size possible (FAT32) and 4096 alignment. For SD cards, SD Formatter did an OK job. For converting FAT32 cards to exFAT, I think 128k clusters worked well. Too large and it slowed down; I think it's 128k from the factory for cards over 32 GB.
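If you want to verify that 4096 alignment on a Mac, the partition's starting sector is visible from Terminal. A sketch, assuming an MBR-partitioned card showing up as disk3:

sudo fdisk /dev/rdisk3
# the start column is in 512-byte sectors; for 4096-byte alignment the
# partition's start sector should be a multiple of 8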


aaphotog

I am using a 16 GB SanDisk CF 600x
and a 64 GB KomputerBay 1000x CF.
What is the best cluster size, and how do I format these cards exFAT with the right cluster size?
I'm on a Mac, and as of now I've just formatted exFAT with Disk Utility.