UHS-I / SD cards investigation

Started by nikfreak, July 30, 2014, 05:46:56 PM


nikfreak

Just wanted to add that by downclocking (e.g. the EOS 6D runs at 96 MHz and could be downclocked to 12 / 24 / 48 MHz) we would of course get a lower write speed, but this may be useful in cases where continuous write speed is not very important (time lapses?). The advantage, in theory, is lower power consumption... According to the SD card specifications this should already happen if an old card (MMC in the firmware) is used: it should run at max. 48 MHz automatically.
[size=8pt]70D.112 & 100D.101[/size]

nikfreak

UHS-I capable cameras need to be switched to UHS104 & SDR104 (bus speed 104 MB/s) by using CMD6 function group 1.
I found some CMD6 commands, as stated earlier, in ARMu. Maybe someone can help me set my 6D, or any other UHS-I capable cam, to the new bus speed via a CMD6 switch command?

nikfreak

OK, I definitely need help from one of you ML dev gurus on this. I again spent some hours reading through all the official SD specification PDFs from v1.10 to 4.x, and I can say CMD6 is all that is needed to set the bus speed modes (SDR50, SDR104, DDR50...). There's one CMD6 SwitchCommand available in my 6D ROM, so it may be worth a try. Of course this only works for UHS-I cards. I also simply guess that the 5D3 is set to SDR25. a1ex tried once (see 1st page), but the SD spec says that if something is done wrong, the settings fall back to the default afterwards, and that is SDR12.
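
For reference, the CMD6 argument layout from the SD Physical Layer specification can be sketched in C. This is only an illustration, not code from any Canon ROM: the enum and the helper name are made up for this example. What comes from the spec is the bit layout (bit 31 selects "switch" vs "check", bits 23:0 hold six 4-bit function groups, 0xF means "keep the current function") and the function numbers of group 1.

```c
#include <stdint.h>

/* Function group 1 (access mode / bus speed) numbers from the SD spec. */
enum sd_access_mode {
    SD_MODE_SDR12  = 0,  /* default speed */
    SD_MODE_SDR25  = 1,  /* high speed */
    SD_MODE_SDR50  = 2,
    SD_MODE_SDR104 = 3,
    SD_MODE_DDR50  = 4
};

/* Hypothetical helper: build a CMD6 "switch" argument that changes only
 * function group 1 and leaves groups 2..6 untouched (0xF = no change). */
static uint32_t sd_cmd6_switch_access_mode(enum sd_access_mode mode)
{
    uint32_t arg = 0x80000000u;      /* bit 31 set: switch (clear = check only) */
    arg |= 0x00FFFFF0u;              /* groups 2..6: keep the current function */
    arg |= (uint32_t)mode & 0xFu;    /* group 1: requested access mode */
    return arg;
}
```

With this layout, SDR104 would be requested with the argument 0x80FFFFF3. A host would normally send the same argument with bit 31 cleared first, to check whether the card supports the function before switching.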


Levas

Wish I could help you...but have no idea how.

"There's one CMD6 SwitchCommand available in my 6D rom"

Sounds promising...

nikfreak

Quote from: a1ex on August 04, 2014, 08:36:18 AM
....for example, only log those messages from CSMgrTask.

Can someone give me detailed instructions, based on dm-spy-experiments, on how to log only "CSMgrTask" + "SdioDrv" + "SdioTsk"? I bought a faster SanDisk card which is rated at 60 MB/s write speed (before this I had a Panasonic Gold rated at max. 45 MB/s). I want to compare the logs I get from both cards, and yes, I am still trying to find where the bus speed modes are set. Googling around I found this, which contains stuff like

#define MMC_CAP_UHS_SDR12       (1 << 15)       /* Host supports UHS SDR12 mode */
#define MMC_CAP_UHS_SDR25       (1 << 16)       /* Host supports UHS SDR25 mode */
#define MMC_CAP_UHS_SDR50       (1 << 17)       /* Host supports UHS SDR50 mode */
#define MMC_CAP_UHS_SDR104      (1 << 18)       /* Host supports UHS SDR104 mode */
#define MMC_CAP_UHS_DDR50       (1 << 19)       /* Host supports UHS DDR50 mode */


and maybe SdioDrv can show what's going on in the 6D logs. So some hints on where I can define what gets logged would be helpful.

nikfreak

Please have a look at this example stub. It looks like I have worked out for myself what I can achieve with dm-spy-experiments, but please at least give me a hint whether this would be OK:

Now I take /src/dm-spy-extra.c and in there I would add something like this:


#ifdef CONFIG_6D
    { 0xyzblabla, "StateTransition", 4 , state_transition_log },
    { 0xFF78D814, "SDCheckStatus", WhatToPutHere??? }
#endif


Would that work for startup / "debug (don't click me)" and log some stuff related to SDCheckStatus, or doesn't this make any sense for identifying what's going on in other stubs?

kuga0509

Any luck on this?  Before I even read this thread, the only logical limiting factor I could find seemed to be that the software was limiting the hardware.  Since the hardware should support the SDR104 standard, and since the voltage and all of the other requirements would remain the same, it really seems that simple.  I hope this works out.  :)

The only other solution would be to load a different h264 profile to change the color space.  I know that isn't raw, but it may at least get the desired bit depth.  However, that would still require this same issue to be resolved since it would likely require a higher bus speed...

a1ex

Revisiting this.

At the suggestion of Ant123, I ran ML under the Xilinx version of QEMU, which also emulates UHS. This revealed the missing bits from my previous patch: on 5D3, although it was printing a message about 96MHz, the card interface was set up in the same way as for... 24 MHz (which explains the previous benchmark results).

Took the missing register configurations from 700D and...



8)




Now the details.

Xilinx QEMU includes UHS emulation (along with some other nice stuff) and is based on QEMU 2.6.x (at the time of writing). Currently we have patches for QEMU 2.5.0 and 2.9.0, and for the Xilinx version, it will be something in-between. If there is interest, I can commit the patches.

First, I ran 5D3 1.2.3 from the dm-spy-experiments branch on both the camera and QEMU 2.5.0. Result.

The problem:

CSMgrTask:ff6bdb7c:23:05: Set Hi-Speed Mode( 48MHz )                ; real hardware
[   CSMgrTask:ff6bdb9c ] (23:05) Set Normal-Speed Mode( 24MHz )     ; QEMU 2.5.0


That's where the Xilinx QEMU comes in. Result: after a minor patch to the SCR structure (first field - SpecVersion - changed from 0 to 1), both the log from camera and Xilinx QEMU now have:

CSMgrTask:ff6bdb44:23:05: SD_GetAccessMode=3
CSMgrTask:ff6bdb7c:23:05: Set Hi-Speed Mode( 48MHz )


Also, besides a minor difference in handling CMD1, and the card capacity, the emulation matches the reality pretty well. There's even some nice debug info if you uncomment DEBUG_SD in hw/sd/sd.c and use the "-d sd" switch on the command line. Result - in particular, these lines:

CSMgrTask:ff6b8fec:23:01: sdSetFunction( 16776961 ) Start
CSMgrTask:000aea50:00:00: *** sd_setup_mode(0x2), from ff484738
sd: CMD16 0x00000040 state 4
sd: Response: 00 00 09 00 state 4
CSMgrTask:000aea50:00:00: *** sd_setup_mode(0x2), from ff6b8300
sd: CMD6 0x80ffff01 state 4
sd: Function default selected (fn grp 2)
sd: Function high-speed/SDR25 selected (fn grp 1)
sd: Response: 00 00 09 00 state 5
sd: CMD13 0x45670000 state 4
sd: Response: 00 00 09 00 state 4
sd: CMD16 0x00000200 state 4
sd: Response: 00 00 09 00 state 4
CSMgrTask:ff6b91c8:23:01: sdSetFunction( 16776961 ) End
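
The CMD6 argument in this log can be decoded by hand. The sketch below assumes the standard CMD6 layout from the SD specification (bit 31 = switch/check, six 4-bit function groups in bits 23:0); the helper names are made up for illustration and are not part of QEMU or Magic Lantern.

```c
#include <stdint.h>

/* Bit 31 of a CMD6 argument: 1 = switch function, 0 = check only. */
static int cmd6_is_switch(uint32_t arg)
{
    return (int)((arg >> 31) & 1u);
}

/* Function number requested for group 1..6 (0xF means "no change"). */
static unsigned cmd6_group_function(uint32_t arg, unsigned group)
{
    return (unsigned)((arg >> (4u * (group - 1u))) & 0xFu);
}
```

For 0x80ffff01 this gives: switch mode, group 1 = 1 (high-speed/SDR25), group 2 = 0 (default), groups 3 to 6 = 0xF (unchanged), which agrees with the two "selected" lines printed by QEMU. Note also that the sdSetFunction argument, 16776961 = 0xFFFF01, is the same value without bit 31.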


Now we can analyze how the SD initialization code configures the hardware (MMIO registers). This is as easy as specifying "-d io" in QEMU's command-line. Result.

At this point I've applied the old patch and tried to figure out whether there's any obvious trouble. Result:

Without hack:

CSMgrTask:ff6bdb7c:23:05: Set Hi-Speed Mode( 48MHz )
CSMgrTask:000aeed0:00:00: *** sd_set_mode(0x1, 0x3), from ff6bdb88
(some registers)
CSMgrTask:000aeed0:00:00: *** sd_setup_mode(0x3), from ff484738
(some more registers)
CSMgrTask:ff6bc87c:23:01: sdReadBlk: st=0, num=1, buf=0x40983808


With hack:

CSMgrTask:ff6bdb5c:23:05: Set Hi-Speed Mode( 96MHz )
CSMgrTask:000aeef0:00:00: *** sd_set_mode(0x1, 0x4), from ff6bdb88
(some registers)
CSMgrTask:000aeef0:00:00: *** sd_setup_mode(0x4), from ff484738
(a few more registers, but many of them missing!)
CSMgrTask:ff6bc87c:23:01: sdReadBlk: st=0, num=1, buf=0x409837bc


In other words, sd_setup_mode(4) appears to skip some hardware configuration it might be supposed to perform. With other arguments, it sets a bunch of registers, and the argument appears to be related to SD clock speed (3 = 48MHz, 2 = 24MHz, 4 = 96MHz).

Let's see what it does on 700D. Result (also with a zoomed-in view).

Notice some additional registers on 700D (highlighted in red on the zoomed-in view). What's up with them?

The 0xC04004xx range is never set on 5D3, so these registers are probably specific to 700D hardware. Same for 0xC040063x and 0xC040064x. I didn't touch them.

The remaining registers, 0xC04006[012]x, are also set on 5D3, at other speed modes. In these other modes, their values are the same on 700D. You can see them here: sd_setup_mode(2), sd_setup_mode(4).

My hypothesis was that 5D3's SD controller is UHS-capable, but for some unknown reason (could be even problems during the initial tests), Canon decided not to include it in the firmware. As a result, some of the UHS initialization code (hopefully a small part) was optimized out.

So I've tried to take the missing register configurations from 700D.

Therefore, the patch for 5D3 1.2.3 becomes:

/* in dm-spy.c, right before dm_spy_extra_install() */
patch_instruction(0xff48446c, 0xe3a00000, 0xe3a00001, "SD 1.8V");

/* in dm-spy-extra.c */
static void sd_setup_mode_log(uint32_t* regs, uint32_t* stack, uint32_t pc)
{
    /* log the original call as usual */
    generic_log(regs, stack, pc);

    if (regs[0] == 4)
    {
        MEM(0xC0400600) = 3;
        MEM(0xC0400610) = 4;
        MEM(0xC0400614) = 0x1D000301;
        MEM(0xC0400618) = 0;
        MEM(0xC0400624) = 0x201;
        MEM(0xC0400628) = 0x201;
        MEM(0xC040061C) = 0x100;
        MEM(0xC0400620) = 4;
        MEM(0xC0400604) = 3;
    }
}

    /* under CONFIG_5D3_123 */
    { 0xFF4844A0, "sd_setup_mode", 1, sd_setup_mode_log },


If you decide to try it, make sure you don't have any important data on your card. Otherwise, you will be playing Russian Roulette with your data (just like with my other SD patch).

Does this apply to DIGIC 4 cameras?

I'm afraid not - the hardware configuration of these cameras is different (and a lot simpler). You now know where to look, so you can play with it, attempt to change the clock speed and report your findings.

Can the clock speed be pushed even further?

I have no idea. Feel free to play with these registers, run the benchmarks and report.

Can this be included in a module, to be used on a regular ML build?

That's hard, because the hack must be applied before the SD card gets initialized by Canon firmware (in other words, before loading any module, and also before loading the config file). So, even if we include it in ML core, it will be hard to create an option for it.

At the moment, the easiest way to try it would be a custom build. Probably best to start from the crop_rec_4k branch, as the backend support is there, and there is little reason to try this hack outside that branch.

It might be possible to switch the SD to a higher speed on the fly. Didn't investigate this approach.

How's this useful in practice?

Other than raw video recording on both CF and SD at the same time (aka card spanning), it's probably not very useful.

What's the maximum total speed (CF+SD)?

Load the benchmarks module (bench.mo) to find out.

How can I get similar results in QEMU?

Take a look at this post. In a nutshell, it's the dm-spy-experiments branch compiled with CONFIG_DEBUG_INTERCEPT_STARTUP=y for the camera, and additionally with CONFIG_QEMU=y for running it under the emulator. These two will give logs that can be directly compared, and with the logging options from QEMU, you can get additional details. Then, look for CSMgrTask in the logs, compare them and try to understand what it does. Also refer to SD docs (summarized nicely by nikfreak earlier in this thread) to understand the initialization protocol. That's it. If you get stuck, just ask (here or on IRC).
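
When comparing the two logs, it can help to keep only the lines from the SD-related tasks first. A minimal post-processing sketch, run on the PC rather than on the camera; the function name is hypothetical, and the task names are simply the ones mentioned in this thread. A plain substring match is used because the camera log and the QEMU 2.5.0 log format the task prefix differently.

```c
#include <string.h>

/* Task names mentioned in this thread; extend as needed. */
static const char *const sd_tasks[] = { "CSMgrTask", "SdioDrv", "SdioTsk" };

/* Returns 1 if the log line mentions one of the SD-related tasks. */
static int line_mentions_sd_task(const char *line)
{
    for (size_t i = 0; i < sizeof(sd_tasks) / sizeof(sd_tasks[0]); i++)
    {
        if (strstr(line, sd_tasks[i]) != NULL)
            return 1;
    }
    return 0;
}
```

Feeding each line of both logs through this filter leaves two shorter files that can be compared side by side.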

If there is interest, I can commit the patch for Xilinx QEMU and write a walkthrough similar to this one.

Please note I no longer have the UHS card to do more tests (it wasn't mine), so from now on, what will happen with this hack is entirely up to you.

aschille84

Wow, so this means we can get another 30 MB/s of write speed. I would like to try it out.

Levas

Nice finding.

One question about this one:

Can the clock speed be pushed even further?

I have no idea. Feel free to play with these registers, run the benchmarks and report.


How does this look on the other cams, like the 6D? I've always thought that UHS-I at 50 MB/s (SDR50) was the limit here.
Or is there still a small chance that it can reach UHS-I at 104 MB/s (SDR104)?

Ant123

Quote from: a1ex on June 18, 2017, 11:54:16 PM
My hypothesis was that 5D3's SD controller is UHS-capable, but for some unknown reason (could be even problems during the initial tests), Canon decided not to include it in the firmware. As a result, some of the UHS initialization code (hopefully a small part) was optimized out.

If you can not enter UHS mode with ML but it works without, read this topic.
Conclusion:
Most SD cards need to be reinitialized by switching off SD power if they already were in UHS mode.

a1ex

Quote from: a1ex on June 18, 2017, 11:54:16 PM
Can the clock speed be pushed even further?

Let's try some overclocking: sd_uhs.mo (to be loaded on top of crop_rec_4k branch). Got the idea after looking into the EOS M shutter bug and understanding how clocks work - to some extent. See the source below for RE notes.

Before and after. This is a slower card, compared to previous post.



Source: module_hginfo_dump.sh sd_uhs.mo

5D3 only for now, tested on 1.1.3 with 2 UHS cards and 2 regular ones.

Other DIGIC 5 models are on the todo list. Do not try pattern-matching the stubs for other models - it won't work. Only SD_ReConfiguration is generic code, the rest is 5D3-specific.

Levas

 :o

Is this the real april first surprise  8)

Levas

Do you think there's room left in the cameras that already have about 40 MB/s write speed, like, random pick, the Canon 6D?  :P

theBilalFakhouri

No way man! Wow! Is this Day of Surprises movie? :D
This gives a lot of hope for continuous 3K, or just continuous slow-mo, on small cameras!

And is there a benefit from merging the SD + CF write speeds for continuous UHD and 4K on the 5D3?

nikfreak

So this is SDR50 at how much speed? Guess beyond 96MHz?

a1ex


module_hginfo_dump.sh sd_uhs.mo



static uint32_t regs[] = { 0xC0400600, 0xC0400604,/*C0400608, C040060C*/0xC0400610, 0xC0400614, 0xC0400618, 0xC0400624, 0xC0400628, 0xC040061C, 0xC0400620 };   /* register addresses */
static uint32_t sd50[] = {        0x3,        0x3,                             0x4, 0x1D000301,        0x0,      0x201,      0x201,      0x100,        0x4 };   /* SDR50 values from 700D */
static uint32_t unck[] = {        0x3,        0x3,                             0x5, 0x1D000401,        0x0,      0x201,      0x201,      0x100,        0x5 };   /* underclocked values */
static uint32_t ovck[] = {        0x3,        0x3,                             0x3, 0x1D000201,        0x0,      0x201,      0x201,      0x100,        0x3 };   /* overclocked values */
static uint32_t twak[] = {        0xF,        0xF,                             0xF, 0x00000F00,        0x0,      0xF0F,      0xF0F,        0x0,        0xF };   /* what can be tweaked? */
// mode 0                                                                      17F  0x1D004101           0        403F        403F          7F          7F
// mode 1 16MHz                     3           3          1          1         1D  0x1D001001           0       0xF0E       0xF0E          1D          1D
// mode 2 24MHz                     3           3          1          1         13  0x1D000B01           0       0xA09       0xA09          13          13
// mode 3 48MHz                     3           3          1          1          9  0x1D000601           0       0x504       0x504       0x100           9
// mode 4 96MHz SDR50 700D          3           3          1          1          4  0x1D000301           0       0x201       0x201       0x100           4
// mode 5 serial flash?             3                                            7  0x1D000501           0       0x403       0x403       0x403           7
// mode 6                           7                                           13  0x1D000B01           0       0xA09       0xA09          13          13
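
The presets above can be sanity-checked off-camera. The sketch below assumes that twak[] is meant as a per-register bit mask of what may be changed (my reading of the "what can be tweaked?" comment, not something stated explicitly), and checks that the underclocked and overclocked presets differ from the SDR50 baseline only inside that mask. The checker function is made up for this example; the arrays are copied from the listing.

```c
#include <stdint.h>
#include <stddef.h>

#define NUM_SD_REGS 9

/* Values copied from the module source listing. */
static const uint32_t sd50[NUM_SD_REGS] = { 0x3, 0x3, 0x4, 0x1D000301, 0x0, 0x201, 0x201, 0x100, 0x4 };
static const uint32_t unck[NUM_SD_REGS] = { 0x3, 0x3, 0x5, 0x1D000401, 0x0, 0x201, 0x201, 0x100, 0x5 };
static const uint32_t ovck[NUM_SD_REGS] = { 0x3, 0x3, 0x3, 0x1D000201, 0x0, 0x201, 0x201, 0x100, 0x3 };
static const uint32_t twak[NUM_SD_REGS] = { 0xF, 0xF, 0xF, 0x00000F00, 0x0, 0xF0F, 0xF0F, 0x0, 0xF };

/* Returns 1 if preset differs from base only in bits that are set in mask. */
static int preset_within_mask(const uint32_t *base, const uint32_t *preset,
                              const uint32_t *mask, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if ((base[i] ^ preset[i]) & ~mask[i])
            return 0;   /* a bit outside the tweakable mask was changed */
    }
    return 1;
}
```

Both unck[] and ovck[] pass this check against sd50[]: they only touch 0xC0400610, 0xC0400620 and one nibble of 0xC0400614.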


Look at 0xC0400610 and 0xC0400620:

16 vs 24 MHz => 0x1D vs 0x13. Exact ratio match with 0x1E vs 0x14 (setting registers to X-1 is a common trick used in Canon hardware).
24 vs 48 MHz => 0x13 vs 0x9. Exact match with 0x14 vs 0xA.
48 vs 96 MHz => 0x9 vs 0x4. Exact match with 0xA vs 0x5.
Overclocked: 0x4 vs 0x3 => 125% theoretical speedup.
Benchmark result (read column): 54.4/43.1 = 126%. Check!

=> SDR50 @ 120 MHz.

0xC0400624 hi: exact match at 16/24 and 24/48, rounded at 48/96. 5/2/1.25 = 2 => exact match for the overclocked version?
0xC040061C ?!

Underclocking test: 0x4 vs 0x5 => 83.33% speed => 35.9MB/s expected (matches my benchmarks).
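
The arithmetic above can be written down as a small function. One loud assumption: the 480 MHz base clock is inferred from the table (96 MHz at register value 4, i.e. 96 x 5) and is not stated anywhere in this thread; the names are made up for this sketch.

```c
#include <stdint.h>

/* Assumption: base clock inferred from the table (96 MHz at register value 4). */
#define SD_BASE_CLOCK_KHZ 480000u

/* SD clock implied by the value written to 0xC0400610 / 0xC0400620,
 * assuming the hardware divides the base clock by (value + 1). */
static uint32_t sd_clock_khz(uint32_t divider_reg)
{
    return SD_BASE_CLOCK_KHZ / (divider_reg + 1u);
}
```

Under this assumption the four stock modes come out at exactly 16, 24, 48 and 96 MHz, the overclocked value 0x3 gives 120 MHz, and the underclocked value 0x5 gives 80 MHz, which is 5/6 = 83.33% of 96 MHz.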

Porting notes will follow. Basic idea: place a logging hook right after these registers are set, to be able to override them. 5D3 does not configure these registers in UHS mode (likely optimized out in Canon firmware), which is why pattern-matching the stub won't work on other models. Registers are refreshed on every sdReadBlk/sdWriteBlk. SD_ReConfiguration will reset the card, including power cycling. My call to that function is not thread safe - do not run the overclocking tests while other tasks are accessing the card. Debug with qemu -d io,sdcf (or xilinx-qemu, but I need to clean up and publish the patch for that).

nikfreak

woot.
I always thought SDR50 was limited by the host to 100 MHz, according to the screenshots I posted on page 1.
120 MHz rather sounds like we have SDR104, at least if Canon followed the official specs. Anyway, great job as always. I encourage 5D3 owners to test stability and report their findings; of course everyone wants to have this. If you have a unique way to handle / port this for DIGIC 5 cameras, then don't hesitate to post it @a1ex.

a1ex

Actually, one of my UHS cards (a slow one, that writes at 15MB/s after the hack) refuses the overclocked settings, but handles the regular SDR50 and the underclocked one.

I did try to change the function to SDR104, but it did not make a difference with any of my cards. 5D3 1.1.3 dm-spy-experiments:


static void sd_set_function_log(uint32_t* regs, uint32_t* stack, uint32_t pc)
{
    /* log the original call as usual */
    generic_log(regs, stack, pc);

    /* force UHS-I SDR104 */
    regs[0] = 0xff0003;
}


    { 0xFF6ADE34, "sdSetFunction", 1, sd_set_function_log },

nikfreak

Yay tricky one.
Ant123 says to switch the power off if the card is already in UHS mode. Maybe he can comment on it. Just a guess, but once we find out how to reinitialize the card, it could probably also help with the EOS M shutter bug, at the cost of longer boot times due to the reinitialization.

In 2014, while googling, I didn't find any host controllers that support only one of the two modes.

a1ex

OK, I was wrong. Applied the SDR104 hack again and got another speed improvement!




static void sd_set_function_log(uint32_t* arm_regs, uint32_t* stack, uint32_t pc)
{
    /* UHS-I SDR50? */
    if (arm_regs[0] == 0xff0002)
    {
        /* force UHS-I SDR104 */
        arm_regs[0] = 0xff0003;
    }
}
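
The decision made by this revised hook can be isolated as a pure function and tested off-camera. A minimal sketch: the macro and function names are made up here, only the two constants come from the hook above.

```c
#include <stdint.h>

#define SD_FUNC_SDR50   0xff0002u   /* value the hook compares against */
#define SD_FUNC_SDR104  0xff0003u   /* value the hook substitutes */

/* Upgrade an SDR50 request to SDR104; pass anything else through unchanged. */
static uint32_t override_sd_function(uint32_t requested)
{
    return (requested == SD_FUNC_SDR50) ? SD_FUNC_SDR104 : requested;
}
```

Unlike the first attempt, which overwrote the argument unconditionally, this version leaves other requests alone, for example the 0xFFFF01 (high-speed) request seen in the earlier 5D3 log.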


Module updated (same link).

Levas

Nice one  :D
Does the SD card you're using for this claim a read speed, like the SanDisks do: 45 MB/s, 80 MB/s, 90 MB/s or 95 MB/s?

a1ex

It doesn't claim anything special, but I noticed it can do about 70 MB/s in the card reader, using dd.


sudo dd if=/dev/mmcblk0 of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 14.5176 s, 74.0 MB/s
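
The rate on the last line of the dd output is plain division, using decimal megabytes (1 MB = 10^6 bytes), which is worth keeping in mind when comparing against benchmark figures. A small sketch of the same calculation; the function name is made up.

```c
#include <stdint.h>

/* Throughput in decimal megabytes per second (1 MB = 10^6 bytes),
 * the unit dd uses in its summary line. */
static double throughput_mb_per_s(uint64_t bytes, double seconds)
{
    return (double)bytes / seconds / 1e6;
}
```

For the run above, 1073741824 bytes in 14.5176 s works out to just under 74 MB/s.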


No luck with 160MHz yet.

IDA_ML

A1ex,

Is it possible to implement card spanning with the losslessly compressed 4K-crop recording modes on the 5D3?  That would be a breakthrough for this camera, I think, since it would provide much longer recording times at high resolutions.

Ant123

Quote from: nikfreak on April 02, 2018, 09:58:28 AM
Yay tricky one.
Ant123 states to switch power off if already in UHS mode. Maybe he can comment on it.

What can I comment? Just read the CHDK forum from here.
On PowerShots the card is already in UHS mode when CHDK loads, so most cards can't be switched to UHS mode a second time.
Maybe on DSLRs the card is not yet in UHS mode when autoexec.bin loads, so you don't need to turn off SD power.