P96 V3.2.4 and mmu.library usage

  • v3.2.4 comes with this change (among other things):

    Quote

    Removed the manual MMU hack. Any MMU table modifications require now the availability of the mmu.library.

    This is really good; I can finally get rid of my workarounds in the driver!


    But - It seems to me that there is no error check for a missing mmu.library; i.e. if the CARD driver asks for IMPRECISE/NONSERIAL (BIF_CACHEMODECHANGE) it looks like it's silently ignored in that case.


    Also, it seems rtg.library only sets the MAPP_IMPRECISE + MAPP_NONSERIALIZED bits when calling SetProperties(), and not the MAPP_CACHEINHIBIT.

    According to the mmu.library docs MAPP_CACHEINHIBIT is required for correct operation.


    Ofc - it might just be me doing something silly ;-)

  • What driver are you writing?


    Cacheinhibit will only be required if you do a mix of CPU and Blitter movement. Ideally, it does not make a difference, as read-modify-write operations with the CPU are way too inefficient and you should avoid them by any means.


    If you have a "no blitter" mode where the CPU truly needs to read from GFX mem, cache inhibit is still not a good idea, as the CPU is the only writing entity to GFX mem.


    So you will really only need cache inhibit if you do a mix of CPU and blitter operations. This will make it sufficiently slow to push yourself towards avoiding the CPU, allowing cache again. So.. from my point of view, everything is fine with this setting.

    Quote

    Ofc - it might just be me doing something silly ;-)

    Cache coherency is a complex topic. I wouldn't go as far as calling mixed CPU/blitter operations "silly". In my view, it's an intermediate development step.

  • This is the driver for the RTG board in Amiga core for the FPGAArcade Replay. A "virtual" graphics board, implemented in VHDL. It does have a blitter.


    From my end I see a behavioral difference after the change to use mmu.library.

    The old rtg.library would (seemingly) set the area imprecise/cacheinhibit.

    The new rtg.library seemingly only sets the IMPRECISE+NONSERIALIZED bits when calling mmu.library/SetProperties().

    Afaik there is no "imprecise+copyback" on the 060, but perhaps that's just me reading the 68060 manual incorrectly.


    It sounds like you're confirming rtg.library is calling SetProperties() with only IMPRECISE+NONSERIALIZED and not CACHEINHIBIT?

    The mmu.library documentation says that this will not produce a correct result.

    From mmu.doc:

    Code
    MAPP_IMPRECISE - The page will be marked as
    "imprecise exception". MAPP_CACHEINHIBIT is mandatory
    in this case or this flag does nothing. Only
    available for the 060, ignored and read as zero
    by all others.
    MAPP_NONSERIALIZED - The page will be marked as
    serialized. MAPP_CACHEINHIBIT is mandatory if this
    property is selected. Only available for the 040,
    ignored and read as zero by all others.
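
The rule quoted from mmu.doc can be modelled as a small predicate. This is only a minimal sketch: the `DEMO_*` constants and `relaxed_ordering_effective()` are hypothetical names with placeholder bit values, not the real `MAPP_*` definitions from the mmu.library headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration; NOT the real MAPP_* values. */
#define DEMO_CACHEINHIBIT  (1u << 0)
#define DEMO_IMPRECISE     (1u << 1)
#define DEMO_NONSERIALIZED (1u << 2)

/* Per the quoted mmu.doc text, IMPRECISE and NONSERIALIZED only take
 * effect when CACHEINHIBIT is requested as well; otherwise they do nothing. */
bool relaxed_ordering_effective(uint32_t flags)
{
    bool wants_relaxed = (flags & (DEMO_IMPRECISE | DEMO_NONSERIALIZED)) != 0;
    bool inhibited     = (flags & DEMO_CACHEINHIBIT) != 0;
    return wants_relaxed && inhibited;
}
```

Under this model, a request carrying only IMPRECISE+NONSERIALIZED evaluates to false, which matches the "no-op" reading of the documentation discussed in this thread.
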
  • We're running a true 68060 rev6, no EC/LC model. But that's really beside the point.


    If you're calling mmu.library/SetProperties() with only IMPRECISE+NONSERIALIZED (and not setting the CACHEINHIBIT bit) then the doc (and my observation) suggests that it's not doing what it's "supposed" to be doing.


    Can you please double check the parameters passed?

  • I had to talk back with Thor about some details:


    P96 behaviour with the mmu lib present has not changed - it's only the "manual trickery" that has been removed. The driver for P-IV had the same trickery, and has also been removed.


    Cache stuff is more complex - as I wrote earlier:

    The decision if cache is on or off is ultimately done by the CPU library, or in case of Thor's CPU libs it's the mmu lib during booting: Anything that does not present itself as a memory expansion (i.e. "for freemem list") will be marked "cache inhibit" because there's a chance of IO registers in that area. This is the case for all gfx cards, as none of that memory appears in the exec memory list.


    This is "step one".


    "Step two" is optional: The card driver *can* tell P96 to ignore the write order if it knows that it's "only memory" and not IO registers. That's where NONSERIALIZED and IMPRECISE are potentially set. As you may know, memory does not care in what order it's being written to, but IO registers may: Remember the Tseng chip on the Merlin card? It has the ability to map its Blitter Accelerator registers into the VGA memory area, and if that area is marked "NONSERIALIZED", it will fail (that was the case prior to the fix). We've corrected that with the last update, and customers reported that previous bugs are fixed with this.


    If for some reason caching has been switched on previously, then NONSERIALIZED and IMPRECISE have no meaning anyway because, again, memory does not care in what order it's being written to. After all, it's called *Random* Access Memory :-) (and you have quoted the documentation yourself where it's explained that for these bits to have a meaning, it's mandatory to set cache inhibit).


    So unless you've created shared memory in your system that for some reason is in the freemem list, but can also be used as gfx mem, you're good. If you have such shared memory, you may need to take care of cache inhibit yourself.
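
The two steps described in this post can be sketched as a small decision function. It is a simplified model of the explanation above, not code from P96 or mmu.library; every type and function name here is made up for illustration.

```c
#include <stdbool.h>

typedef enum { CACHE_ENABLED, CACHE_INHIBITED } cache_state;

typedef struct {
    bool in_exec_memlist;      /* region is part of the exec free memory list */
    bool driver_says_pure_ram; /* driver asserts there are no IO registers    */
} region_info;

typedef struct {
    cache_state cache;
    bool relaxed_write_order;  /* IMPRECISE / NONSERIALIZED in effect */
} region_mode;

region_mode resolve_mode(region_info r)
{
    region_mode m;

    /* Step one: anything that is not a memory expansion is marked
     * cache-inhibited, since it might contain IO registers. */
    m.cache = r.in_exec_memlist ? CACHE_ENABLED : CACHE_INHIBITED;

    /* Step two (optional): the driver may relax write ordering, which
     * only has an effect on an area that is already cache-inhibited. */
    m.relaxed_write_order =
        (m.cache == CACHE_INHIBITED) && r.driver_says_pure_ram;

    return m;
}
```

In this model a typical graphics card (not in the memory list, pure RAM) ends up cache-inhibited with relaxed ordering, a card with IO registers mapped into its memory window stays strictly ordered, and shared memory from the free memory list stays cached unless the driver handles cache inhibit itself.
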

  • Quote

    P96 behaviour with the mmu lib present has not changed

    This is correct. What I'm saying is that the P96 behavior without having mmu.lib present has changed.


    If you boot a system with an older rtg.library, without having installed mmu.lib, and the driver requests BIF_CACHEMODECHANGE, then the memory area reported via MemorySpaceBase/Size will be marked CACHEINHIBIT.


    This is not what happens with the new rtg.library (when having mmu.lib installed); the area is not marked CACHEINHIBIT.

    As outlined in the first post, there is also no warning/error shown if rtg.library cannot find mmu.library.


    Quote

    So unless you've created shared memory in your system that for some reason is in the freemem list, but can also be used as gfx mem, you're good. If you have such shared memory, you may need to take care of cache inhibit yourself.

    This is exactly the situation I have; The memory used for RTG is "any" memory, which (if mmu.lib is installed) defaults to COPYBACK.

    I already have the workaround you suggest in the CARD driver today, because it was required when using the older rtg.library together with mmu.lib.


    But what I'm really confused about is "what does IMPRECISE w/o CACHEINHIBIT actually mean?".

    Reading the 68060 manual I can't find any support for marking pages like that. Maybe I'm blind :-)

    To me there are only 4 modes : CACHED (copyback), CACHED (writethrough), CACHEINHIBIT (precise), CACHEINHIBIT (imprecise).

    So what do you suggest this cache-enabled-but-imprecise mode actually is?


    That's why I'm suggesting you need to also provide the CACHEINHIBIT flag together with IMPRECISE+NONSERIALIZED, when calling mmu.lib/SetProperties, because not doing that will be a no-op (as per mmu.lib documentation).
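
For reference, the four modes listed above map onto the two-bit cache mode (CM) field of a 68060 page descriptor. The sketch below decodes that field. The bit position (bits 6..5) and the encodings are stated to the best of my reading of the MC68060 User's Manual and are worth double-checking there; the type and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* 68060 page descriptor cache modes (CM field), assumed encoding. */
typedef enum {
    CM_WRITETHROUGH      = 0, /* cachable, writethrough       */
    CM_COPYBACK          = 1, /* cachable, copyback           */
    CM_INHIBIT_PRECISE   = 2, /* cache-inhibited, precise     */
    CM_INHIBIT_IMPRECISE = 3  /* cache-inhibited, imprecise   */
} cm68060;

/* Extract the CM field, assumed to sit in bits 6..5 of the descriptor. */
cm68060 decode_cm(uint32_t page_descriptor)
{
    return (cm68060)((page_descriptor >> 5) & 0x3u);
}

/* True for the two cachable modes. */
bool cm_is_cached(cm68060 cm)
{
    return cm == CM_WRITETHROUGH || cm == CM_COPYBACK;
}

/* Imprecise exceptions exist only as a variant of cache-inhibited. */
bool cm_is_imprecise(cm68060 cm)
{
    return cm == CM_INHIBIT_IMPRECISE;
}
```

With only two bits available, no encoding is both cached and imprecise, which is the point made in the post above.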

  • Maybe another way to look at it:

    With the old rtg.library (before the "introduction" of mmu.lib) the "contract" (from my perspective, at least) was that when the driver requested BIF_CACHEMODECHANGE, this meant the memory would switch to IMPRECISE and CACHEINHIBIT.

    That's at least the observation I had at the time.

    Granted - I have not explicitly confirmed this to be true (by disassembling the code, or looking at the MMU table afterwards).


    I think you're suggesting that this wasn't (and still isn't) true, but the contract was/is to merely switch from PRECISE to IMPRECISE (if, and only if, the area was already marked CACHEINHIBIT)?

  • Alright.

    In order to actually verify what rtg.library did to the MMU table prior to v3.2.4 I dumped the MMU tables using mmulist, before and after loading my card driver.


    The result looks like this:


    The RTG memory area is between 01010000 and 0140F000, which is part of system fastram (01000000 - 03FFF000).

    As expected the area goes from "copyback" to "imprecise" (which means non-cached imprecise).


    To me this indicates that the contract for BIF_CACHEMODECHANGE is to actually change the memory area to "Noncachable, Imprecise".


    If I do the same exercise with P96 v3.2.4 (obviously with mmu.lib installed this time):


    i.e. in this case the MMU flags for the memory region used by the RTG card do not change.

  • Please take a look at the P96 documentation (guide):


    Code
    DoNotSetMMU: if set to 'Yes', some MMU optimisations on 040 and 060 based
                system setups are not performed. This will lead to a performance
                reduction to some degree but might be necessary with certain
                system setups. Experiment with this if you keep getting
                problems that seem to be specific for your system only.


    This contains two key points:


    1) The flag for setting MMU mode in the board driver can always be overridden by the user, so it is definitely NOT suitable to set any area as "Cache inhibit" if the CPU library didn't do that in the first place.


    2) the text above clearly says that this is about *optimizations*, but not about a "correct" cache handling for gfx cards. After all, it's not P96's task, but the CPU library or the driver itself must make sure that cache handling is right.


    Please make sure to get your information from available documentation. MMU library, CPU library and P96 are openly documented, so there is no reason to start trying things and arguing over results of trial-and-error. We are publishing P96 documentation to AVOID trial-and-error. Things have changed since we took over P96 :-)

  • Oh, I wasn't suggesting using trial-and-error as a basis for understanding the operation.

    I was merely trying to confirm the operation based on what the docs suggest the operation actually is (or rather, supposed to be).


    I'm referring to the currently available online documentation hosted by icomp.de - nothing else.


    So if I understand you correctly BIB_CACHEMODECHANGE is

    a) actually not guaranteed to change the memory page to IMPRECISE/NONSERIALIZED

    b) only a hint/suggestion to P96; not a "requirement" as such

    ?


    I'm still not sure what negative implications there would be if rtg.library simply also provided the MAPP_CACHEINHIBIT flag when calling mmu.lib/SetProperties.

    To me this would actually make the operation in line with the documentation (linked above).


    Is there any negative consequence of doing that?

  • Ok, I see you've now changed the documentation re BIB_CACHEMODECHANGE; would've been a bit more honest to acknowledge that incorrect documentation up front.


    I assume that's also why a missing mmu.library will silently fail?

  • Quote

    So if I understand you correctly BIB_CACHEMODECHANGE is

    a) actually not guaranteed to change the memory page to IMPRECISE/NONSERIALIZED

    b) only a hint/suggestion to P96; not a "requirement" as such

    ?

    I believe that I have elaborated enough on the relation between cacheinhibit and Imprecise/nonserialized.


    Quote

    Is there any negative consequence of doing that?

    As answered before: You need to know if there are other bus masters on the gfx mem, and if there are IO registers in the area. It is NOT the responsibility of P96 to assure that, as only the hardware manufacturer can answer these questions. So please don't ask me - I am not making that hardware.


    Quote

    Ok, I see you've now changed the documentation re BIB_CACHEMODECHANGE; would've been a bit more honest to acknowledge that incorrect documentation up front.

    Seems like Thor has moved documentation from the developer archive to the Wiki, correct. I must admit that it can be very confusing if documentation in the archive is different from the Wiki. Prior to today's changes (which add over 8k of text), the Wiki only covered the December 2021 version of P96.


    Quote

    I assume that's also why a missing mmu.library will silently fail?

    It will not open a requester or even do any action because P96 must assume that the MMU config is correct to begin with. P96 is not there to ensure memory and IO access is actually working - this task is to be completed by a different entity (OS, MMU lib, CPU lib or hardware-init tool by the hardware manufacturer).


    It would not be "clean" to correct any mistakes, and the mere possibility for the user to manipulate cache settings may even qualify as "feature creep", with negative side-effects if users are attempting to "optimize" things that shouldn't be done this way. We may or may not remove things in the future if they turn out to backfire in terms of support - the goal is to provide something that "just works". So while you *could* argue "with great features comes great responsibility", it is sometimes better to limit user control in order to reduce the possibility for error.

  • Quote


    P96 is not there to ensure memory and IO access is actually working

    Right - I see your point.

    My perspective was that it used to do that (to some degree at least), and the (old) docs supported that.



    In an attempt to summarize:
    In 3.2.4, as it now relies completely on mmu.library, the meaning of BIB_CACHEMODECHANGE has changed to merely be a hint/suggestion - never a guarantee.

    With previous versions, and when mmu.library wasn't present, P96 would set the MMU flags directly. This mechanism isn't supported anymore.
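
The "hint, not guarantee" reading from this thread can be sketched as follows. This is a model of the behaviour as described by the posters, not P96 source code; the struct, its fields and the function name are hypothetical.

```c
#include <stdbool.h>

typedef struct {
    bool mmu_library_present;  /* could mmu.library be opened?            */
    bool do_not_set_mmu;       /* user set the DoNotSetMMU option to Yes  */
    bool area_cache_inhibited; /* state set up by the CPU/MMU library     */
} system_state;

/* Returns true only if a cache mode change request from the card driver
 * would actually alter how accesses to the board memory are handled. */
bool cache_mode_hint_applied(system_state s)
{
    if (!s.mmu_library_present)
        return false;              /* silently ignored, no requester */
    if (s.do_not_set_mmu)
        return false;              /* the user may override the driver */
    return s.area_cache_inhibited; /* flags are a no-op on cached pages */
}
```

A driver whose board memory lives in ordinary, cached system memory (as in the FPGA case above) would get `false` here in every configuration, so it has to arrange cache inhibition itself if it needs it.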