Reporting Issues
Bug 2643 - Externally loaded runtime images are mapped RWX
Summary: Externally loaded runtime images are mapped RWX
Status: CONFIRMED
Alias: None
Product: EDK2
Classification: Unclassified
Component: Code (show other bugs)
Version: Current
Hardware: All All
: Lowest normal
Assignee: unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-03-29 18:50 UTC by Vitaly Cheptsov
Modified: 2023-03-12 12:21 UTC (History)
8 users (show)

See Also:
EDK II Code First industry standard specifications: ---
Branch URL:
Release(s) the issue is observed: UDK 2008
The OS the target platform is running: ---
Package: MdeModulePkg
Release(s) the issues must be fixed: EDK II Master
Tianocore documents:


Attachments
Image missing from memory attributes table (281.84 KB, image/png)
2020-03-29 18:51 UTC, Vitaly Cheptsov
Details
Linux mapping image as RWX with OVMF (368.89 KB, image/png)
2020-03-29 18:52 UTC, Vitaly Cheptsov
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Vitaly Cheptsov 2020-03-29 18:50:18 UTC
The title says it all basically. Commit 4a723ed258367471eac8b4ae32558f09ef65672e[1] introduced a regression, resulting in all drivers loaded after EndOfDxe to be mapped as RWX. This happens because the new image is not added to the image list[2], and then EnforceMemoryMapAttribute defaults to nothing for RuntimeCode memory[3]. As a result the memory attribute the operating system gets is just runtime (highest bit set), all the memory permission attributes are cleared.

According to UEFI specification 2.8A, Table 16. Usage of Memory Attribute Definitions, the behaviour for such attributes is "No memory access protection is possible for Entry". I.e. this is essentially RWX. Linux kernel is unhappy about it[3], but it adheres to the spec and the entry is actually mapped as RWX. There is a warning ("Entry attributes invalid: RO and XP bits both cleared"), which is not strictly correct as the spec does define this, but it does not affect the kernel behaviour.

We can confirm that at least OVMF (edk2-stable202002) and VMware (Fusion 11.5.2 15794494) firmwares are affected, as they are based off the recent EDK II. Other modern firmwares should also be affected. The easiest way to check is to load some runtime driver from EDK II Shell, and then boot Linux kernel with efi=debug. You will see something like:
[    0.189743] efi: memattr: Processing EFI Memory Attributes table:
[    0.189743] efi: memattr: Entry attributes invalid: RO and XP bits both cleared
[    0.189745] efi: memattr: ! 0x00000e48f000-0x00000e49afff [Runtime Code       |RUN|  |  |  |  |  |  |  |   |  |  |  |  ]

We are strongly concerned on how this is going to be addressed as the risks of an incorrect fix (for the second time) may break our software. We have runtime drivers loadable by the bootloader, and these drivers are to be executed during operating system runtime. For this reason the behaviour to prohibit OPROMs and other runtime drivers from being executable in the operating system is unacceptable to us. This is what the culprit commit[1] tried to do, but essentially failed. In our opinion this idea is harmful and seems to violate the specification in the first place, as it does not prohibit loading runtime drivers after EndOfDxe (see 2.1.4 UEFI Drivers for instance).

We believe that restricting SMM communication buffers to the regions of the drivers loaded before EndOfDxe is understandable. However, this should only mean that after EndOfDxe memory protection attributes should split. From there on we are to have:
- the attributes accessible to SMM, these could possibly only include EndOfDxe drivers.
- the attributes accessible to an OS, these must include drivers loaded after EndOfDxe (as required by the spec).

While unlikely, we are not positive that this problem cannot negatively affect any firmware implementations, so we are using a security channel for this bugreport. Given the nature of the problem we request this to be resolved in edk2-stable202005. Thanks for understanding!

[1] https://github.com/tianocore/edk2/commit/4a723ed258367471eac8b4ae32558f09ef65672e
[2] https://github.com/tianocore/edk2/commit/4a723ed258367471eac8b4ae32558f09ef65672e#diff-ddaea0dacd2306db98abefc701d9d05fR1084
[3] https://github.com/tianocore/edk2/blob/9d510e6/MdeModulePkg/Core/Dxe/Misc/PropertiesTable.c#L251-L253
[4] https://github.com/torvalds/linux/blob/4febfb8/drivers/firmware/efi/memattr.c#L69-L72
Comment 1 Vitaly Cheptsov 2020-03-29 18:51:26 UTC
Created attachment 499 [details]
Image missing from memory attributes table
Comment 2 Vitaly Cheptsov 2020-03-29 18:52:01 UTC
Created attachment 500 [details]
Linux mapping image as RWX with OVMF
Comment 3 Laszlo Ersek 2020-03-30 16:14:49 UTC
I'm trying to wrap my brain around the commit message in 4a723ed25836:

    We want to provide precise info in MemAttribTable
    to *both OS* and SMM, and SMM only gets the info at EndOfDxe.
    So we do not update RtCode entry in EndOfDxe.

(emphasis mine).

In other words, the patch doesn't seem to be about better protection for SMM -- SMM *already* restricts itself to the runtime-marked areas that come from platform drivers (i.e. those that are dispatched before EndOfDxe). Instead, the patch seems to aim at "saving" the OS from thinking that *any* runtime-marked memory area is accepted by SMM drivers for parameter passing, when in fact those SMM drivers will reject RT-areas established post-EndOfDxe.

Original patch set:

[edk2] [PATCH 0/6] Check untested memory and EFI_MEMORY_RO
http://mid.mail-archive.com/20180720052626.24932-1-hao.a.wu@intel.com
https://lists.01.org/hyperkitty/list/edk2-devel@lists.01.org/thread/HSEL6572KYE5DLSNILBBLFZ46E7AB5NX/

It seems like the first 5 patches in the series implement the protection from SMM, and the last patch makes sure that the OS knows exactly what SMM will accept -- at the cost of not seeing runtime drivers loaded after EndOfDxe.

I wonder what the impact is if we remove patch#6 (the subject commit) from the picture. What breaks then? I'm sure I'm missing something -- otherwise the problem could be solved with a new FeaturePCD (for gating the patch#6 changes).
Comment 4 Laszlo Ersek 2020-03-30 16:20:30 UTC
typo:

> It seems like the first 5 patches in the series implement the protection
> from SMM,

s/from SMM/for SMM/
Comment 5 Marvin Häuser 2020-03-30 17:46:19 UTC
Laszlo,

Thanks for your comment!

Personally, I don't believe a PCD is required, the commit message is outright wrong about the impact. While it states that "if 3rd part OPROM is runtime, it cannot be executed at UEFI runtime phase.", what in fact happens is that it gets RWX permissions across the entire image area rather than RW / RX depending on section type. In consequence, it is executable just fine, however security is degraded by reducing permission granularity.

In contrast, if patch 6 was supposed to be "fixed" to align with the description, gBS->LoadImage() will still accept RT drivers past EndOfDxe, which will probably crash the OS at some point assuming the driver will not be mapped at all. Furthermore, we see no reason to prohibit loading RT drivers after EndOfDxe, however consider SMM refuting communication buffers allocated after EndOfDxe a good design decision we agree with.

I think the patch should just be dropped. The only impact is that the information SMM and the OS have is out of sync, but as said above, this is a good thing. However, this probably needs to be documented, i.e. that SMM cannot rely on the table to be complete, which it currently does not and I don't believe it really needs to either.
One thing that does need consideration is prohibiting to *remove* drivers past EndOfDxe - the driver, as far as I can see, will be unloaded anyway, just the table will continue to report the memory regions. I think that is a good idea to not risk SMM having stored information about a communication buffer in this area (and the area being marked free), however the preservation will not be reflected in UEFI Memory Map (only in the Memory Attributes Table). Am I missing something?

Regards,
Marvin
Comment 6 jiewen.yao 2020-03-31 04:08:08 UTC
Vitaly
Thanks to report this.

This issue has been discussed in 2017, https://bugzilla.tianocore.org/show_bug.cgi?id=751, when we create the patch series.

Let me explain one by one.

1) Why we need stop reporting after EndOfDxe.
The reason is the protection of RT code becomes useless if we allow RT code after EndOfDxe.

The security requires for RT memory:
A) RT code must be read-only, protected by hypervisor.
B) RT code must be not accessible by SMM, because SMM can bypass hypervisor protection.

The SMM can only reply on the memory map at end-of-dxe, because after that the memory information is NOT reliable.
If there is more RT code loaded, even the MAT table report it as RO and hypervisor setup RO in the page table, the SMM can still update it, because from SMM perspective, it is RT data and it is accessible.

That breaks the security requirement.


2) About the potential compatibility issue.

We already noticed the potential risk and discussed in 751 as well.
People tend to agree that this is the good solution in 2017/2018.
On one hand, we have to satisfy the security requirement.
On the other hand, nobody could see the real use case that we need load RT driver after EndOfDxe, which is a big surprise.

Actually, you are the first people to report the RT code loading after EndOfDxe.

3) About the patch in 4a723ed258367471eac8b4ae32558f09ef65672e.

We create the patch in Nov, 2017, and commit in September 2018. 

I created the dummy RT driver and validated the 4a723ed258367471eac8b4ae32558f09ef65672e.
I uses a tool to dump MemoryAttributeTable.
It works as we expected - the dummy RT driver is marked as *RT data*.

4) About the current status - stable202002 - 4c0f6e349d32cf27a7104ddd3e729d6ebc88ea70

I use the same tool to check the latest status, but I find that the attribute is 0. It matches what you have describe.
I am equal surprised on this behavior.

It is unclear to me if that is on purpose or by mistake.
I will include more people to figure out what happened between 2018 and 2020 the latest one.

5) I don't think we can simply revert the patch.
I am happy to discuss with you on the use case and security requirement to figure out the best way to satify both.

Would you please share more information on why you need load RT code after EndOfDxe? Is that possible to load it early?
What is security requirement you need to have for loading RT after EndOfDxe? e.g. do you concern about the attack from SMM ?

Thank you
Yao Jiewen
Comment 7 jiewen.yao 2020-03-31 05:57:58 UTC
OK. I understand why you see a 0 attributes for RT_CODE and I see a XP attributes for RT_CODE now. It depends upon if the platform enables gEfiMemoryTypeInformationGuid.

If a platform enables gEfiMemoryTypeInformationGuid, then the RT_CODE is allocated in a BIN. The BIN is marked as XP.

If a platform does not enable gEfiMemoryTypeInformationGuid, then the RT_CODE is allocated in a free memory. Then this memory attribute is marked as 0.

I think both tags have same issue.
Comment 8 jiewen.yao 2020-03-31 07:37:50 UTC
Forgot to mention, we also have audit feature the MAT table and SMM page table to see if they match. If there is RT_CODE in MAT but the region is accessible in SMM page table, the audit will fail because this is known vulnerable.
Comment 9 Vitaly Cheptsov 2020-03-31 11:24:12 UTC
Hello,

Thanks for your input!

First, I need clarification about the memory protection Jiewen mentions.

To me it looks exactly like Laszlo says, the patch can just be reverted at no extra security risks:

> A) RT code must be read-only, protected by hypervisor.
> B) RT code must be not accessible by SMM, because SMM can bypass hypervisor protection.

If I ignore the hypervisor part, both these statements make good sense to me. RT code should never be executable, and SMM should never access, let aside execute, any areas loaded after EndOfDxe.

A) The operating system sets up a page table when it loads. In this page table UEFI memory including code is mapped according to the latest memory map and memory attributes. The operating system bootloader has full control on the memory map and memory attributes and can freely decide what to map. Of course, a proper operating system should respect memory attributes and should map RT code as executable and read only, but this is irrelevant for the firmware.

B) SMM code has its own isolated memory and its own page table. This page table is created based off memory map and memory attributes from CoreDxe right before EndOfDxe. This page table also does not change since BDS phase. Therefore any changes to the memory map or the memory attributes cannot affect the page table in SMM. As a result, whether or not any entries are added or removed, SMM will have its own rules about what it can access. These rules are defined before EndOfDxe.

Now the hypervisor part. What hypervisor are you talking about?

- If it is the one created by the operating system bootloader (e.g. Windows VBS), it is irrelevant here. It simply does not affect anything in A and B. The operating system is free to do what it wants (including protecting RT code with a hypervisor), and SMM will not access what it should not whether or not a hypervisor is in action.

- If it is the one created by UEFI firmware itself, I do not know of any examples of such hypervisors in the wild. However, this still is irrelevant as SMM will simply not have the memory map updates after EndOfDxe. Even if the hypervisor does not protect it, SMM will still not access it, and for the rest the protection of RT code loaded after EndOfDxe is the operating system problem in any case.

All in all, I agree with the two requirements, but I do not agree with the fact that anything breaks if we revert the commit. All that changes is that we will provide a proper memory map and memory attributes to the operating system (finally). Please provide more details on your concerns as I may be missing some important parts from your description.

Second, I must really emphasize on what SMM is.

If *somehow* SMM decides to update its page table after EndOfDxe, this is a very serious bug in SMM code in the first place. SMM must never ever try to check for updated memory attribute table after EndOfDxe.

If it does, not marking the newly loaded RT drivers as executable within CoreDxe will not prevent the intruder from directly updating the relevant memory within its own bootloader, driver, or multiple drivers. I really hope this is clear and this is not the case.

You mentioned the auditing code, and my belief is that it perform the check described as follows: all memory present in SMM page table should be present in memory attributes table with the same attributes. I.e. memory present in MAT (including RT code) may not be present in SMM, this is fine, but stuff like unloading runtime drivers or freeing RT memory is not fine, and should result in boot failure.

Third, regarding our use case.

To be honest, our use case is not particularly relevant here, as the changes committed to EDK II simply violate the written specification no matter how you look at it. Some relevant parts to ensure we build a solution that works for everyone:

0. We do not want RT code to be mapped as RWX, whether it is loaded before or after EndOfDxe :)
1. We have a runtime driver, which provides code services for the operating system to work.
2. The driver exposes function pointers through one of the following: a configuration table, a copy of runtime services table passed to the OS, or a separate mechanism used by the bootloader.
3. The driver only calls UEFI Runtime Services and accesses PMIO/MMIO hardware (by virtual pointers).
4. We do not develop firmwares, so we cannot put this driver into the firmware and execute before CoreDxe.
5. We do not always have control on the operating systems, so the driver is to be self-contained.

Regarding the last part, while we develop operating systems, incorporating certain features in the operating system could be not an option for several reasons. For example, these could be certification requirements and the amount of effort needed to develop it according to the software development life-cycles used in safety-critical software. Unfortunately I cannot go much into details, but think of something like DO-178C. To make it clear, x86 is definitely not our primary target, but we expect full compatibility.

To summarise, we want a runtime driver to be properly marked as executable from within the memory attribute table, but we do not care about SMM.

Fourthly, regarding the existing firmwares.

We indeed were able to discover few firmwares, which break runtime drivers. What can you recommend us to do with such firmwares? Our current planned workaround is to find the relevant region in the memory attribute configuration table and directly patch it to allow execution. Since we do not need write access we expect this to be more or less secure. An alternative would be building a custom memory attribute configuration table, but the required effort looks very unreasonable.

Hope it helps,
Vitaly
Comment 10 Laszlo Ersek 2020-03-31 13:56:15 UTC
(In reply to jiewen.yao from comment #6)

> The SMM can only reply on the memory map at end-of-dxe, because after that
> the memory information is NOT reliable.

OK.

> If there is more RT code loaded, even the MAT table report it as RO

OK.

> and hypervisor setup RO in the page table,

OK

> the SMM can still update it, because
> from SMM perspective, it is RT data and it is accessible.

Why is RT *code* that is loaded after EndOfDxe considered RT *data* by SMM?

Thanks.
Comment 11 Laszlo Ersek 2020-03-31 14:09:33 UTC
(In reply to Vitaly Cheptsov from comment #9)

> To summarise, we want a runtime driver to be properly marked as executable
> from within the memory attribute table, but we do not care about SMM.

(Which is why I think a FeaturePCD could be used as last resort -- in case no other common ground can be found. But I hope it will!)
Comment 12 Laszlo Ersek 2020-03-31 14:19:38 UTC
(In reply to Laszlo Ersek from comment #10)
> (In reply to jiewen.yao from comment #6)
> 
> > The SMM can only reply on the memory map at end-of-dxe, because after that
> > the memory information is NOT reliable.
> 
> OK.
> 
> > If there is more RT code loaded, even the MAT table report it as RO
> 
> OK.
> 
> > and hypervisor setup RO in the page table,
> 
> OK
> 
> > the SMM can still update it, because
> > from SMM perspective, it is RT data and it is accessible.
> 
> Why is RT *code* that is loaded after EndOfDxe considered RT *data* by SMM?

To clarify my question -- RT code loaded after EndOfDxe should be completely absent from the SMM-approved memory regions, and should not be accessed from SMM at all.

Is this because of preallocated BINs?

I.e., at EndOfDxe, SMM snapshots the RT BIN (as accessible except for the known RT code ranges), and after EndOfDxe, new RT code is loaded into the free parts of the BIN?

If that's the case: couldn't SMM remember the specific RT data areas known at EndOfDxe as permitted, and forbid everything else (both known RT code, and unused parts of the RT BIN)?

... I do agree with Vitaly that, in case a platform doesn't include the SMM stack, and/or never intends to run Windows, then breaking its spec conformance specifically for the sake of Windows Device Guard / VBS, is hard to justify.

I understand that Windows might require platform specifics that are either not spelled out in the UEFI spec, or even conflict with the spec, but those should be configurable, preferably.

Thanks!
Comment 13 Vitaly Cheptsov 2020-03-31 14:48:23 UTC
Laszlo,

I am not positive how feature PCD could work here. Since we are not the developers of the firmware we will not be able to anyhow change the PCD of the already deployed firmware.

I also am roughly aware of how Windows VBS works and do not understand, why one needs to violate the specification to support it.

Vitaly
Comment 14 jiewen.yao 2020-03-31 19:21:55 UTC
First, I hold different opinion on below. It will break the audit tool, if we revert it.

"To me it looks exactly like Laszlo says, the patch can just be reverted at no extra security risks"

Thinking about below case:
1) A platform firmware has SMM.
2) A platform includes a NIC OROM which is marked as RT_CODE, but never ask for running at OS runtime.
3) A platform enables the Memory Type Information.

With this patch, the NIC OROM is accessible in SMM page table, and NIC OROM RT_CODE is marked as RT_DATA.

Without this patch, the NIC OROM is accessible in SMM page table, and NIC OROM RT_CODE is marked as RT_CODE. --> This is a violation and it will be caught by SMM audit tool. The audit failure may also cause some other security feature being disabled. It has big impact.

==============================

Second, I agree with you that if you do not have SMM, it is OK to mark the late RT_CODE to be RT_CODE. I do not see the risk.

So, if we detect the existence of SMM and use it as an indicator to determine when we stop the record, do you think it is OK ? 

I believe most X86 firmware still have SMM.
Do you use virtual firmware only? or real firmware in your case?
You use OVMF. I am not clear it is your case, or you just want to show the example. 

To be honest, I don't like PCD as well. I will treat it as last option.
Comment 15 Laszlo Ersek 2020-03-31 19:30:09 UTC
(In reply to Vitaly Cheptsov from comment #13)

> I am not positive how feature PCD could work here. Since we are not the
> developers of the firmware we will not be able to anyhow change the PCD of
> the already deployed firmware.

Oh, you are right; sorry about my mistake. I forgot that the platform is basically handed to you!

> I also am roughly aware of how Windows VBS works and do not understand, why
> one needs to violate the specification to support it.

I suspect this is indeed an implementation detail, not an inherent requirement. My questions in comment 12 aim to "pry" at this; I hope those questions make sense.

Basically the threat I imagine the current implementation might want to defend against is SMM whitelisting *more* RT room (= all of the RT data BIN) at EndOfDxe than what is actually *populated* at EndOfDxe. So after EndOfDxe, further (3rd party) stuff could be loaded into that same "room", including RT *code*, which SMM would not refuse to look at, as communication buffers.

If this is a correct suspicion, then the real problem is that SMM whitelists too much at EndOfDxe -- it should not whitelist preallocated (BIN) RT areas that aren't actually populated at EndOfDxe.

Then again, I could be 100% off. I hope Jiewen will enlighten us!
Comment 16 jiewen.yao 2020-03-31 19:37:04 UTC
Reply to Laszlo's comment:

"I.e., at EndOfDxe, SMM snapshots the RT BIN (as accessible except for the known RT code ranges), and after EndOfDxe, new RT code is loaded into the free parts of the BIN?"

-- Yes. You get the point.


"If that's the case: couldn't SMM remember the specific RT data areas known at EndOfDxe as permitted, and forbid everything else (both known RT code, and unused parts of the RT BIN)?"

-- Yes, SMM already snapshot the memory layout.
However, SMM *cannot* know what is used or *unused*. The memmep just reports there is a big chunk of memory e.g. 4M, there is no information to tell you that 2M is used and 2M is unused. 

If we enable Memory Type Info, then late RT code is marked as RT_data. SMM treat it as RT data. That is fine in current situation. If we change it RT_Code, it can satisfy the runtime usage. But if SMM modify the RT_Code because SMM can access it, then a vulnerable SMI handler may inject bad code to the RT_Code to attack the VMM. This is a threat we want to prevent.

If we disable Memory Type Info, then late RT code *can* be marked as RT_Code. (I treat current 0 in MAT is a bug). Because this region is in Unused in the snapshot and Unused is marked as NOT-PRESENT in SMM, I do not see security risk.

I see most firmware is enabling Memory Type info.
Comment 17 Laszlo Ersek 2020-03-31 19:39:10 UTC
(In reply to jiewen.yao from comment #14)

> 1) A platform firmware has SMM.
> 2) A platform includes a NIC OROM which is marked as RT_CODE, but never ask
> for running at OS runtime.
> 3) A platform enables the Memory Type Information.
> 
> With this patch, the NIC OROM is accessible in SMM page table, and NIC OROM
> RT_CODE is marked as RT_DATA.
> 
> Without this patch, the NIC OROM is accessible in SMM page table, and NIC
> OROM RT_CODE is marked as RT_CODE. --> This is a violation and it will be
> caught by SMM audit tool.

If a runtime driver is loaded, after EndOfDxe, from a NIC option ROM, then why is that accessible to SMM *at all*?

The runtime driver from the NIC option ROM should be loaded into free memory after EndOfDxe. That free memory was (I assume) also free *at* EndOfDxe, so SMM should not map it either way.

Is it that the runtime driver (from the NIC option ROM) is loaded into free space inside the RT BIN, and SMM maps the whole RT BIN as possible communication buffer?

What if we finalize (mark as full) the RT BIN at EndOfDxe, if we have SMM enabled? Then further runtime drivers will have no choice but to be allocated from free RAM -- and SMM does not map that.

Basically, up to EndOfDxe, all runtime stuff should be allocated from the RT BIN (hence the need for MemoryTypeInfo I guess), while after EndOfDxe, all runtime stuff should be allocated outside of the RT BIN.

(Apologies if my comments are completely wrong!)

Thanks!
Comment 18 jiewen.yao 2020-03-31 20:30:09 UTC
I think this is based upon the default unused BIN type.
If we use RT_Data as unused BIN type, so SMM can access.
If we use RT_Code as unused BIN type, the SMM cannot access. But the risk is that there will be a chunk of unused RT memory marked as executable. Based upon the result that we see the memory *might* not be zero-ed. There might be some gadget there for ROP attack.

I summarize the option below:

Possible options for late RT Driver, if we patch MAT later.
+-----------+-----------------+-----------------+-----------------+-----------------+
|MemTypeInfo|    MAT (final)  |   MAT(Snapshot) |    SMM Paging   |      Result     |
|           |RT Code |RT Data | Unused BIN type |                 | VMM/OS |  SMM   |
+-----------+-----------------+-----------------+-----------------+-----------------+
|  TRUE     |RT_Code |RT_Data |     RT_Data     |    Not-Exec     |  Exec  |  Acc   |(1)
|  TRUE     |RT_Code |RT_Data |     RT_Code     |    Not-Pres     |  Exec  |Not-Acc |(2)
+-----------+-----------------+-----------------+-----------------+-----------------+
|  FALSE    |RT_Code |RT_Data |   Not Present   |    Not-Pres     |  Exec  |Not-Acc |(3)
+-----------+-----------------+-----------------+-----------------+-----------------+

Impact:
#1: SMM Audit fail because SMM can access RT_Code.
#2: MAT Audit fail because MAT reserves unnecessary RT_Code.
#3: Not a secure configuration.

Thought:
#1: This is security enforcement. I dont believe we can change.
#2: Not sure if there is any security risk. There might be gadget there.
#3: Not a good configuration.
Comment 19 jiewen.yao 2020-03-31 20:38:38 UTC
I still want to get clear on the requirement at first.

From Vitaly:
"To summarise, we want a runtime driver to be properly marked as executable from within the memory attribute table, but we do not care about SMM."

What does "we do not care about SMM" mean?
Do you mean your solution does not need SMM ?
Or your solution may include SMM or might not include SMM ? Both case need to be supported?
Comment 20 jiewen.yao 2020-03-31 23:50:48 UTC
I suggest we discuss the options in the infosec meeting to reach the consensus - if we can switch from #1 to #2 to mark unused RT code memory in BIN to be RT code in MAT.

I prefer we get feedback from Microsoft to see if VBS is happy with that.
Comment 21 Vitaly Cheptsov 2020-04-01 00:59:16 UTC
Good morning,

Thanks for all the comments. I agree to discuss it on the meeting. Please let us know about the possible schedules for it.

Regarding "we do not care about SMM", I am afraid it may have been unclear, sorry. What I was trying to say is:

- SMM may or may not be enabled.
- Memory Type Information may or may not be enabled.
- Windows/Linux/etc may or may not run on this machine.

We do not know if either of these options is enabled or not, but it must work in all cases. Since I talk here for two projects: the one I do at work and the one I run as hobby, I can also say that the driver may be loaded when Windows or Linux is in use, and I want us to consider this too.

I believe I now see the problem, which I sort of missed, because Memory Type Information is not something in the specification (or I am blind). Please correct me if I got this wrong, as I believe it is crucial for understanding.

Let's say SMM and Memory Type Information are enabled. In this case all the memory for RT code/RT data is allocated from preallocated BINs, fixed-size memory areas, the limits of which are stored in gMemoryTypeInformation variable, which itself is derived from gEfiMemoryTypeInformationGuid HOB provided by PEI phase. As a result if a driver after EndOfDxe allocates RT code from the bin, currently the situation may look as follows:

1. BIN looks like valid RT code in memmap.
2. BIN is treated like valid RT code in MAT.
3. BIN is treated like valid RT code In SMM.
4. Our driver loads into BIN.
5. RT driver is valid RT code in SMM (WTF).

Consider this, to my eyes there are several incorrect parts here:

1. Why does SMM assume that the whole BIN is valid?

Just like Lazslo says, SMM should be notified at EndOfDxe that RT code/RT data should no longer be updated. For instance, it could get an SMI with the actual BIN start/end addresses, and for security they must be within the original BIN.

This way:
- Our driver is in the BINs (RT code and RT data).
- Our driver in MAT is executable.
- Our driver is not present in SMM.

I propose this as a solution to this problem and do not see the obvious downsides.

2. Why do we allocate RT code from BINs after EndOfDxe?

In my understanding, the specification requires that allocated RT code is executable within the operating system. As a result I can use AllocateAddress, bypass the BINs, and expect my code to run in the OS.

- This should work, be that Windows with VBS or any other OS.
- This area should not be present in SMM.
- This area should not really be RWX (but currently it will, and I believe this can be treated as a vulnerability as well).

Perhaps the firmware should expose a protocol to restrict the attributes of the allocated area in the MAT, a protocol or alike.

Best regards,
Vitaly
Comment 22 jiewen.yao 2020-04-01 11:53:44 UTC
Response to vitaly:
1. Why does SMM assume that the whole BIN is valid?

Just like Lazslo says, SMM should be notified at EndOfDxe that RT code/RT data should no longer be updated. 

[Jiewen] yes. This is what we have implemented today. SMM only gets the snapshot at EndOfDxe, once.

What is the concern?


2. Why do we allocate RT code from BINs after EndOfDxe?

As a result I can use AllocateAddress, bypass the BINs, and expect my code to run in the OS.

[Jiewen] yes. You can do it. It should be OK if there is no conflict.

But it bring some downside on S4 inconsistency. If you success in one boot and fail in another boot, just because the address is already allocated. Then you have to allocate another address and the memory map is different with previous. Then you may fail the S4 resume.
Comment 23 Vitaly Cheptsov 2020-04-01 12:29:54 UTC
Jiewen,

Maybe we misunderstood each other?

What you have today is *one cannot allocate RT code in BINs after EndOfDxe*. This is broken in OVMF one way (you get RWX memory with and without SMM), and in Intel RC apparently the other way (you get RW memory).

Since EFI_BOOT_SERVICES LoadImage allocates in BINs, Runtime drivers are broken after EndOfDxe. I feel stupid to repeat this, but that is indeed a problem we are trying to resolve here.

After your explanation about the BIN handling I and Laszlo suggested that this is broken, because SMM improperly assumes that the whole memory in the BIN is secure and useable, and as a result we cannot add anything to it.

You posted a table of the options to choose from, but for some reason did not include the way to do it right for the driver loaded after EndOfDxe:

MemTypeInfo → TRUE
MAT (final) → RT Code / RT Data
MAT(Snapshot) → Unused BIN type
SMM Paging → Unmapped
VMM/OS → Executable
SMM → No access

This looks conformant to the specification and secure to me.

Your claim with S4 is valid, but I gave the AllocateAddress example to explain you that the 4a723ed2 change borks runtime drivers loaded by CoreDxe but does nothing to protect them from being loaded manually e.g. by a custom PE loader. Therefore, this cannot really be explained with security, it is just bad design that needs to be constructively discussed and fixed now.

Best regards,
Vitaly
Comment 24 Vitaly Cheptsov 2020-04-01 12:36:52 UTC
Laszlo, Jiewen, in addition to my previous letter, could you please try to explain why OVMF behaves differently from real hardware? We are currently trying to find a workaround for already deployed Intel firmwares, but it is kind of hard to reproduce it, let aside to take a grasp of what exactly happens.

In particular, I do not understand why XP bit is not set for drivers loaded after EndOfDxe in OVMF, while this happens in e.g. Z390, despite them sharing the code.

I think you understood better what is going on. This is important for us, because even if a fix is deployed in a future firmware, we will still encounter problematic ones.
Comment 25 Laszlo Ersek 2020-04-01 16:57:02 UTC
Vitaly, I'm unsure how OVMF differs (in this regard) from physical firmware.

What I can tell you though is that MemoryTypeInfo has been enabled in OVMF only very recently, and only for the -D SMM_REQUIRE build. Therefore, if you compare old vs. new OVMF (with -D SMM_REQUIRE), or else if you compare new OVMF with vs. without -D SMM REQUIRE, you might see different behavior.

The commit range in question is 7d325f93e190..799d88c1bae7.

Another difference I *suspect* could be relevant is the availability of Execute Disable / NX. Not sure about your QEMU setup, but if you use a VCPU model with NX absent (simply missing from the VCPU definition, or disabled manually), then I *guess* you could see different behavior wrt. XP in the MAT. On the physical machine, you most likely always see NX available. But, again, the NX angle is pure guesswork on my end.
Comment 26 Vitaly Cheptsov 2020-04-01 17:10:11 UTC
Laszlo,

Thanks for the input. I indeed enabled SMM_REQUIRE to test MemoryTypeInfo on OVMF. Normally we do not test on OVMF, so I am kind of a newcomer for it and may not know al the details, here is how I build it:

build -a X64 -t XCODE5 -b NOOPT -p OvmfPkg/OvmfPkgX64.dsc -D SMM_REQUIRE=1 -D DEBUG_ON_SERIAL_PORT=1

XCODE5 is not really relevant here, same behaviour with e.g. CLANGPDB or GCC5. NX bit is also enabled for OVMF, and I believe it is not much of a question.

The trick here is that on real hardware we see a MAT with RT Code entry for our driver with XP bit set. And on OVMF the XP bit is not present. I checked the diff and am a bit clueless on how MemoryTypeInfo can affect this despite Jiewen words above.
Comment 27 jiewen.yao 2020-04-01 19:04:58 UTC
OK, let me clarify my word:

When Vitaly and Laszlo say: "SMM should be notified at EndOfDxe that RT code/RT data should no longer be updated." - I agree, that is why we have implemented today.

"one cannot allocate RT code in BINs after EndOfDxe" - that is not true. You can allocate RT code successfully and it is marked as RT code in the UEFI memory map. (I know it is useless in your case, but you did get *success* in the function AllocatePages).
This RT code is marked as RT data in MAT if MemoryTypeInfo is enabled (expected behavior, but we may change based upon the discussion) or it is marked as 0 in MAT if MemoryTypeInfo is disabled (I believe it is a bug needed to be fixed).
The words "cannot allocate" is misleading here.

I believe MemoryTypeInfo enable/disable controls the final behavior - RT_Data or 0 - for the RT_Code.

To summarize my proposal for fix below:

+-----------+-----------------+-----------------+-----------------+-----------------+
|MemTypeInfo|    MAT (final)  |   MAT(Snapshot) |    SMM Paging   |      Result     |
|           |RT Code |RT Data | Unused BIN type |                 | VMM/OS |  SMM   |
+-----------+-----------------+-----------------+-----------------+-----------------+
|  TRUE     |RT_Code |RT_Data |     RT_Code     |    Not-Pres     |  Exec  |Not-Acc |(2)
+-----------+-----------------+-----------------+-----------------+-----------------+
|  FALSE    |RT_Code |RT_Data |   Not Present   |    Not-Pres     |  Exec  |Not-Acc |(3)
+-----------+-----------------+-----------------+-----------------+-----------------+

I believe it can satisfy your case, and I am looking for review by other security people to see if that is acceptable. esp Microsoft, because they discovered the untested memory issue.
Comment 28 Vitaly Cheptsov 2020-04-01 19:59:50 UTC
Jiewen,

Alright, let us not confuse successful allocations and unsuccessful execution. Sorry for that.

> This RT code is marked as RT data in MAT if MemoryTypeInfo is enabled (expected behavior, but we may change based upon the discussion)

This is not the case for QEMU and VMware:
- The page is allocated as RT Code.
- This page is marked RT code in MAT.
- This page has 0 attributes in MAT, i.e. RWX permissions (neither EFI_MEMORY_XP, nor EFI_MEMORY_RO is set).

And yes, MemoryTypeInfo is enabled at least in QEMU. I do not know, why this happens, but I believe it is important to ensure that we have this clear and fix this soon.

Z390 board behaves as you explained indeed, but I have to say that I do not understand why this should (and does) work anyhow differently from QEMU judging from the source code. Perhaps the implementation of MemoryTypeInfo in QEMU is lacking something?

> I believe MemoryTypeInfo enable/disable controls the final behavior - RT_Data or 0 - for the RT_Code.

Please clarify what you mean, I might be lost here.

> To summarize my proposal for fix below:

The table makes sense to me, but could you please confirm that I interpret it correctly?

1. Runtime driver loaded after EndOfDxe:
- Resides within the BIN, just like any other driver (if MemTypeInfo is enabled).
- Resides in some <= BASE_4GB memory, just like any other driver (if MemTypeInfo is disabled).
- Gets RT code type in MemoryMap visible to OS.
- Gets RX permissions for .text and RW permissions for .data in MAT visible to OS.
- Is not mapped in SMM (and is not accessible).

2. When MemTypeInfo is enabled, unused space (not the space of our driver) in BIN:
- Is marked as RT data in MemoryMap visible to OS.
- Gets RW permissions in MAT visible to OS.
- Is not mapped in SMM (and is not accessible).

3. When MemTypeInfo is disabled, unused space:
- Is marked as conventional memory in MemoryMap visible to OS.
- Is not present in MAT visible to OS.
- Is not mapped in SMM (and is not accessible).

If this is right, I believe it should mostly work, however, I have the following concerns:

* When MemTypeInfo is enabled, the operating system cannot use unused BIN space in any case. I suggest us to mark it as read-only and not executable. What is the reason to enable writing, which we essentially do not need?

* How can the application developer communicate with MAT? Let's say I allocated an RT code page and I want it to get RX permissions within the operating system. Currently I believe this is impossible, and all non-PE RT code pages get RWX permissions. This looks like a security risk to me.

* Can we have (at least some) RT data entries read only in MAT? The compilers already generate .rodata sections and some firmwares, e.g. MacEFI, do not write to .data sections during runtime phase at all. So macOS simply restricts any writes to UEFI memory but at the cost of marking .data executable. I believe we do not want the last part of this, but marking all RT data memory as read-only and non-executable could be an interesting feature.

* Can we respect and/or harden permissions of all loaded drivers, both firmware and external? I would personally want to hardcode W^X for everything on UEFI side during runtime phase, consider RWX MAT entries as a bug and early abort loading the suspicious drivers. See some details in bug 564.

Best regards,
Vitaly
Comment 29 jiewen.yao 2020-04-01 21:26:20 UTC
OK, let me comment inline to eliminate the confusing, hopefully.

========================
The table makes sense to me,
[Jiewen] Sounds good.

 but could you please confirm that I interpret it correctly?

1. Runtime driver loaded after EndOfDxe:
- Resides within the BIN, just like any other driver (if MemTypeInfo is enabled).
[Jiewen] TRUE.

- Resides in some <= BASE_4GB memory, just like any other driver (if MemTypeInfo is disabled).
[Jiewen] FALSE. There is no guarantee it is <= BASE_4GB.

- Gets RT code type in MemoryMap visible to OS.
[Jiewen] TRUE

- Gets RX permissions for .text and RW permissions for .data in MAT visible to OS.
[Jiewen] TRUE - that is something we should fix.

- Is not mapped in SMM (and is not accessible).
[Jiewen] TRUE - that is something we should fix.

2. When MemTypeInfo is enabled, unused space (not the space of our driver) in BIN:
[Jiewen] This sentence is unclear.
Unused Space might be RT CODE or RT DATA. Their behavior are different.
Current behavior is:
-  unused RT DATA in memmap => RT DATA in MAT => SMM Accessible
-  unused RT CODE in memmap => RT DATA in MAT => SMM Accessible
In the new proposal, the behavior is:
-  unused RT DATA in memmap => RT DATA in MAT => SMM Accessible
-  unused RT CODE in memmap => RT CODE in MAT => SMM Not Accessible

- Is marked as RT data in MemoryMap visible to OS.
[Jiewen] See above.

- Gets RW permissions in MAT visible to OS.
[Jiewen] See above.

- Is not mapped in SMM (and is not accessible).
[Jiewen] See above.

3. When MemTypeInfo is disabled, unused space:
[Jiewen] This sentence is meaningless. If disabled, then there is NO unused space.

- Is marked as conventional memory in MemoryMap visible to OS.
[Jiewen] See above.
- Is not present in MAT visible to OS.
[Jiewen] See above.
- Is not mapped in SMM (and is not accessible).
[Jiewen] See above.



If this is right, I believe it should mostly work, however, I have the following concerns:

* When MemTypeInfo is enabled, the operating system cannot use unused BIN space in any case.
[Jiewen] TRUE.

I suggest us to mark it as read-only and not executable.
[Jiewen] Good suggestion.

What is the reason to enable writing, which we essentially do not need?
[Jiewen] The reason is current MAT is derived from memmap, and memmap does not have *unused* information. - That is the purpose of memory type info hob.
The DXE Core has such information. It is possible to do one more round calculation to figure out which is used.
This is a potential enhancement area. Good point.


* How can the application developer communicate with MAT?
[Jiewen] You cannot.

 Let's say I allocated an RT code page and I want it to get RX permissions within the operating system.
[Jiewen] You cannot.

 Currently I believe this is impossible,
[Jiewen] TRUE.

 and all non-PE RT code pages get RWX permissions.
[Jiewen] FALSE
Here non-PE RT code pages == unused RT code. Please see above unused RT code behavior..


 This looks like a security risk to me.
[Jiewen] Sorry, I am not clear. What is specific security risk? Would you please clarify?


* Can we have (at least some) RT data entries read only in MAT?
[Jiewen] partial TRUE.
If we update the code and define a policy here, it is possible.
But i think we need apply same policy for *all* RT driver.
Not *partial* RT driver.


 The compilers already generate .rodata sections 
[Jiewen] partial TRUE.
Current EDKII merges the .rodata to .data by default.
See the base tool configuration: /MERGE:.rdata=.data
But that is configurable in *some* firmware.

and some firmwares, e.g. MacEFI, do not write to .data sections during runtime phase at all.
[Jiewen] partial TRUE.
That is RT driver specific. We do have example, that RT driver writes RT data at runtime. *some* firmware does, some does not.

So macOS simply restricts any writes to UEFI memory but at the cost of marking .data executable.
[Jiewen] Interesting. Good to know.

I believe we do not want the last part of this,
[Jiewen] TRUE. We dont want data region to be executable.

but marking all RT data memory as read-only
[Jiewen] FALSE. You cannot make RT data as read-only, because you may break other existing RT modules.

 and non-executable could be an interesting feature.
[Jiewen] TRUE. We want mark RT data be non-executable.

* Can we respect and/or harden permissions of all loaded drivers, both firmware and external?
[Jiewen] TRUE. We should do that by default, unless we see compatibility issues.

I would personally want to hardcode W^X for everything on UEFI side during runtime phase, consider RWX MAT entries as a bug and early abort loading the suspicious drivers. See some details in bug 564.
[Jiewen] Sorry, I do not fully understand what is proposed here.
Would you please add more detail?
Comment 30 jiewen.yao 2020-04-01 21:35:35 UTC
Sorry I did not describe clearly in the last comment. Clarify below:

===================
I would personally want to hardcode W^X for everything on UEFI side during runtime phase, consider RWX MAT entries as a bug and early abort loading the suspicious drivers. See some details in bug 564.
[Jiewen] I have read 564. It seems good enhancement.
But it seems irrelevant to this issue.
What is the detail proposal to resolve this specific RT_CODE non-exec issue ?
Comment 31 Vitaly Cheptsov 2020-04-02 07:59:42 UTC
Hi Jiewen,

Your explanations make good sense to me, thanks for the effort! I answered inline where appropriate.

> In the new proposal, the behavior is:
> -  unused RT DATA in memmap => RT DATA in MAT => SMM Accessible
> -  unused RT CODE in memmap => RT CODE in MAT => SMM Not Accessible

1. Why do we need to make unused RT DATA accessible in SMM? In my opinion, it should be completely unmapped just for the sake of reducing the attack surface. The implementation for this should come free, as we already plan to do exactly the same for RT CODE areas.

2. It is still unclear to me, which permissions will I see in MAT from this description. I expect both of these areas to be read-only and not executable (RO+XP), is that correct? If not, I believe that's what we should do.

> Let's say I allocated an RT code page and I want it to get RX permissions within the operating system.
> [Jiewen] You cannot.
> Here non-PE RT code pages == unused RT code. Please see above unused RT code behavior.
> [Jiewen] Sorry, I am not clear. What is specific security risk? Would you please clarify?

I now see my mistake, but even so I believe this is not good. If I understand you correctly the following happens after EndOfDxe when Memory Type Information is enabled:

- When I allocate RT code in any pages, I get my code allocated in BIN, so it gets unused RT code permissions (see above).
- When I allocate RT code by address, I get my code allocated outside BIN, so it gets RWX permissions.
- When I allocate RT data in any pages, I get my data allocated in BIN, so it gets unused RT data permissions (see above).
- When I allocate RT data by address, I get my data allocated outside BIN, so it gets read-write permissions.

- From the security point of view RWX is bad.
- From the specification point of view it is impossible to understand why allocated memory of the same type gets different permissions in MAT.
- From the usability point of view it is unusable, because:
* We cannot allocate RT code in the BIN, which will be executable code in the OS
* We cannot allocate RT code, which will be executable and NOT writeable in the OS
* We cannot allocate RT data, which will be readable but NOT writeable in the OS

Therefore:

1. I request RT code and RT data allocated in any pages or by address to be treated the same way regardless of whether Memory Type Information is enabled or disabled:
- RT code is present as RT code in the memory map and gets RO+XP in the MAT.
- RT data is present as RT data in the memory map and gets XP in the MAT.
- Neither of them is accessible by SMM.

In my opinion, we could also make RT data RO+XP by default, but that may break some applications and does not give too much benefit. I do not want to make RT code executable by default as well, because I suspect that some applications simply abuse RT code memory type.

2. We introduce a protocol to set the permissions of the allocated pages. The protocol will have a single function with the following prototype:

EFI_STATUS
(*EFIAPI EFI_SET_MEMORY_PERMISSIONS) (
  IN EFI_PHYSICAL_ADDRESS   Address,
  IN UINTN                  PageCount,
  IN UINT64                 Attributes
  );

This way we get good compatibility with existing application code, and also let application code to set the permissions of the allocated pages to whatever it desires. For example, to make executable the RT code it allocated.

The introduction of this protocol is secure, because currently the application code could craft the memory attributes table on its own and pass it to the operating system in any case. What we try to do is to make this simple and convenient. We should, however, make a precondition for this protocol that one cannot change the permissions of the memory he did not allocate. We do NOT need to check it within the protocol implementation, however, only optionally at some higher auditing level e.g. VMM/SMM.

If this sounds good to you I can write a more complete proposal, describing the return codes and usage. Initially this can be EDK II specific, later it can go to UEFI specification.

The existence of such a protocol is important for us, because we may load images ourselves, and we cannot use LoadImage due to PE format and/or custom cryptography for the signature. This is unrelated, but is probably worth mentioning that currently Marvin works on a new PE loader, which addresses multiple flaws and security risks of the existing implementation in EDK II. We plan to contribute it this year.

> [Jiewen] partial TRUE.
> That is RT driver specific. We do have example, that RT driver writes RT data at runtime. *some* firmware does, some does not.
> [Jiewen] FALSE. You cannot make RT data as read-only, because you may break other existing RT modules.

Like I said, some operating systems and firmwares, namely macOS/MacEFI, believe that one is not allowed to write to global variables in runtime. If you need to write anywhere, write to dynamic memory allocated as RT data. This is indeed specific, but to be honest I wonder whether this approach can be applied globally and adopted by all firmwares.

I do not expect all the people here to agree with it to be honest, but doing this will vastly simplify the permission code.

> [Jiewen] I have read 564. It seems good enhancement.
> But it seems irrelevant to this issue.
> What is the detail proposal to resolve this specific RT_CODE non-exec issue ?

While 564 indeed discusses several other issues, I am afraid that it is all connected. The relevant parts are as follows:

1. Currently a misaligned RT driver turns MAT off but is still loaded. I believe this is incorrect, and the implementation should just refuse to load such a driver.

2. Currently RT drivers with RWX sections are allowed. I believe this is incorrect, and the implementation should either refuse to load such a driver or load it them as RX.

As a result we can enforce MAT if it is enabled and refuse to boot if it cannot be used.

There is something we do not cover in this issue — UEFI time permissions of all the areas. I am not sure we should, but it is kind of worrying. Do I understand correctly, that we have RW for everything in UEFI memory map, and optionally X for select memory types depending on the PCD? This would mean that if the firmware decides to mark RT code, BS code, and Loader code as XP, only LoadImage will be able to produce executable code during UEFI time, and this will break stuff like JIT and custom image loading. In this case we can extend the protocol I suggested above to have another function to control UEFI time permissions.

Best regards,
Vitaly
Comment 32 jiewen.yao 2020-04-02 22:05:15 UTC
Comment inline:

====================
1. I request RT code and RT data allocated in any pages or by address to be treated the same way regardless of whether Memory Type Information is enabled or disabled:
- RT code is present as RT code in the memory map and gets RO+XP in the MAT.
[Jiewen] Why XP ? If you set XP, then you cannot run the code. I think only RO is enough. 

- RT data is present as RT data in the memory map and gets XP in the MAT.
[Jiewen] Agree.

- Neither of them is accessible by SMM.
[Jiewen] It brings compatibility issue.
I agree RT code should be inaccessible. But not RT data at this point of time.
I do see some RT driver uses global variable as SMM communication buffer.
I dont believe we cannot make RT data to be inaccessible, before we reach consensus in infosec team that all reference platform code does not use global variable as SMM communication buffer.
We can change RT data to be inaccessible, only after infosec team agree and all platform BIOS have been changed to eliminate such global data as SMM comm buffer usage.




2. We introduce a protocol to set the permissions of the allocated pages. The protocol will have a single function with the following prototype:

EFI_STATUS
(*EFIAPI EFI_SET_MEMORY_PERMISSIONS) (
  IN EFI_PHYSICAL_ADDRESS   Address,
  IN UINTN                  PageCount,
  IN UINT64                 Attributes
  );

[Jiewen] I am not sure if this is mandatory or optional to fix the particular issue reported here.
I am a little worried about this protocol, because there will be 2 or more masters of the MAT table. It makes things complicated.
What if one module sets the permission, and DxeCore need refresh the MAT table later? Should DxeCore cache the setting and replay? or ignore the old setting? What is the old setting has conflict with the new setting? What if module sets data region to be executable, or code region to be writable? should DxeCore reject or blindly follow?

I recommend we separate this new protocol proposal as a feature request instead of a bug fix. I think it is worth another round discussion on the behavior of the new protocol - we need think of all corner cases carefully.
Comment 33 Vitaly Cheptsov 2020-04-03 05:14:36 UTC
Jiewen,

> - RT code is present as RT code in the memory map and gets RO+XP in the MAT.
> [Jiewen] Why XP ? If you set XP, then you cannot run the code. I think only RO is enough.

I wanted to add hardening, because I believe there exists code that may allocate RT code for other purposes. This does not work without the new protocol I suggested, so I agree to make it just RO and return to this discussion in a separate issue.

> - Neither of them is accessible by SMM.
> [Jiewen] It brings compatibility issue.
> I agree RT code should be inaccessible. But not RT data at this point of time.
> I do see some RT driver uses global variable as SMM communication buffer.

For EndOfDxe allocations that should not be a problem, because we assume the drivers loaded after EndOfDxe cannot communicate with SMM. I agree to postpone changing this before EndOfDxe. After EndOfDxe this should be changed, however.

> I am a little worried about this protocol, because there will be 2 or more masters of the MAT table. It makes things complicated.
> What if one module sets the permission, and DxeCore need refresh the MAT table later? Should DxeCore cache the setting and replay? or ignore the old setting? What is the old setting has conflict with the new setting? What if module sets data region to be executable, or code region to be writable? should DxeCore reject or blindly follow?

What I want is DxeCore to install this protocol, and add user requests to the same array it currently has the loaded images. This way there will be no two masters for MAT, and it will all sync and align very cleanly.

DxeCore should blindly follow updating the permissions, as that is what the current code could do anyway by reinstalling the configuration table every time after DxeCore updates it. It will be up to the OS to decide what is adequate and what is not for it.

I agree to make this a separate issue soon afterwards this one is resolved, as I believe this new protocol is rather hardening/improvement, and the actual bug needs to be resolved first before we consider it.

---

To make sure we have agreed on the behaviour and on the changes and are ready to discuss it with other parties. I summarise everything below, please confirm that this is correct. If it is, I believe I am happy with the proposal.


1. Memory allocation before EndOfDxe(1):

+-----------+---------+---------+-----------+----------+-----------+--------+
+MemTypeInfo| Type    |   MMAP  | MAT (Snap)| MAT (Fin)| SMM Paging| VMM/OS |
+-----------+---------+---------+-----------+----------+-----------+--------+
| Irrelevant| RT code | RT code |  RT code  |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+-----------+----------+-----------+--------+
| Irrelevant| RT data | RT data |  RT data  |  RT data |    XP(2)  |   XP   |
+-----------+---------+---------+-----------+----------+-----------+--------+

2. Memory allocation after EndOfDxe(1):

+-----------+---------+---------+-----------+----------+-----------+--------+
+MemTypeInfo| Type    |   MMAP  |MAT(Snap,3)| MAT (Fin)| SMM Paging| VMM/OS |
+-----------+---------+---------+-----------+----------+-----------+--------+
|    TRUE   | RT code | RT code |  RT data  |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+-----------+----------+-----------+--------+
|    TRUE   | RT data | RT data |  RT data  |  RT data |   Unmap   |   XP   |
+-----------+---------+---------+-----------+----------+-----------+--------+
|   FALSE   | RT code | RT code |  Convent. |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+-----------+----------+-----------+--------+
|   FALSE   | RT data | RT data |  Convent. |  RT data |   Unmap   |   XP   |
+-----------+---------+---------+-----------+----------+-----------+--------+

3. Unused BIN types:

+-----------+---------+---------+-----------+----------+-----------+--------+
+MemTypeInfo| Type    |   MMAP  |MAT(Snap,1)| MAT (Fin)| SMM Paging| VMM/OS |
+-----------+---------+---------+-----------+----------+-----------+--------+
|    TRUE   | RT code | RT data |  RT data  |  RT data |   Unmap   | RO+XP  |
+-----------+---------+---------+-----------+----------+-----------+--------+
|    TRUE   | RT data | RT data |  RT data  |  RT data |   Unmap   | RO+XP  |
+-----------+---------+---------+-----------+----------+-----------+--------+

(1) Irrelevant whether this is PE image or not. Any allocation, both at any pages and by address.

(2) Perhaps communication buffers should be all reserved/ACPI, and RT data should be unmapped. I am not positive we should change it right now, however, due to the global variables you mentioned. Need to discuss that at InfoSec.

(3) When MemTypeInfo is enabled, MAT snapshot refers to unused BIN type for the relevant class of allocations (RT code or RT data).

Best regards,
Vitaly
Comment 34 Vitaly Cheptsov 2020-04-03 05:26:10 UTC
Forgot to mention an important detail about the PE images.

As you know, they are allocated as a single RT code chunk, which is then split into two RT code/RT data chunks in the MAT. For this to work you should refer to "Type" column in the the tables as the "Effective Type":

- If this is an explicit allocation of this type (RT code/data), both 1st and 2nd tables apply as is.
- If this is PE, then MMAP for effective RT data (.data section) is RT code with the rest as is.
Comment 35 jiewen.yao 2020-04-03 08:24:12 UTC
Comment on the table:

=================
1. Memory allocation before EndOfDxe(1):

+-----------+---------+---------+-----------+----------+-----------+--------+
+MemTypeInfo| Type    |   MMAP  | MAT (Snap)| MAT (Fin)| SMM Paging| VMM/OS |
+-----------+---------+---------+-----------+----------+-----------+--------+
A.Irrelevant| RT code | RT code |  RT code  |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+-----------+----------+-----------+--------+
B.Irrelevant| RT data | RT data |  RT data  |  RT data |    XP(2)  |   XP   |
+-----------+---------+---------+-----------+----------+-----------+--------+

2. Memory allocation after EndOfDxe(1):

+-----------+---------+---------+-----------+----------+-----------+--------+
+MemTypeInfo| Type    |   MMAP  |MAT(Snap,3)| MAT (Fin)| SMM Paging| VMM/OS |
+-----------+---------+---------+-----------+----------+-----------+--------+
A.   TRUE   | RT code | RT code |  RT data  |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+-----------+----------+-----------+--------+
B.   TRUE   | RT data | RT data |  RT data  |  RT data |   Unmap   |   XP   |
+-----------+---------+---------+-----------+----------+-----------+--------+
C.  FALSE   | RT code | RT code |  Convent. |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+-----------+----------+-----------+--------+
D.  FALSE   | RT data | RT data |  Convent. |  RT data |   Unmap   |   XP   |
+-----------+---------+---------+-----------+----------+-----------+--------+

3. Unused BIN types:

+-----------+---------+---------+-----------+----------+-----------+--------+
+MemTypeInfo| Type    |   MMAP  |MAT(Snap,1)| MAT (Fin)| SMM Paging| VMM/OS |
+-----------+---------+---------+-----------+----------+-----------+--------+
A.   TRUE   | RT code | RT data |  RT data  |  RT data |   Unmap   | RO+XP  |
+-----------+---------+---------+-----------+----------+-----------+--------+
B.   TRUE   | RT data | RT data |  RT data  |  RT data |   Unmap   | RO+XP  |
+-----------+---------+---------+-----------+----------+-----------+--------+

==========================

Comment for the table:
1) I think we need add below case. This is a data section in a RT driver.
+-----------+---------+---------+-----------+----------+-----------+--------+
            | RT data | RT code |  RT data  |  RT data |    XP     |   XP   |
+-----------+---------+---------+-----------+----------+-----------+--------+

2) I think 3.A MMAP is incorrect. It is RT code.

Please be aware the fact that:
1) Only DxeCore knows the BIN what is used and what is unused.
2) The SMM does not know the existence of BIN and does not know the unused memory.

As result:
1) SMM cannot distinguish 1.B, 2.B and 3.B, they are exactly same from MMAP and MAT(snap). SMM will set XP for 2.B and 3.B.
2) SMM only checks MAT(snap) as a result. The RT_Code in MMAP is useless, because it might be global data region. SMM will set XP for 2.A and 3.A.

SMM cannot set different policy with same indicator.
Comment 36 Vitaly Cheptsov 2020-04-03 10:29:43 UTC
Jiewen,

I rebuilt the table according to your clarifications. I also separated PE runtime memory to avoid extra confusion. Please check it out and let us know if it is still incorrect:

1. Memory allocation before EndOfDxe:

+-----------+---------+---------+-----------+----------+-----------+--------+
+MemTypeInfo| Type    |   MMAP  | MAT (Snap)| MAT (Fin)| SMM Paging| VMM/OS |
+-----------+---------+---------+-----------+----------+-----------+--------+
A.Irrelevant| RT code | RT code |  RT code  |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+-----------+----------+-----------+--------+
B.Irrelevant| RT data | RT data |  RT data  |  RT data |    XP     |   XP   |
+-----------+---------+---------+-----------+----------+-----------+--------+

2. Memory allocation after EndOfDxe:

+-----------+---------+---------+-----------+----------+-----------+--------+
+MemTypeInfo| Type    |   MMAP  | MAT(Snap) | MAT (Fin)| SMM Paging| VMM/OS |
+-----------+---------+---------+-----------+----------+-----------+--------+
A.   TRUE   | RT code | RT code |  RT data  |  RT code |    XP     |   RO   |
+-----------+---------+---------+-----------+----------+-----------+--------+
B.   TRUE   | RT data | RT data |  RT data  |  RT data |    XP     |   XP   |
+-----------+---------+---------+-----------+----------+-----------+--------+
C.  FALSE   | RT code | RT code |  Convent. |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+-----------+----------+-----------+--------+
D.  FALSE   | RT data | RT data |  Convent. |  RT data |   Unmap   |   XP   |
+-----------+---------+---------+-----------+----------+-----------+--------+

3. Unused BIN types:

+-----------+---------+---------+-----------+----------+-----------+--------+
+MemTypeInfo| Type    |   MMAP  | MAT(Snap) | MAT (Fin)| SMM Paging| VMM/OS |
+-----------+---------+---------+-----------+----------+-----------+--------+
A.   TRUE   | RT code | RT code |  RT data  |  RT data |    XP     | RO+XP  |
+-----------+---------+---------+-----------+----------+-----------+--------+
B.   TRUE   | RT data | RT data |  RT data  |  RT data |    XP     | RO+XP  |
+-----------+---------+---------+-----------+----------+-----------+--------+

4. Memory allocated for PE runtime driver before EndOfDxe:

+-----------+---------+---------+-----------+----------+-----------+--------+
+MemTypeInfo| Section |   MMAP  | MAT(Snap) | MAT (Fin)| SMM Paging| VMM/OS |
+-----------+---------+---------+-----------+----------+-----------+--------+
A.Irrelevant| .text   | RT code |  RT code  |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+-----------+----------+-----------+--------+
B.Irrelevant| .data   | RT code |  RT data  |  RT data |    XP     |   XP   |
+-----------+---------+---------+-----------+----------+-----------+--------+

5. Memory allocated for PE runtime driver after EndOfDxe:

+-----------+---------+---------+-----------+----------+-----------+--------+
+MemTypeInfo| Section |   MMAP  | MAT(Snap) | MAT (Fin)| SMM Paging| VMM/OS |
+-----------+---------+---------+-----------+----------+-----------+--------+
A.   TRUE   | .text   | RT code |  RT data  |  RT code |    XP     |   RO   |
+-----------+---------+---------+-----------+----------+-----------+--------+
B.   TRUE   | .data   | RT code |  RT data  |  RT data |    XP     |   XP   |
+-----------+---------+---------+-----------+----------+-----------+--------+
C.  FALSE   | .text   | RT code |  Convent. |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+-----------+----------+-----------+--------+
D.  FALSE   | .data   | RT code |  Convent. |  RT data |   Unmap   |   XP   |
+-----------+---------+---------+-----------+----------+-----------+--------+

---

If the table matches what you expect us to have, then I regret to tell you that we reevaluated the consequences with Marvin, and the thing does not sound right to us.

In our opinion it is not correct that SMM Paging permissions only happen based on the memory type in MAT snapshot. While we do not see absolutely disastrous risks in doing this, the risk of having user-controlled memory mapped in SMM even as non-executable is the a helper in the construction of ROP exploits. In the new table the problematic cases are 2.A, 2.B, 3.A, 3.B, 5.A, and 5.B. None of these need to be mapped in SMM at all, and we request this to be addressed as a part of this issue.

We and Laszlo have already mentioned some of the potential ways to resolve this. I provide a more complete list below ranked from best to worst in our opinion:

I. Not reporting unused BIN in MAT till after EndOfDxe signaling returns.

This would be the easiest and, if doable in a clean way, our prefered solution. Currently it seems to be done by an event handler, so maybe the code needs to be changed to do something of the following:

A. Reconstruct MAT right after the signal call instead (and include unused BIN RT data in it).
B. Expose unused BIN information in a separate Configuration Table in some other format.
C. Use trailing information with MAT entries containing.

The C variant will involve adding extra vendor-specific fields marking that the area is part of the unused BIN. We have DescriptorSize and all processing code should be well aware there is not guarantee it is constant, so this should hopefully be fine as well.

II. Turning MAT snapshot grabbing into an SMI call from DxeCore right before EndOfDxe.

This change makes SMM table grabbing to become DxeCore driven. DxeCore will make an SMI call that will report the geometry of unused parts in RT code and RT data BINs. This way SMM will get the addresses and sizes of RT data memory present in the MAT, which it should not map. So something of the following will do:

A. Send a small struct containing RT code/data BINs unused areas addresses and sizes.
B. Send an address to a copy of the postprocessed MAT table with unused BINs marked as conventional memory.

---

The alternative proposal once mentioned here in changing all the SMM communication buffers to have a different type does not really work quite right.

1) It is unclear which memory type one should choose for the SMM communication buffer.
- If it is a new type, the spec will need to be updated, and that may also require the updating of the operating systems.
- If it is an existing type, that may break the operating systems as well. For example, macOS does not assign a virtual address to the Reserved memory and considers EFI_RUNTIME on a Reserved entry to be an error.

2) Several firmwares, including EDK II itself[1] or American Megatrends, use RT data type for the SMM communication buffers, and updating them might be tedious. If EDK II is something we can update, vendors are slow and hard to communicate with.

3) Several firmwares according to your words even use global variables for the SMM communication buffers, and I am not positive they will also update immediately.

All in all, while something like AcpiNvs memory can be made to be used for SMM communication buffers, I am not positive we should really make such drastic changes. To reduce the attach surface both I and II could report the MAT with not only dropped unused BINs but also with PE RT data missing. Thils will leave directly allocated RT data (2) useable for SMM communication buffers, and global variables (3) to be prohibited.

Best regards,
Vitaly

[1] https://github.com/tianocore/edk2/blob/f73c9adfc68c7ff189b17cb2531a071d4b30cd75/MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmmRuntimeDxe.c#L1573
Comment 37 jiewen.yao 2020-04-03 19:30:05 UTC
Thanks Vitaly.
I would like to clarify one more thing:
The table you list is *current* status, or the *proposal* for the fix?

Or, if you can build 2 tables - one is *current*, second is *proposal*. that can help us understand more.


=====================
the risk of having user-controlled memory mapped in SMM even as non-executable is the a helper in the construction of ROP exploits.

 In the new table the problematic cases are 2.A, 2.B, 3.A, 3.B, 5.A, and 5.B. None of these need to be mapped in SMM at all, and we request this to be addressed as a part of this issue.
=====================
I agree the *risk* in above statement. In general:
1) I am OK to eliminate the .data section. We need change the SMM drivers that uses global data as comm buffer.
I would like to listen the feedback from other people in infosec group, to see if there is any concern from compatibility consideration.
2) I am OK to create a new table to report unused memory to SMM.
3) Current valid SMM comm buffer type is: RT Data, ACPINvs, Reserved. I do see all cases before. They must be there. But we can discuss if .data belongs to RT data,  (YES in MAT, NO in MEMMAP), I am open on that.

To make thing easier, is that possible that you create a patch for the idea?
I think it will help to let other people integrate and try to see if that will break their BIOS, as part of full assessment.
Comment 38 Marvin Häuser 2020-04-03 21:43:53 UTC
Jiewen,

The tables display what we have. The changes we propose are discussed in the paragraphs below.

1) Thanks.
2) Personally, I don't see why a separate table is required. Why can we not take the omission route with reconstruction after EndOfDxe? To me, it actually makes the most sense, as all firmware consumers should not (thinking of DXE-time mapping) or just don't require unused BIN information. Even for the OS itself, it seems like mostly a S4 fix to report it.
3) We agree, just it would be a good idea to not include PE .data as handled by 1)

Thanks for your efforts, we hope InfoSec will yield some good results. As for patches, sorry, we will have to discuss this tomorrow, as it is late and we have plenty of other tasks at the same time.

Regards,
Marvin
Comment 39 jiewen.yao 2020-04-04 03:23:51 UTC
It seems I misunderstand meaning of the table.
But I don't believe it reflects the current status.
E.g. in your case (2.C) the OS/VMM is NOT RO.
Comment 40 Vitaly Cheptsov 2020-04-04 03:34:05 UTC
Jiewen, actually it is some intermediate variant. Initially the table was meant to represent the result we want to achieve, however, during the discussion I corrected it to represent your suggested fix.

This intermediate representation resolves the problem with driver loading after EndOfDxe, but it still has security flaws (2.A, 2.B, 3.A, 3.B, 5.A, and 5.B), which we want to resolve in the final solution.

As for the "what we have" table, it is kind of difficult, because I observe very different results from different platforms. OEM Vendors made a literal mess of what exists in EDK II, and the results vary strongly even on similar boards on the same chipset (e.g. Z390). Virtual machines behave differently to physical hardware as well. I am not sure I can build a good table of what happens now, maybe you can do that?
Comment 41 Marvin Häuser 2020-04-04 06:31:52 UTC
Jiewen,

Thanks for your reply.

I wanted to quickly confirm the clarification so there is no confusion. "What we have" was badly ambiguous and meant "what we based on", i.e. not what we propose. Sorry for the bad wording.
We based on the fixed state regarding RT driver loading after EndOfDxe (i.e. without both the RWX regression and the original skipping of proper RT driver inclusion in MAT after EndOfDxe), refined by your previous suggestions. This is the intermediate result for the first problem, loading RT drivers after EndOfDxe.
We believe the result can be further tweaked to increase security, that is our current concern and the second, more or less logically separate problem. However, as fixing SMM security is also an implication of the bug fix, we really consider both as one issue with two aspects, which are to be fixed in one go.

Therefore, after we fix RT driver loading in MAT, we will have the intermediate table[1]. To then resolve the security issues, the current concern, we will have the following table (please note it does not yet reflect the implementation of SMM notification we choose as we have not come to a definite agreement yet):

1. Memory allocation before EndOfDxe:

+-----------+---------+---------+------------+----------+-----------+--------+
+MemTypeInfo|   Type  |   MMAP  | MAT (Snap) | MAT(Fin) | SMM Paging| VMM/OS |
+-----------+---------+---------+------------+----------+-----------+--------+
A.Irrelevant| RT code | RT code | RT code(4) |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+------------+----------+-----------+--------+
B.Irrelevant| RT data | RT data | RT data(4) |  RT data |   XP (1)  |   XP   |
+-----------+---------+---------+------------+----------+-----------+--------+

2. Memory allocation after EndOfDxe:

+-----------+---------+---------+--------------+----------+-----------+--------+
+MemTypeInfo|   Type  |   MMAP  |  MAT (Snap)  | MAT(Fin) | SMM Paging| VMM/OS |
+-----------+---------+---------+--------------+----------+-----------+--------+
A.   TRUE   | RT code | RT code |  RT data(4)  |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+--------------+----------+-----------+--------+
B.   TRUE   | RT data | RT data |  RT data(4)  |  RT data |   Unmap   |   XP   |
+-----------+---------+---------+--------------+----------+-----------+--------+
C.  FALSE   | RT code | RT code |    Convent.  |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+--------------+----------+-----------+--------+
D.  FALSE   | RT data | RT data |    Convent.  |  RT data |   Unmap   |   XP   |
+-----------+---------+---------+--------------+----------+-----------+--------+

3. Unused BIN types:

+-----------+---------+---------+------------+----------+-----------+--------+
+MemTypeInfo|   Type  |   MMAP  | MAT (Snap) | MAT(Fin) | SMM Paging| VMM/OS |
+-----------+---------+---------+------------+----------+-----------+--------+
A.   TRUE   | RT code | RT code | RT data(4) |  RT data |   Unmap   | RO+XP  |
+-----------+---------+---------+------------+----------+-----------+--------+
B.   TRUE   | RT data | RT data | RT data(4) |  RT data |   Unmap   | RO+XP  |
+-----------+---------+---------+------------+----------+-----------+--------+

4. Memory allocated for PE runtime driver before EndOfDxe:

+-----------+---------+---------+------------+----------+-----------+--------+
+MemTypeInfo| Section |  MMAP   | MAT (Snap) | MAT(Fin) | SMM Paging| VMM/OS |
+-----------+---------+---------+------------+----------+-----------+--------+
A.Irrelevant| .text   | RT code | RT code(4) |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+------------+----------+-----------+--------+
B.Irrelevant| .data   | RT code | RT data(4) |  RT data |   XP (2)  |   XP   |
+-----------+---------+---------+------------+----------+-----------+--------+

5. Memory allocated for PE runtime driver after EndOfDxe:

+-----------+---------+---------+---------------------+----------+-----------+--------+
+MemTypeInfo| Section |   MMAP  |      MAT(Snap)      | MAT(Fin) | SMM Paging| VMM/OS |
+-----------+---------+---------+---------------------+----------+-----------+--------+
A.   TRUE   | .text   | RT code | RT data/Convent.(3) |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+---------------------+----------+-----------+--------+
B.   TRUE   | .data   | RT code | RT data/Convent.(3) |  RT data |   Unmap   |   XP   |
+-----------+---------+---------+---------------------+----------+-----------+--------+
C.  FALSE   | .text   | RT code |       Convent.      |  RT code |   Unmap   |   RO   |
+-----------+---------+---------+---------------------+----------+-----------+--------+
D.  FALSE   | .data   | RT code |       Convent.      |  RT data |   Unmap   |   XP   |
+-----------+---------+---------+---------------------+----------+-----------+--------+

Please note that "Conventional" here effectively means "not present".

(1) Can be unmapped if we decide that SMM communication buffers must not be RT data at all.
(2) Can be unmapped if we decide that global variables cannot be used as SMM communication buffers. This is implied by (1).
(3) The effective type depends on whether the image could be fit into an unused BIN (RT data) or not (Conventional). Implies (4).
(4) May become Conventional if we go with route 1, dropping one or multiple out of a) unused BIN regions, b) RT code, c) PE .data, and d) RT data from MAT entirely up until EndOfDxe. Depends on (1) and (2).

Please let us know which issues you see with route 1, dropping from MAT, as indicated by (4).

[1] https://bugzilla.tianocore.org/show_bug.cgi?id=2643#c36

Best regards,
Marvin
Comment 42 jiewen.yao 2020-04-04 20:30:53 UTC
Hello
I feel above tables are too complicated.
I try to merge them into one table as below.


1. Here is the summary of current status I have observed :

+------------+----------+---------+---------+--------+--------+--------+
+    Type    |MMAP(Snap)|MMAP(Fin)|MAT(Snap)|MAT(Fin)|   SMM  |   OS   |
+------------+----------+---------+---------+--------+--------+--------+
A. PE .text  | RT code  | RT code | RT code |RT code |RP+WP+XP|   WP   |
+------------+----------+---------+---------+--------+--------+--------+
B. PE .data  | RT code  | RT code | RT data |RT data |   XP   |   XP   |
+------------+----------+---------+---------+--------+--------+--------+
C. RT Data   | RT data  | RT data | RT data |RT data |   XP   |   XP   |
+------------+----------+---------+---------+--------+--------+--------+
D.Unused Code| RT code  | RT code | RT data |RT data |   XP   |   XP   |
+------------+----------+---------+---------+--------+--------+--------+
E.Unused Data| RT data  | RT data | RT data |RT data |   XP   |   XP   |
+------------+----------+---------+---------+--------+--------+--------+
F. Late .text| RT code  | RT code | RT data |RT data |   XP*  |   XP*  |
+------------+----------+---------+---------+--------+--------+--------+
G. Late .data| RT code  | RT code | RT data |RT data |   XP   |   XP   |
+------------+----------+---------+---------+--------+--------+--------+
H. Late Data | RT data  | RT data | RT data |RT data |   XP   |   XP   |
+------------+----------+---------+---------+--------+--------+--------+
I. Late .text| Convent  | RT code |   N/A   |RT code |RP+WP+XP|  ---*  |
+------------+----------+---------+---------+--------+--------+--------+
J. Late .data| Convent  | RT code |   N/A   |RT code |RP+WP+XP|  ---*  |
+------------+----------+---------+---------+--------+--------+--------+
K. Late Data | Convent  | RT data |   N/A   |RT data |RP+WP+XP|   XP   |
+------------+----------+---------+---------+--------+--------+--------+

New late code case impacts F, G, I, J.

Issue we found: (with * in the table)
  F.SMM == XP : SMM can access the late .text
  F.OS == XP  :  OS cannot run the late .text
  I.OS == --- :  XP/WP violation
  J.OS == --- :  XP/WP violation

Please be aware that we have madatory rule below:
Because SMM only refer to snapshort:
  R1: D.SMM == F.SMM == G.SMM
  R2: E.SMM == H.SMM
Because SMM cannot modify OS code:
  R3: if X.OS == WP, then X.SMM == RP+WP+XP

2. Below is my thought and proposal in short term:

To fix the issue, we need 
  F.OS == WP
  G.OS == XP
  I.OS == WP
  J.OS == XP

According to R3, we have
  F.SMM == RP+WP+XP
  I.SMM == RP+WP+XP

According to R1, we have
  D.SMM == RP+WP+XP
  G.SMM == RP+WP+XP

Summary of my proposal below:
+------------+----------+---------+---------+--------+--------+--------+
+    Type    |MMAP(Snap)|MMAP(Fin)|MAT(Snap)|MAT(Fin)|   SMM  |   OS   |
+------------+----------+---------+---------+--------+--------+--------+
A. PE .text  | RT code  | RT code | RT code |RT code |RP+WP+XP|   WP   |
+------------+----------+---------+---------+--------+--------+--------+
B. PE .data  | RT code  | RT code | RT data |RT data |   XP   |   XP   |
+------------+----------+---------+---------+--------+--------+--------+
C. RT Data   | RT data  | RT data | RT data |RT data |   XP   |   XP   |
+------------+----------+---------+---------+--------+--------+--------+
D.Unused Code| RT code  | RT code | RT code |RT data |RP+WP+XP|  WP+XP |
+------------+----------+---------+---------+--------+--------+--------+
E.Unused Data| RT data  | RT data | RT data |RT data |   XP   |  WP+XP |
+------------+----------+---------+---------+--------+--------+--------+
F. Late .text| RT code  | RT code | RT code |RT code |RP+WP+XP|   WP   |
+------------+----------+---------+---------+--------+--------+--------+
G. Late .data| RT code  | RT code | RT code |RT data |RP+WP+XP|   XP   |
+------------+----------+---------+---------+--------+--------+--------+
H. Late Data | RT data  | RT data | RT data |RT data |   XP   |   XP   |
+------------+----------+---------+---------+--------+--------+--------+
I. Late .text| Convent  | RT code |   N/A   |RT code |RP+WP+XP|   WP   |
+------------+----------+---------+---------+--------+--------+--------+
J. Late .data| Convent  | RT code |   N/A   |RT data |RP+WP+XP|   XP   |
+------------+----------+---------+---------+--------+--------+--------+
K. Late Data | Convent  | RT data |   N/A   |RT data |RP+WP+XP|   XP   |
+------------+----------+---------+---------+--------+--------+--------+

3. Future items:

3.1. For SMM, we mark .data section to be RP+WP+XP.
  B.SMM = RP+WP+XP.
This is agreed to eliminate the compatibility issue.

3.2. For SMM, we mark unused mem to be RP+WP+XP.
  E.SMM = H.SMM = RP+WP+XP.
This is agreed to eliminate the compatibility issue.

3.3. For OS, we split .data section into .data and .rdata.
  B.OS.data == XP
  B.OS.rdata == XP+WP
  G.OS.data == XP
  G.OS.rdata == XP+WP
  J.OS.data == XP
  J.OS.rdata == XP+WP
This is agreed to eliminate the compatibility issue.
Comment 43 Vitaly Cheptsov 2020-04-05 05:38:54 UTC
Jiewen,

The new tables are helpful, we mostly agree with you.

Could you confirm that the rulesets represent only the current state of the art? I.e. the immediate fix is not supposed to change this, but in the future R1 and R2 will likely be lifted as we make SMM aware of unused BINs / PE .data.

Other than that there are two undiscussed issues to be resolved in the short-term fix:


I. 1.D is correct for non-borked Intel but incorrect for VMware/OVMF:
+------------+----------+---------+---------+--------+--------+--------+
D.Unused Code| RT code  | RT code | RT code |RT code |RP+WP+XP|  ---*  |
+------------+----------+---------+---------+--------+--------+--------+

Therefore we need to update the fix list with:
  D.OS = XP

I mentioned "non-borked" Intel, because in the wild we were able to find Gigabyte Z390 boards, which just corrupt MAT upon late RT driver loading, producing MAT with duplicated entries and some other entries missing. Unfortunately this is not the only example.

II. RT code is not in the table, but we can load executable RT code ourselves past EndOfDxe:
+------------+----------+---------+---------+--------+--------+--------+
+    Type    |MMAP(Snap)|MMAP(Fin)|MAT(Snap)|MAT(Fin)|   SMM  |   OS   |
+------------+----------+---------+---------+--------+--------+--------+
L. RT Code   | RT code  | RT code | RT code |RT code |RP+WP+XP|   WP   |
+------------+----------+---------+---------+--------+--------+--------+
M. Late Code | RT code  | RT code | RT data |RT data*|   XP*  |   XP*  |
+------------+----------+---------+---------+--------+--------+--------+
N. Late Code | Convent  | RT code |   N/A   |RT code |RP+WP+XP|  ---*  |
+------------+----------+---------+---------+--------+--------+--------+

Therefore we need to update the fix list with:
  M.OS = RO
  N.OS = RO

As per your ruleset:
  M.MAT(Fin) = RT code
  M.SMM = RP+WP+XP


Regarding future work, it is definitely needed and the exact implementation detail will vary. We agree that the changes you mentioned will be needed in some way, and I believe Marvin outlined a very good list of the possible ways for the implementation to follow. However, for the time being I want to postpone discussing the future as we resolve the immediate and more serious problem first.

Best regards,
Vitaly
Comment 44 jiewen.yao 2020-04-05 22:43:42 UTC
comment inline:

The new tables are helpful, we mostly agree with you.
[Jiewen] Thank you!

Could you confirm that the rulesets represent only the current state of the art? I.e. the immediate fix is not supposed to change this, but in the future R1 and R2 will likely be lifted as we make SMM aware of unused BINs / PE .data.

[Jiewen] Yes. Rule sets are for current code without fix and the short term fix I proposed in the second table.
It may be lifted later if we discuss and agree on a more strict rule.



I. 1.D is correct for non-borked Intel but incorrect for VMware/OVMF:
+------------+----------+---------+---------+--------+--------+--------+
D.Unused Code| RT code  | RT code | RT code |RT code |RP+WP+XP|  ---*  |
+------------+----------+---------+---------+--------+--------+--------+
Therefore we need to update the fix list with:
  D.OS = XP
I mentioned "non-borked" Intel, because in the wild we were able to find Gigabyte Z390 boards, which just corrupt MAT upon late RT driver loading, producing MAT with duplicated entries and some other entries missing. Unfortunately this is not the only example.

[Jiewen] First, I would like to clarify, the unused code means the OS final result. If you load a RT driver later, we need refer to F~K.
Second, I am surprised that you can find duplicated entry and missing entry. Is that reproducible on OVMF ? If yes, I think we need fix.



II. RT code is not in the table, but we can load executable RT code ourselves past EndOfDxe:
+------------+----------+---------+---------+--------+--------+--------+
+    Type    |MMAP(Snap)|MMAP(Fin)|MAT(Snap)|MAT(Fin)|   SMM  |   OS   |
+------------+----------+---------+---------+--------+--------+--------+
L. RT Code   | RT code  | RT code | RT code |RT code |RP+WP+XP|   WP   |
+------------+----------+---------+---------+--------+--------+--------+
M. Late Code | RT code  | RT code | RT data |RT data*|   XP*  |   XP*  |
+------------+----------+---------+---------+--------+--------+--------+
N. Late Code | Convent  | RT code |   N/A   |RT code |RP+WP+XP|  ---*  |
+------------+----------+---------+---------+--------+--------+--------+
Therefore we need to update the fix list with:
  M.OS = RO
  N.OS = RO
As per your ruleset:
  M.MAT(Fin) = RT code
  M.SMM = RP+WP+XP
[Jiewen] I think I split the RT Code into PE .text and PE .data. That is why i remove RT code entry. Is there any other case not covered by PE .text and PE .data. Or is there any misunderstanding?
Here I mean RO == WP. (Read-Only == Write Protected)



Regarding future work, it is definitely needed and the exact implementation detail will vary. We agree that the changes you mentioned will be needed in some way, and I believe Marvin outlined a very good list of the possible ways for the implementation to follow. However, for the time being I want to postpone discussing the future as we resolve the immediate and more serious problem first.
[Jiewen] Sounds good.
I dont think we have enough time for 202005 stable.
We had better discuss after this short term fix is done, to see if and how we lift the rule.
Comment 45 Vitaly Cheptsov 2020-04-05 23:11:21 UTC
> [Jiewen] First, I would like to clarify, the unused code means the OS final result. If you load a RT driver later, we need refer to F~K.

Yes, no mistake here, OVMF for some reason does not mark unused RT code BIN as write-protected. I.e. unused RT code BIN is RWX in the final OS-visible MAT. Not sure why, this is not how the real hardware behaves. Perhaps OVMF support for BINs is incomplete or flawed.

> Second, I am surprised that you can find duplicated entry and missing entry. Is that reproducible on OVMF ? If yes, I think we need fix.

For the good nothing like this exists in OVMF. We suspect that this vendor (GIGABYTE or AMI) made some weird custom patches to their private EDK II tree. I do not think it is worth exploring, but just in case you want to have a look on how bad it can be: Z390 MAT dump before loading RT drver[1] and MAT dump after loading RT driver at 3EE6A000h via normal LoadImage[2].

> [Jiiewen] I think I split the RT Code into PE .text and PE .data. That is why i remove RT code entry. Is there any other case not covered by PE .text and PE .data. > Or is there any misunderstanding?
> Here I mean RO == WP. (Read-Only == Write Protected)

I am talking of the case when something after EndOfDxe allocates RT code memory directly and copies some generated machine code to it. This copied code will need to be executable by the OS, but can indeed by read-only. I do not think it is covered anywhere at the moment.

[1] https://github.com/acidanthera/bugtracker/files/4431231/log.txt
[2] https://github.com/acidanthera/bugtracker/files/4431233/log2.txt
Comment 46 jiewen.yao 2020-04-05 23:37:04 UTC
Thanks for the clarification.
You are right. I missed the arbitrary RT code - good to discuss here.

I naturally did not take the arbitrary code into account, because we dont have a way to verify the integrity of the code, so we just ignore it. When we do the PE scan, this region is marked as XP, similar to PE .data section in both current implementation and short term fix proposal. (Same to B/G/J)

You are the first people that tell us this capability is needed.

In current implementation (append to table 1):
+------------+----------+---------+---------+--------+--------+--------+
+    Type    |MMAP(Snap)|MMAP(Fin)|MAT(Snap)|MAT(Fin)|   SMM  |   OS   |
+------------+----------+---------+---------+--------+--------+--------+
L. RT Code   | RT code  | RT code | RT data |RT data |   XP   |   XP   |
+------------+----------+---------+---------+--------+--------+--------+
M. Late Code | RT code  | RT code | RT data |RT data |   XP   |   XP   |
+------------+----------+---------+---------+--------+--------+--------+
N. Late Code | Convent  | RT code |   N/A   |RT code |RP+WP+XP|  ---*  |
+------------+----------+---------+---------+--------+--------+--------+

My proposal in the short term fix (append to table 2)
+------------+----------+---------+---------+--------+--------+--------+
+    Type    |MMAP(Snap)|MMAP(Fin)|MAT(Snap)|MAT(Fin)|   SMM  |   OS   |
+------------+----------+---------+---------+--------+--------+--------+
L. RT Code   | RT code  | RT code | RT data |RT data |   XP   |   XP   |
+------------+----------+---------+---------+--------+--------+--------+
M. Late Code | RT code  | RT code | RT code |RT data |RP+WP+XP|   XP   |
+------------+----------+---------+---------+--------+--------+--------+
N. Late Code | Convent  | RT code |   N/A   |RT data |RP+WP+XP|   XP   |
+------------+----------+---------+---------+--------+--------+--------+

I agree with you that we had better add this entry to clarify the behavior.

Personally, I am worried about the security concern for those arbitrary code execution capability, if you want to grant execution capability.
I would like to hear the opinion from infosec team.

Is that something you need in the short term fix, or we can discuss that in long term solution ?
Comment 47 Vitaly Cheptsov 2020-04-06 03:59:18 UTC
Jiewen,

The suggested approach will not do for the short-term fix, as we indeed load custom runtime code (I think I mentioned it before, when I said that we plan to upstream a new PE loader to EDK II). The reason we need to load it manually is because it is signed differently and can be in a different format from PE.

There is no security risk for doing this, as the user (i.e. bootloader) can update user's MAT on its own in any case. As long as for SMM this code remains not accessible, there is no risk in marking it as executable and read only for the OS.

We do not need to write to allocated RT code at runtime, so we do not need any kind of .text/.data splitting like PE does normally. Thus the newly added entries after a short-term fix can just be as follows:

+------------+----------+---------+---------+--------+--------+--------+
+    Type    |MMAP(Snap)|MMAP(Fin)|MAT(Snap)|MAT(Fin)|   SMM  |   OS   |
+------------+----------+---------+---------+--------+--------+--------+
L. RT Code   | RT code  | RT code | RT data |RT data |   XP   |   WP   |
+------------+----------+---------+---------+--------+--------+--------+
M. Late Code | RT code  | RT code | RT code |RT data |RP+WP+XP|   WP   |
+------------+----------+---------+---------+--------+--------+--------+
N. Late Code | Convent  | RT code |   N/A   |RT data |RP+WP+XP|   WP   |
+------------+----------+---------+---------+--------+--------+------—+

As for no requests about fixing it, I am afraid nobody noticed so far, and we are roughly the first people to face it. Currently almost all firmwares just disable MAT as long as they see anybody attempting to load an unaligned RT driver, and I do not think many people even know about MAT or that PE runtime drivers built with EDK II have broken alignment by default.

Best regards,
Vitaly
Comment 48 Vincent Zimmer 2020-05-06 11:27:11 UTC
from 4/1/2020 infosec meeting:

Bret B. from MS to take follow-up investigation, esp given comment 20 fragment "I prefer we get feedback from Microsoft to see if VBS is happy with that."

keep within infosec
Comment 49 Vincent Zimmer 2020-05-06 12:31:21 UTC
5/6/2020 infosec:

Bret still working the discussion through MS
Comment 50 Vincent Zimmer 2020-06-15 18:20:06 UTC
from last meeting https://edk2.groups.io/g/infosec/topic/infosec_meeting_minutes_june/74654606?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,74654606 we didn't have opportunity to discuss w/ Bret @ MS. Await MS feedback.
Comment 51 Vincent Zimmer 2020-06-15 18:20:52 UTC
from last meeting https://edk2.groups.io/g/infosec/topic/infosec_meeting_minutes_june/74654606?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,74654606 we didn't have opportunity to discuss w/ Bret @ MS. Await MS feedback.
Comment 52 John Mathews 2020-07-01 12:23:00 UTC
Still need feedback on this from Bret @ MS.
We didn't get attendance in the 7/1 Infosec mtg.
Comment 53 John Mathews 2020-07-27 14:40:04 UTC
If no community concerns raised by August 5th Infosec mtg, planning to open this to public BZ and treat as normal bug.
Comment 54 Eric Johnson 2020-09-02 13:00:32 UTC
(In reply to John Mathews from comment #53)
> If no community concerns raised by August 5th Infosec mtg, planning to open
> this to public BZ and treat as normal bug.

The current plan is to open this issue up on 7 October unless further feedback is provided.
Comment 55 John Mathews 2020-11-04 13:02:33 UTC
Moving this to public, per discussion in the 11/4 Infosec meeting.
Comment 56 Rebecca Cran 2023-03-12 12:21:35 UTC
I'm not sure if this is related, but I came across this ticket because gcc 12 complains when building BHYVE:

/usr/local/bin/ld: warning: InitializeFpu.obj: missing .note.GNU-stack section implies executable stack
/usr/local/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
/usr/local/bin/ld: warning: /home/bcran/src/uefi/edk2/Build/BhyveX64/RELEASE_GCC5/X64/OvmfPkg/AmdSevDxe/AmdSevDxe/DEBUG/AmdSevDxe.dll has a LOAD segment with RWX permissions