Mapping MMIO region write-back does not work

I want all read & write requests to a PCIe device to be cached by CPU caches. However, it does not work as I expected.

These are my assumptions about write-back MMIO regions:

1. Writes to the PCIe device happen only on cache write-back.
2. The size of TLP payloads is the cache block size (64B).

However, the captured TLPs do not follow my assumptions:

1. Writes to the PCIe device happen on every write to the MMIO region.
2. The size of TLP payloads is 1B.

I write 8 bytes of 0xff to the MMIO region with the following user-space program and device driver.

Part of User Program

struct pcie_ioctl ioctl_control;
ioctl_control.bar_select = BAR_ID;
ioctl_control.num_bytes_to_write = atoi(argv[1]);
if (ioctl(fd, IOCTL_WRITE_0xFF, &ioctl_control) < 0)
 printf("ioctl failed\n");

Part of Device Driver

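case IOCTL_WRITE_0xFF: {
 int i;
 char *buff;
 struct pci_cdev_struct *pci_cdev = pci_get_drvdata(fpga_pcie_dev.pci_device);

 /* copy_from_user() returns the number of bytes it could NOT copy. */
 if (copy_from_user(&ioctl_control, (void __user *)arg, sizeof(ioctl_control)))
  return -EFAULT;

 buff = kmalloc(ioctl_control.num_bytes_to_write, GFP_KERNEL);
 if (!buff)
  return -ENOMEM;

 /* Fill the staging buffer with 0xff. */
 for (i = 0; i < ioctl_control.num_bytes_to_write; i++)
  buff[i] = 0xff;

 /* Plain memcpy() into the BAR mapping; this is the access pattern captured below. */
 memcpy(pci_cdev->bar[ioctl_control.bar_select], buff, ioctl_control.num_bytes_to_write);
 kfree(buff);
 break;
}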

I modified the MTRRs to make the corresponding MMIO region write-back. The MMIO region starts at 0x0c7300000 and its length is 0x100000 (1MB). The following are the cat /proc/mtrr results for each policy. Note that I gave each region its own, non-overlapping MTRR entry.

Uncacheable

reg00: base=0x080000000 ( 2048MB), size= 1024MB, count=1: uncachable
reg01: base=0x380000000000 (58720256MB), size=524288MB, count=1: uncachable
reg02: base=0x0c0000000 ( 3072MB), size= 64MB, count=1: uncachable
reg03: base=0x0c4000000 ( 3136MB), size= 32MB, count=1: uncachable
reg04: base=0x0c6000000 ( 3168MB), size= 16MB, count=1: uncachable
reg05: base=0x0c7000000 ( 3184MB), size= 1MB, count=1: uncachable
reg06: base=0x0c7100000 ( 3185MB), size= 1MB, count=1: uncachable
reg07: base=0x0c7200000 ( 3186MB), size= 1MB, count=1: uncachable
reg08: base=0x0c7300000 ( 3187MB), size= 1MB, count=1: uncachable
reg09: base=0x0c7400000 ( 3188MB), size= 1MB, count=1: uncachable

Write-combining

reg00: base=0x080000000 ( 2048MB), size= 1024MB, count=1: uncachable
reg01: base=0x380000000000 (58720256MB), size=524288MB, count=1: uncachable
reg02: base=0x0c0000000 ( 3072MB), size= 64MB, count=1: uncachable
reg03: base=0x0c4000000 ( 3136MB), size= 32MB, count=1: uncachable
reg04: base=0x0c6000000 ( 3168MB), size= 16MB, count=1: uncachable
reg05: base=0x0c7000000 ( 3184MB), size= 1MB, count=1: uncachable
reg06: base=0x0c7100000 ( 3185MB), size= 1MB, count=1: uncachable
reg07: base=0x0c7200000 ( 3186MB), size= 1MB, count=1: uncachable
reg08: base=0x0c7300000 ( 3187MB), size= 1MB, count=1: write-combining
reg09: base=0x0c7400000 ( 3188MB), size= 1MB, count=1: uncachable

Write-back

reg00: base=0x080000000 ( 2048MB), size= 1024MB, count=1: uncachable
reg01: base=0x380000000000 (58720256MB), size=524288MB, count=1: uncachable
reg02: base=0x0c0000000 ( 3072MB), size= 64MB, count=1: uncachable
reg03: base=0x0c4000000 ( 3136MB), size= 32MB, count=1: uncachable
reg04: base=0x0c6000000 ( 3168MB), size= 16MB, count=1: uncachable
reg05: base=0x0c7000000 ( 3184MB), size= 1MB, count=1: uncachable
reg06: base=0x0c7100000 ( 3185MB), size= 1MB, count=1: uncachable
reg07: base=0x0c7200000 ( 3186MB), size= 1MB, count=1: uncachable
reg08: base=0x0c7300000 ( 3187MB), size= 1MB, count=1: write-back
reg09: base=0x0c7400000 ( 3188MB), size= 1MB, count=1: uncachable

The following are waveform captures of an 8B write under each policy, taken with the integrated logic analyzer (ILA). Watch pcie_endpoint_litepcietlpdepacketizer_tlp_req_payload_dat while pcie_endpoint_litepcietlpdepacketizer_tlp_req_valid is asserted. The number of packets equals the number of times pcie_endpoint_litepcietlpdepacketizer_tlp_req_valid is asserted in each waveform.

Uncacheable: link -> correct, 1B x 8 packets

Write-combining: link -> correct, 8B x 1 packet

Write-back: link -> unexpected, 1B x 8 packets

The system configuration is as follows.

CPU: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz

OS: Linux kernel 4.15.0-38

PCIe Device: Xilinx FPGA KC705 programmed with litepcie

Related Links

Generating a 64-byte read PCIe TLP from an x86 CPU

How to Implement a 64B PCIe* Burst Transfer on Intel® Architecture

Write Combining Buffer Out of Order Writes and PCIe

Do Ryzen support write-back caching for Memory Mapped IO (through PCIe interface)?

MTRR (Memory Type Range Register) control

PATting Linux

Down to the TLP: How PCI express devices talk (Part I)

edited Nov 15 '18 at 2:03

asked Nov 15 '18 at 1:21

Taekyung Heo


Is it possible that something else (like PAT) is overriding the MTRR setting and making it actually be UC instead of WB? And BTW, the single-byte transactions might be from memcpy being implemented as rep movsb inside the kernel (because your CPU is new enough for ERMSB, which makes rep movsb fairly good).

– Peter Cordes
Nov 15 '18 at 1:48

@PeterCordes Thanks, Peter. I could find a conflict in PAT. The region is set as uncached-minus in PAT. I will try to solve it.

– Taekyung Heo
Nov 15 '18 at 5:22


linux caching x86 fpga pci-e


1 Answer

In short, it seems that mapping an MMIO region as write-back does not work, by design.

Please post an answer if anyone finds that it is possible.

I found John McCalpin's articles and answers on this topic. First, mapping an MMIO region as write-back is not possible. Second, a workaround is possible on some processors.

Mapping an MMIO region as write-back is not possible

Quote from this link

FYI: The WB type will not work with memory-mapped IO. You can
program the bits to set up the mapping as WB, but the system will
crash as soon as it gets a transaction that it does not know how to
handle. It is theoretically possible to use WP or WT to get cached
reads from MMIO, but coherence has to be handled in software.

Quote from this link

Only when I set both PAT and MTRR to WB does the kernel crash

A workaround is possible on some processors

Notes on Cached Access to Memory-Mapped IO Regions, John McCalpin

There is one set of mappings that can be made to work on at least some
x86-64 processors, and it is based on mapping the MMIO space twice.
Map the MMIO range with a set of attributes that allow write-combining
stores (but only uncached reads). Map the MMIO range a second time
with a set of attributes that allow cache-line reads (but only
uncached, non-write-combined stores).

edited Nov 15 '18 at 11:24

answered Nov 15 '18 at 6:07

Taekyung Heo


I wonder if anything changes with Skylake-SP where 64-byte stores are possible to UC MMIO regions with a single AVX512 instruction. But probably cache eviction is still different from a movntps [rdi], zmm0, or movaps on a UC memory region.

– Peter Cordes
Nov 15 '18 at 11:06
