RE: About add an A64FX cache control function into resctrl

From: tan.shaopeng@fujitsu.com <hidden>
Date: 2021-07-16 00:57:07
Also in: lkml

Hi Reinette,
> > Sorry, I have not explained A64FX's sector cache function well yet.
> > I think I need explain this function from different perspective.
>
> You have explained the A64FX's sector cache function well. I have also
> read both specs to understand it better. It appears to me that you are
> not considering the resctrl architecture as part of your solution but
> instead just forcing your architecture onto the resctrl filesystem.
> For example, in resctrl the resource groups are not just a directory
> structure but has significance in what is being represented within the
> directory (a class of service). The files within a resource group's
> directory build on that. From your side I have not seen any effort in
> aligning the sector cache function with the resctrl architecture but instead
> you are just changing resctrl interface to match the A64FX architecture.
>
> Could you please take a moment to understand what resctrl is and how
> it could be mapped to A64FX in a coherent way?

Previously, my idea was based on making instructions within one task use
different sectors. After studying resctrl, I think that, to make use of the
resctrl architecture on A64FX, it is better to assign one sector to one task.
Thank you for your idea that "sectors" could be considered the same as the
resctrl "classes of service".

Based on your idea, I am considering the implementation details.
In this email, I will outline the new proposal, and then I would like to
confirm a few technical points about resctrl.
Could you give me some comments and advice?

The outline of my proposal is as follows.
- Add a sector function equivalent to Intel's CAT function to resctrl
  (divide the shared L2 cache into multiple partitions for use by
  multiple cores).
- Allocate one sector to one resource group (one CLOSID). Since one
  core can only be assigned to one resource group, on A64FX each core
  uses only one sector at a time.
- Disable A64FX's HPC tag address override function. We only set each
  core's default sector value according to the CLOSID
  (default sector ID = CLOSID).
- No L1 cache control, since the L1 cache is not shared among cores. It
  is not necessary to add an L1 cache interface to the schemata file.
- No need to change the schemata interface. Resctrl's L2 cache interface
  (L2: <cache_id0> = <cbm>; <cache_id1> = <cbm>; ...)
  will be used as it is. However, on A64FX, <cbm> does not indicate
  the position of the cache partition; it only indicates the number of
  cache ways (size).
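
To make the last two points concrete, here is a minimal user-space sketch
of how a <cbm> could be reduced to a way count and how a CLOSID could be
mapped to a default sector ID. This is only an illustration under my own
assumptions: the helper names cbm_to_ways() and closid_to_default_sector()
and the num_sectors parameter are hypothetical and do not exist in resctrl.

```c
#include <stdint.h>

/*
 * Hypothetical helper: in this proposal the CBM carries only a size,
 * so count the set bits and ignore where they are located.
 */
static unsigned int cbm_to_ways(uint32_t cbm)
{
	unsigned int ways = 0;

	while (cbm) {
		cbm &= cbm - 1;	/* clear the lowest set bit */
		ways++;
	}
	return ways;
}

/*
 * Hypothetical helper: default sector ID = CLOSID.
 * Returns -1 when the CLOSID has no matching sector.
 */
static int closid_to_default_sector(unsigned int closid,
				    unsigned int num_sectors)
{
	return closid < num_sectors ? (int)closid : -1;
}
```

With this sketch, the masks 0x0f and 0xf0 would both request 4 ways,
because only the number of set bits matters and not their position.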

This is the smallest first step toward incorporating the sector cache function
into resctrl. After finishing this, I will consider whether we could add more
sector cache features to resctrl (e.g. selecting different sectors from within
one task).

(some questions are below)
> On 5/17/2021 1:31 AM, tan.shaopeng@fujitsu.com wrote:
> > --------
> > A64FX NUMA-PE-Cache Architecture:
> > NUMA0:
> >    PE0:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >    PE1:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >    ...
> >    PE11:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >
> >    L2sector0,1/L2sector2,3
> > NUMA1:
> >    PE0:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >    ...
> >    PE11:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >
> >    L2sector0,1/L2sector2,3
> > NUMA2:
> >    ...
> > NUMA3:
> >    ...
> > --------
> > In A64FX processor, one L1 sector cache capacity setting register is
> > only for one PE and not shared among PEs. L2 sector cache maximum
> > capacity setting registers are shared among PEs in same NUMA, and it
> > is to be noted that changing these registers in one PE influences other PE.
>
> Understood. cache affinity is familiar to resctrl. When a CPU becomes
> online it is discovered which caches/resources it has affinity to.
> Resources then have CPU mask associated with them to indicate on which
> CPU a register could be changed to configure the resource/cache. See
> domain_add_cpu() and struct rdt_domain.

Is the following understanding correct?
struct rdt_domain represents a group of online CPUs that share the same cache
instance. When a CPU is online (resctrl initialization), the domain_add_cpu()
function adds the online CPU to the corresponding rdt_domain (in the
rdt_resource domains list). For example, if there are 4 L2 cache instances,
then there will be 4 rdt_domain entries in the list, and each CPU is assigned
to the corresponding rdt_domain.

The configured values for cache/memory are stored in the *ctrl_val array
(indexed by CLOSID) of struct rdt_domain. For example, for the CAT function,
the CBM value of CLOSID=x is stored in ctrl_val[x].
When we create a resource group and write cache settings into the schemata
file, the update_domains() function stores the CBM value in ctrl_val[CLOSID]
of the rdt_domain (where CLOSID is the resource group's ID) and writes the
CBM value to the CBM register (MSR_IA32_Lx_CBM_BASE).
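
To show what I mean, my understanding can be modelled roughly as below.
This is a simplified user-space sketch and not the kernel code: struct
model_domain, model_update_domain() and MODEL_NUM_CLOSIDS are hypothetical
names, and the hw_reg array only stands in for the per-CLOSID CBM
registers.

```c
#include <stdint.h>

#define MODEL_NUM_CLOSIDS 4	/* assumption: illustrative size only */

/* Simplified stand-in for struct rdt_domain (one per cache instance). */
struct model_domain {
	int id;					/* cache instance ID */
	uint32_t ctrl_val[MODEL_NUM_CLOSIDS];	/* one CBM per CLOSID */
	uint32_t hw_reg[MODEL_NUM_CLOSIDS];	/* stand-in for CBM registers */
};

/*
 * Models the flow described above: store the new CBM in
 * ctrl_val[closid], then write it to the "register" for that CLOSID.
 * Returns 0 on success, -1 if the CLOSID is out of range.
 */
static int model_update_domain(struct model_domain *d,
			       unsigned int closid, uint32_t cbm)
{
	if (closid >= MODEL_NUM_CLOSIDS)
		return -1;

	d->ctrl_val[closid] = cbm;
	d->hw_reg[closid] = cbm;	/* in the kernel this would be an MSR write */
	return 0;
}
```

In this model, writing a CBM for CLOSID 2 in one domain leaves the values
of the other CLOSIDs and of the other domains unchanged.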
> > The number of ways for L2 Sector ID (0,1 or 2,3) can be set through
> > any PEs in same NUMA. The sector ID 0,1 and 2,3 are not available at
> > the same time in same NUMA.
> >
> > I think, in your idea, a resource group will be created for each sector ID.
> > (> "sectors" could be considered the same as the resctrl "classes of
> > service") Then, an example of resource group is created as follows.
> > ・ L1: NUMAX-PEY-L1sector0 (X = 0,1,2,3; Y = 0,1,2 ... 11),
> > ・ L2: NUMAX-L2sector0 (X = 0,1,2,3)
> >
> > In this example, sector with same ID(0) of all PEs is allocated to
> > resource group. The L1D caches are numbered from
> > NUMA0_PE0-L1sector0(0) to NUMA3_PE11-L1sector0(47) and the L2
> > caches numbered from
> > NUMA0-L2sector0(0) to NUMA3-L2sector0(3).
> > (NUMA number X is from 0-3, PE number Y is from 0-11)
> > (1) The number of ways of NUMAX-PEY-L1sector0 can be set
> >     independently for each PEs (0-47). When run a task on this
> >     resource group, we cannot control on which PE the task is
> >     running on and how many cache ways the task is using.
>
> resctrl does not control the affinity on which PE/CPU a task is run.
> resctrl is an interface with which to configure how resources are
> allocated on the system. resctrl could thus provide interface with
> which each sector of each cache instance is assigned a number of cache
> ways.
>
> resctrl also provides an interface to assign a task with a class of
> service (sector id?). Through this the task obtains access to all
> resources that is allocated to the particular class of service (sector
> id?). Depending on which CPU the task is running it may indeed
> experience different performance if the sector id it is running with
> does not have the same allocations on all cache instances. The affinity of the
> task needs to be managed separately using for example taskset.
>
> Please see Documentation/x86/resctrl.rst "Examples for RDT allocation
> usage"

In resctrl_sched_in(), there is the following comment:
  /*
   * If this task has a closid/rmid assigned, use it.
   * Else use the closid/rmid assigned to this cpu.
   */
I thought that when we write a PID to the tasks file, this task (PID) would
only run on the CPUs specified in the cpus file of the same resource group,
so the task_struct's closid and the CPU's closid would be the same.
When is a task's closid different from the CPU's closid?
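
For reference, the selection rule in that comment can be modelled as
below. This is a user-space sketch and not the kernel code: the name
model_effective_closid() is hypothetical, and treating CLOSID 0 as "no
resource group assigned" is my reading of the comment.

```c
#include <stdint.h>

/*
 * Hypothetical model of the rule in the comment: a non-zero task
 * CLOSID takes priority, otherwise the CPU's CLOSID is used.
 */
static uint32_t model_effective_closid(uint32_t task_closid,
				       uint32_t cpu_closid)
{
	if (task_closid)
		return task_closid;
	return cpu_closid;
}
```

If this model is right, the two values could differ, for example, when a
task that was never written to any tasks file runs on a CPU that is
listed in the cpus file of a resource group.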


Best regards,
Tan Shaopeng

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel