
Commit c04fc586 authored by Gary Hade, committed by Linus Torvalds

mm: show node to memory section relationship with symlinks in sysfs

Add /sys/devices/system/node/nodeX/memoryY symlinks for all
the memory sections located on nodeX.  For example:
/sys/devices/system/node/node1/memory135 -> ../../memory/memory135
indicates that memory section 135 resides on node1.
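
As a rough illustration of how a user-space tool might consume these links, the sketch below lists the section numbers found under one node directory. The function name `node_sections` and the `SYSFS_ROOT` override are assumptions made for this example (the override only exists so the function can be pointed at a test tree); neither is part of the commit.

```sh
# List the memory section numbers linked under one node directory.
# SYSFS_ROOT is a hypothetical override for testing; on a real system
# the links live under /sys/devices/system.
node_sections() {
    node=$1
    root=${SYSFS_ROOT:-/sys/devices/system}
    for link in "$root/node/node$node"/memory[0-9]*; do
        # Skip the unexpanded glob pattern when the node has no section links.
        [ -e "$link" ] || [ -L "$link" ] || continue
        name=${link##*/}          # e.g. memory135
        echo "${name#memory}"     # e.g. 135
    done | sort -n
}

# Example: print the sections that reside on node1.
# node_sections 1
```

The `memory[0-9]*` pattern is used so that other entries in the node directory, such as `meminfo`, are not mistaken for section links.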

This change also revises the documentation to cover the new symlinks and
updates Documentation/ABI/testing/sysfs-devices-memory to describe the
memory hotremove files 'phys_device', 'phys_index', and 'state',
which were previously not described there.

In addition to it always being a good policy to provide users with
the maximum possible amount of physical location information for
resources that can be hot-added and/or hot-removed, the following
are some (but likely not all) of the user benefits provided by
this change.
Immediate:
  - Provides information needed to determine the specific node
    on which a defective DIMM is located.  This will reduce system
    downtime when the node or defective DIMM is swapped out.
  - Prevents unintended onlining of a memory section that was
    previously offlined due to a defective DIMM.  This could happen
    during node hot-add when the user or node hot-add assist script
    onlines _all_ offlined sections due to user or script inability
    to identify the specific memory sections located on the hot-added
    node.  The consequences of reintroducing the defective memory
    could be ugly.
  - Provides information needed to vary the amount and distribution
    of memory on specific nodes for testing or debugging purposes.
Future:
  - Will provide information needed to identify the memory
    sections that need to be offlined prior to physical removal
    of a specific node.
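
The hot-add scenario in the list above could be handled by a script along the following lines: it onlines only the offline sections that belong to one node and leaves alone any section the caller names as defective. This is a minimal sketch, not the assist script the commit message refers to; `online_node_sections`, its skip-list argument, and `SYSFS_ROOT` are hypothetical names introduced for illustration. On a real system the writes require root.

```sh
# Online the offline memory sections of one node, skipping sections the
# caller lists as known-defective.
#   $1  node number
#   $2  optional space-separated section numbers to leave offline
# SYSFS_ROOT is a hypothetical override for testing.
online_node_sections() {
    node=$1
    skip=" ${2:-} "
    root=${SYSFS_ROOT:-/sys/devices/system}
    for link in "$root/node/node$node"/memory[0-9]*; do
        [ -f "$link/state" ] || continue
        name=${link##*/}
        sect=${name#memory}
        case $skip in
            *" $sect "*) continue ;;    # known-bad section: do not touch
        esac
        if [ "$(cat "$link/state")" = "offline" ]; then
            echo online > "$link/state" && echo "onlined memory$sect"
        fi
    done
}

# Example: online everything on node1 except section 136.
# online_node_sections 1 "136"
```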

Symlink creation during boot was tested on 2-node x86_64, 2-node
ppc64, and 2-node ia64 systems.  Symlink creation during physical
memory hot-add was tested on a 2-node x86_64 system.

Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent ee53a891
+50 −1
@@ -6,7 +6,6 @@ Description:
		internal state of the kernel memory blocks. Files could be
		added or removed dynamically to represent hot-add/remove
		operations.

Users:		hotplug memory add/remove tools
		https://w3.opensource.ibm.com/projects/powerpc-utils/

@@ -19,6 +18,56 @@ Description:
		This is useful for a user-level agent to
		identify removable sections of the memory before attempting
		a potentially expensive memory hot-remove operation.
Users:		hotplug memory remove tools
		https://w3.opensource.ibm.com/projects/powerpc-utils/

What:		/sys/devices/system/memory/memoryX/phys_device
Date:		September 2008
Contact:	Badari Pulavarty <pbadari@us.ibm.com>
Description:
		The file /sys/devices/system/memory/memoryX/phys_device
		is read-only and is designed to show the name of physical
		memory device.  Implementation is currently incomplete.

What:		/sys/devices/system/memory/memoryX/phys_index
Date:		September 2008
Contact:	Badari Pulavarty <pbadari@us.ibm.com>
Description:
		The file /sys/devices/system/memory/memoryX/phys_index
		is read-only and contains the section ID in hexadecimal
		which is equivalent to decimal X contained in the
		memory section directory name.

What:		/sys/devices/system/memory/memoryX/state
Date:		September 2008
Contact:	Badari Pulavarty <pbadari@us.ibm.com>
Description:
		The file /sys/devices/system/memory/memoryX/state
		is read-write.  When read, its contents show the
		online/offline state of the memory section.  When written,
		root can toggle the online/offline state of a removable
		memory section (see removable file description above)
		using the following commands.
		# echo online > /sys/devices/system/memory/memoryX/state
		# echo offline > /sys/devices/system/memory/memoryX/state

		For example, if /sys/devices/system/memory/memory22/removable
		contains a value of 1 and
		/sys/devices/system/memory/memory22/state contains the
		string "online", the following command can be executed
		by root to offline that section.
		# echo offline > /sys/devices/system/memory/memory22/state
Users:		hotplug memory remove tools
		https://w3.opensource.ibm.com/projects/powerpc-utils/

What:		/sys/devices/system/node/nodeX/memoryY
Date:		September 2008
Contact:	Gary Hade <garyhade@us.ibm.com>
Description:
		When CONFIG_NUMA is enabled
		/sys/devices/system/node/nodeX/memoryY is a symbolic link that
		points to the corresponding /sys/devices/system/memory/memoryY
		memory section directory.  For example, the following symbolic
		link is created for memory section 9 on node0.
		/sys/devices/system/node/node0/memory9 -> ../../memory/memory9
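
The phys_index, removable, and state descriptions above can be exercised with a few small helpers. The sketch below assumes the file contents are exactly as described in this ABI text (phys_index as a hexadecimal string equal to X, removable as 0 or 1, state as "online" or "offline"); the function names are hypothetical and chosen only for this example.

```sh
# Convert the hexadecimal text of a section's phys_index file to decimal.
phys_index_decimal() {
    hex=$(cat "$1/phys_index")
    echo $((0x$hex))
}

# Succeed when the X in the directory name memoryX agrees with phys_index.
check_phys_index() {
    dir=${1%/}
    name=${dir##*/}
    [ "$(phys_index_decimal "$dir")" -eq "${name#memory}" ]
}

# Offline a section only if it reports removable=1 and is currently online.
# Returns non-zero without writing anything otherwise.
offline_if_removable() {
    dir=${1%/}
    [ "$(cat "$dir/removable")" = "1" ] || return 1
    [ "$(cat "$dir/state")" = "online" ] || return 1
    echo offline > "$dir/state"
}

# Example (as root), mirroring the memory22 case described above:
# offline_if_removable /sys/devices/system/memory/memory22
```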
+13 −3
@@ -124,7 +124,7 @@ config options.
    This option can be kernel module too.

--------------------------------
-3 sysfs files for memory hotplug
+4 sysfs files for memory hotplug
--------------------------------
All sections have their device information under /sys/devices/system/memory as

@@ -138,11 +138,12 @@ For example, assume 1GiB section size. A device for a memory starting at
(0x100000000 / 1Gib = 4)
This device covers address range [0x100000000 ... 0x140000000)

-Under each section, you can see 3 files.
+Under each section, you can see 4 files.

/sys/devices/system/memory/memoryXXX/phys_index
/sys/devices/system/memory/memoryXXX/phys_device
/sys/devices/system/memory/memoryXXX/state
+/sys/devices/system/memory/memoryXXX/removable

'phys_index' : read-only and contains section id, same as XXX.
'state'      : read-write
@@ -150,10 +151,20 @@ Under each section, you can see 3 files.
               at write: user can specify "online", "offline" command
'phys_device': read-only: designed to show the name of physical memory device.
               This is not well implemented now.
+'removable'  : read-only: contains an integer value indicating
+               whether the memory section is removable or not
+               removable.  A value of 1 indicates that the memory
+               section is removable and a value of 0 indicates that
+               it is not removable.

NOTE:
  These directories/files appear after physical memory hotplug phase.

+If CONFIG_NUMA is enabled the
+/sys/devices/system/memory/memoryXXX memory section
+directories can also be accessed via symbolic links located in
+the /sys/devices/system/node/node* directories.  For example:
+/sys/devices/system/node/node0/memory9 -> ../../memory/memory9

--------------------------------
4. Physical memory hot-add phase
@@ -365,7 +376,6 @@ node if necessary.
  - allowing memory hot-add to ZONE_MOVABLE. maybe we need some switch like
    sysctl or new control file.
  - showing memory section and physical device relationship.
-  - showing memory section and node relationship (maybe good for NUMA)
  - showing memory section is under ZONE_MOVABLE or not
  - test and make it better memory offlining.
  - support HugeTLB page migration and offlining.
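
The section-number arithmetic quoted in the hunk above (a 1GiB section size, with 0x100000000 falling in section 4) can be checked with a short worked example. The helper names below are hypothetical, and the 1GiB size is the assumption used in the documentation's own example; the actual section size is architecture dependent.

```sh
# Section number that contains a physical address, for a given section size.
section_index() {
    echo $(( $1 / $2 ))
}

# Half-open physical address range [start, end) covered by a section.
section_range() {
    printf '0x%x 0x%x\n' $(( $1 * $2 )) $(( ($1 + 1) * $2 ))
}

GIB=$((1 << 30))
section_index 0x100000000 "$GIB"    # prints 4
section_range 4 "$GIB"              # prints 0x100000000 0x140000000
```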
+1 −1
@@ -692,7 +692,7 @@ int arch_add_memory(int nid, u64 start, u64 size)
	pgdat = NODE_DATA(nid);

	zone = pgdat->node_zones + ZONE_NORMAL;
-	ret = __add_pages(zone, start_pfn, nr_pages);
+	ret = __add_pages(nid, zone, start_pfn, nr_pages);

	if (ret)
		printk("%s: Problem encountered in __add_pages() as ret=%d\n",
+1 −1
@@ -132,7 +132,7 @@ int arch_add_memory(int nid, u64 start, u64 size)
	/* this should work for most non-highmem platforms */
	zone = pgdata->node_zones;

-	return __add_pages(zone, start_pfn, nr_pages);
+	return __add_pages(nid, zone, start_pfn, nr_pages);
}
#endif /* CONFIG_MEMORY_HOTPLUG */

+1 −1
@@ -183,7 +183,7 @@ int arch_add_memory(int nid, u64 start, u64 size)
	rc = vmem_add_mapping(start, size);
	if (rc)
		return rc;
-	rc = __add_pages(zone, PFN_DOWN(start), PFN_DOWN(size));
+	rc = __add_pages(nid, zone, PFN_DOWN(start), PFN_DOWN(size));
	if (rc)
		vmem_remove_mapping(start, size);
	return rc;