BACKPORT: support for the mount namespace (Magisk Hide!)
Used backport for 3.4, all conflicts resolved. SafetyNet successfully passed on m0. Commits used: vfs: Add setns support for the mount namespace setns support for the mount namespace is a little tricky as an arbitrary decision must be made about what to set fs->root and fs->pwd to, as there is no expectation of a relationship between the two mount namespaces. Therefore I arbitrarily find the root mount point, and follow every mount on top of it to find the top of the mount stack. Then I set fs->root and fs->pwd to that location. The topmost root of the mount stack seems like a reasonable place to be. Bind mount support for the mount namespace inodes has the possibility of creating circular dependencies between mount namespaces. Circular dependencies can result in loops that prevent mount namespaces from every being freed. I avoid creating those circular dependencies by adding a sequence number to the mount namespace and require all bind mounts be of a younger mount namespace into an older mount namespace. Add a helper function proc_ns_inode so it is possible to detect when we are attempting to bind mound a namespace inode. Acked-by:Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> (cherry picked from commit 8823c07) ---- vfs: Only support slave subtrees across different user namespaces Sharing mount subtress with mount namespaces created by unprivileged users allows unprivileged mounts created by unprivileged users to propagate to mount namespaces controlled by privileged users. Prevent nasty consequences by changing shared subtrees to slave subtress when an unprivileged users creates a new mount namespace. Acked-by:
Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com> ---- vfs: Allow unprivileged manipulation of the mount namespace. - Add a filesystem flag to mark filesystems that are safe to mount as an unprivileged user. - Add a filesystem flag to mark filesystems that don't need MNT_NODEV when mounted by an unprivileged user. - Relax the permission checks to allow unprivileged users that have CAP_SYS_ADMIN permissions in the user namespace referred to by the current mount namespace to be allowed to mount, unmount, and move filesystems. Acked-by:
"Serge E. Hallyn" <serge@hallyn.com> Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com> ---- vfs: Add a user namespace reference from struct mnt_namespace This will allow for support for unprivileged mounts in a new user namespace. Acked-by:
"Serge E. Hallyn" <serge@hallyn.com> Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com> ---- proc: Generalize proc inode allocation Generalize the proc inode allocation so that it can be used without having to having to create a proc_dir_entry. This will allow namespace file descriptors to remain light weight entitities but still have the same inode number when the backing namespace is the same. Acked-by:
Serge E. Hallyn <serge.hallyn@ubuntu.com> Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> ---- proc: Fix the namespace inode permission checks. Change the proc namespace files into symlinks so that we won't cache the dentries for the namespace files which can bypass the ptrace_may_access checks. To support the symlinks create an additional namespace inode with it's own set of operations distinct from the proc pid inode and dentry methods as those no longer make sense. Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> ---- proc: Usable inode numbers for the namespace file descriptors. Assign a unique proc inode to each namespace, and use that inode number to ensure we only allocate at most one proc inode for every namespace in proc. A single proc inode per namespace allows userspace to test to see if two processes are in the same namespace. This has been a long requested feature and only blocked because a naive implementation would put the id in a global space and would ultimately require having a namespace for the names of namespaces, making migration and certain virtualization tricks impossible. We still don't have per superblock inode numbers for proc, which appears necessary for application unaware checkpoint/restart and migrations (if the application is using namespace file descriptors) but that is now allowd by the design if it becomes important. I have preallocated the ipc and uts initial proc inode numbers so their structures can be statically initialized. Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Samuel Pascua <pascua.samuel.14@gmail.com>
Loading
Please register or sign in to comment