Commit 44f06ba8 authored by Jan Kara

udf: Fix leak of UTF-16 surrogates into encoded strings



The OSTA UDF specification does not say whether the CS0 charset, when
encoded with two bytes per character, should be treated as UTF-16 or
UCS-2. The sample code in the standard does not treat UTF-16 surrogates
in any special way, but on systems such as Windows, which work in UTF-16
internally, filenames are effectively treated as UTF-16. In Linux it is
more difficult to handle characters outside the Basic Multilingual Plane
(beyond 0xffff) because the NLS framework works with 2-byte characters
only. For now, just make sure we don't leak UTF-16 surrogates into the
resulting string when loading names from the filesystem.

CC: stable@vger.kernel.org # >= v4.6
Reported-by: Mingye Wang <arthur200126@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
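
To illustrate the surrogate range the commit message refers to, here is a minimal user-space sketch. It is not part of the patch. The two constants are the ones the patch defines; the helper names (`is_surrogate`, `is_high_surrogate`, `is_low_surrogate`, `combine_surrogates`) are hypothetical and exist only for this illustration. The sketch shows that the mask test matches exactly 0xd800..0xdfff, and how a UTF-16 pair would map to a code point beyond 0xffff, which is the case the commit message describes as hard to handle in the NLS framework.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same constants as in the patch. */
#define SURROGATE_MASK 0xfffff800
#define SURROGATE_PAIR 0x0000d800

/* True for any UTF-16 surrogate code unit, i.e. 0xd800..0xdfff. */
static inline bool is_surrogate(uint32_t uni)
{
	return (uni & SURROGATE_MASK) == SURROGATE_PAIR;
}

/* High (leading) surrogates occupy 0xd800..0xdbff. */
static inline bool is_high_surrogate(uint32_t uni)
{
	return (uni & 0xfffffc00) == 0xd800;
}

/* Low (trailing) surrogates occupy 0xdc00..0xdfff. */
static inline bool is_low_surrogate(uint32_t uni)
{
	return (uni & 0xfffffc00) == 0xdc00;
}

/*
 * Combine a valid high/low pair into a code point in 0x10000..0x10ffff.
 * The caller must pass a valid pair; this is only checked by assert().
 */
static inline uint32_t combine_surrogates(uint32_t hi, uint32_t lo)
{
	assert(is_high_surrogate(hi) && is_low_surrogate(lo));
	return 0x10000 + ((hi - 0xd800) << 10) + (lo - 0xdc00);
}
```

A single surrogate on its own does not denote a character, which is why emitting it into the decoded name would produce an invalid string.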
parent 06856938
+6 −0
@@ -28,6 +28,9 @@

#include "udf_sb.h"

#define SURROGATE_MASK 0xfffff800
#define SURROGATE_PAIR 0x0000d800

static int udf_uni2char_utf8(wchar_t uni,
			     unsigned char *out,
			     int boundlen)
@@ -37,6 +40,9 @@ static int udf_uni2char_utf8(wchar_t uni,
	if (boundlen <= 0)
		return -ENAMETOOLONG;

	if ((uni & SURROGATE_MASK) == SURROGATE_PAIR)
		return -EINVAL;

	if (uni < 0x80) {
		out[u_len++] = (unsigned char)uni;
	} else if (uni < 0x800) {
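
The effect of the added check can be tried outside the kernel with a rough user-space analogue of the function touched by the hunks above. This is a sketch under assumptions, not the kernel code: the name `sketch_uni2char_utf8` is hypothetical, `uint32_t` stands in for `wchar_t`, and the two- and three-byte branches are filled in with standard UTF-8 encoding because the diff context does not show them. The rejection of values above 0xffff is also an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same constants as in the patch. */
#define SURROGATE_MASK 0xfffff800
#define SURROGATE_PAIR 0x0000d800

/*
 * Hypothetical analogue of udf_uni2char_utf8(): encode one 16-bit
 * character as UTF-8 into out[], which has room for boundlen bytes.
 * Returns the number of bytes written or a negative errno value.
 */
static inline int sketch_uni2char_utf8(uint32_t uni, unsigned char *out,
				       int boundlen)
{
	int u_len = 0;

	assert(out != NULL);

	if (boundlen <= 0)
		return -ENAMETOOLONG;

	/* The check added by the patch: refuse UTF-16 surrogates. */
	if ((uni & SURROGATE_MASK) == SURROGATE_PAIR)
		return -EINVAL;

	/* Assumption of this sketch: only 16-bit input is supported. */
	if (uni > 0xffff)
		return -EINVAL;

	if (uni < 0x80) {
		out[u_len++] = (unsigned char)uni;
	} else if (uni < 0x800) {
		if (boundlen < 2)
			return -ENAMETOOLONG;
		out[u_len++] = (unsigned char)(0xc0 | (uni >> 6));
		out[u_len++] = (unsigned char)(0x80 | (uni & 0x3f));
	} else {
		if (boundlen < 3)
			return -ENAMETOOLONG;
		out[u_len++] = (unsigned char)(0xe0 | (uni >> 12));
		out[u_len++] = (unsigned char)(0x80 | ((uni >> 6) & 0x3f));
		out[u_len++] = (unsigned char)(0x80 | (uni & 0x3f));
	}

	return u_len;
}
```

Without the surrogate check, a value such as 0xd800 would fall through to the three-byte branch and be written out as a byte sequence that is not valid UTF-8.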