Commit 2022b415, authored by Masahiro Yamada, committed by Jaegeuk Kim

unicode: refactor the rule for regenerating utf8data.h



scripts/mkutf8data is used only when regenerating utf8data.h,
which never happens in the normal kernel build. However, it is
built unconditionally whenever CONFIG_UNICODE is enabled.

Moreover, there is no good reason for it to reside in the scripts/
directory since it is only used in fs/unicode/.

Hence, move it from scripts/ to fs/unicode/.

In some cases, the normal build skips generating certain build
artifacts and uses a checked-in copy instead. The conventional way to
do so is to surround the generation rule with ifdef REGENERATE_*.

For example,

 - 7373f4f8 ("kbuild: add implicit rules for parser generation")
 - 6aaf49b4 ("crypto: arm,arm64 - Fix random regeneration of S_shipped")

I rewrote the rule in a more kbuild'ish style.

In the normal build, utf8data.h is just shipped from the checked-in file.

$ make
  [ snip ]
  SHIPPED fs/unicode/utf8data.h
  CC      fs/unicode/utf8-norm.o
  CC      fs/unicode/utf8-core.o
  CC      fs/unicode/utf8-selftest.o
  AR      fs/unicode/built-in.a

If you want to generate utf8data.h based on UCD, put *.txt files into
fs/unicode/, then pass REGENERATE_UTF8DATA=1 from the command line.
The mkutf8data tool will be automatically compiled to generate the
utf8data.h from the *.txt files.

$ make REGENERATE_UTF8DATA=1
  [ snip ]
  HOSTCC  fs/unicode/mkutf8data
  GEN     fs/unicode/utf8data.h
  CC      fs/unicode/utf8-norm.o
  CC      fs/unicode/utf8-core.o
  CC      fs/unicode/utf8-selftest.o
  AR      fs/unicode/built-in.a
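
Before running the regeneration build, it can help to confirm that every
UCD input file the rule expects is actually present. The following is a
minimal sketch, not part of this commit: the helper name
missing_ucd_files is hypothetical, and the file list is copied from the
cmd_utf8data rule in the Makefile hunk of this commit.

```shell
#!/bin/sh
# UCD input files referenced by the cmd_utf8data rule in this commit.
UCD_FILES="DerivedAge.txt DerivedCombiningClass.txt DerivedCoreProperties.txt
UnicodeData.txt CaseFolding.txt NormalizationCorrections.txt NormalizationTest.txt"

# missing_ucd_files DIR  (hypothetical helper, not a kernel script)
# Prints the name of each required file that is absent from DIR, one per
# line. Returns 0 if all files are present, 1 otherwise.
missing_ucd_files() {
    dir=$1
    status=0
    for f in $UCD_FILES; do
        if [ ! -f "$dir/$f" ]; then
            echo "$f"
            status=1
        fi
    done
    return $status
}
```

One possible use: missing_ucd_files fs/unicode && make REGENERATE_UTF8DATA=1 fs/unicode/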

I renamed the checked-in utf8data.h to utf8data.h_shipped so that this
will work for the out-of-tree build.

You can update it based on the latest UCD like this:

$ make REGENERATE_UTF8DATA=1 fs/unicode/
$ cp fs/unicode/utf8data.h fs/unicode/utf8data.h_shipped
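
The README.utf8data text touched by this commit recommends sanity
checking the generated header before it replaces the shipped copy, and
gives 4,109 lines as the expected length for the 12.1.0 UCD. A minimal
sketch of such a check follows; the helper name check_line_count is
hypothetical, and the expected count must be adjusted for other UCD
versions.

```shell
#!/bin/sh
# check_line_count FILE EXPECTED  (hypothetical helper, not a kernel script)
# Succeeds only if FILE has exactly EXPECTED lines; otherwise reports the
# mismatch on stderr and returns 1.
check_line_count() {
    file=$1
    expected=$2
    # Read from stdin so wc prints only the count; strip any padding.
    actual=$(wc -l < "$file" | tr -d ' ')
    if [ "$actual" -ne "$expected" ]; then
        echo "$file: $actual lines, expected $expected" >&2
        return 1
    fi
}
```

One possible use, with the 12.1.0 figure from the README:
check_line_count fs/unicode/utf8data.h 4109 && cp fs/unicode/utf8data.h fs/unicode/utf8data.h_shipped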

Also, I added entries to .gitignore and dontdiff.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
parent ad2d18a3
+2 −0
@@ -177,6 +177,7 @@ mkprep
 mkregtable
 mktables
 mktree
+mkutf8data
 modpost
 modules.builtin
 modules.order
@@ -255,6 +256,7 @@ vsyscall_32.lds
 wanxlfw.inc
 uImage
 unifdef
+utf8data.h
 wakeup.bin
 wakeup.elf
 wakeup.lds

fs/unicode/.gitignore

0 → 100644
+2 −0
+mkutf8data
+utf8data.h
+30 −11
@@ -5,15 +5,34 @@ obj-$(CONFIG_UNICODE_NORMALIZATION_SELFTEST) += utf8-selftest.o

 unicode-y := utf8-norm.o utf8-core.o

-# This rule is not invoked during the kernel compilation.  It is used to
-# regenerate the utf8data.h header file.
-utf8data.h.new: *.txt $(objdir)/scripts/mkutf8data
-	$(objdir)/scripts/mkutf8data \
-		-a DerivedAge.txt \
-		-c DerivedCombiningClass.txt \
-		-p DerivedCoreProperties.txt \
-		-d UnicodeData.txt \
-		-f CaseFolding.txt \
-		-n NormalizationCorrections.txt \
-		-t NormalizationTest.txt \
+$(obj)/utf8-norm.o: $(obj)/utf8data.h
+
+# In the normal build, the checked-in utf8data.h is just shipped.
+#
+# To generate utf8data.h from UCD, put *.txt files in this directory
+# and pass REGENERATE_UTF8DATA=1 from the command line.
+ifdef REGENERATE_UTF8DATA
+
+quiet_cmd_utf8data = GEN     $@
+      cmd_utf8data = $< \
+		-a $(srctree)/$(src)/DerivedAge.txt \
+		-c $(srctree)/$(src)/DerivedCombiningClass.txt \
+		-p $(srctree)/$(src)/DerivedCoreProperties.txt \
+		-d $(srctree)/$(src)/UnicodeData.txt \
+		-f $(srctree)/$(src)/CaseFolding.txt \
+		-n $(srctree)/$(src)/NormalizationCorrections.txt \
+		-t $(srctree)/$(src)/NormalizationTest.txt \
 		-o $@
+
+$(obj)/utf8data.h: $(obj)/mkutf8data $(filter %.txt, $(cmd_utf8data)) FORCE
+	$(call if_changed,utf8data)
+
+else
+
+$(obj)/utf8data.h: $(src)/utf8data.h_shipped FORCE
+	$(call if_changed,shipped)
+
+endif
+
+targets += utf8data.h
+hostprogs-y += mkutf8data
+4 −5
@@ -55,15 +55,14 @@ released version of the UCD can be found here:

   http://www.unicode.org/Public/UCD/latest/

-To build the utf8data.h file, from a kernel tree that has been built,
-cd to this directory (fs/unicode) and run this command:
+Then, build under fs/unicode/ with REGENERATE_UTF8DATA=1:

-	make C=../.. objdir=../.. utf8data.h.new
+	make REGENERATE_UTF8DATA=1 fs/unicode/

-After sanity checking the newly generated utf8data.h.new file (the
+After sanity checking the newly generated utf8data.h file (the
 version generated from the 12.1.0 UCD should be 4,109 lines long, and
 have a total size of 324k) and/or comparing it with the older version
-of utf8data.h, rename it to utf8data.h.
+of utf8data.h_shipped, rename it to utf8data.h_shipped.

 If you are a kernel developer updating to a newer version of the
 Unicode Character Database, please update this README.utf8data file
+0 −0

File moved.
