UTF-8 Encoding: UTF-8 is a compromise character encoding that is as compact as ASCII when the file is plain English text (one byte per character) but can also represent any Unicode character, using up to four bytes each. http://www.fileformat.info/info/unicode/utf8.htm
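A minimal Python sketch of that trade-off: the sample characters and the helper name `utf8_length` are chosen here for illustration and come from none of the linked pages. It counts how many bytes UTF-8 spends on each character and checks that pure ASCII text encodes to the same bytes either way.

```python
# Hypothetical sample characters, picked to cover 1- to 4-byte encodings.
SAMPLES = ["A", "é", "ก", "日", "😀"]


def utf8_length(text: str) -> int:
    """Return the number of bytes UTF-8 uses to encode the given text."""
    return len(text.encode("utf-8"))


for ch in SAMPLES:
    # ascii() escapes non-ASCII characters so the output is safe on any console.
    print(f"U+{ord(ch):04X} {ascii(ch)}: {utf8_length(ch)} byte(s)")

english = "plain English text"
# For pure ASCII text, the UTF-8 bytes are identical to the ASCII bytes.
same_as_ascii = english.encode("utf-8") == english.encode("ascii")
print("identical to ASCII:", same_as_ascii)
```

The byte counts grow with the code point: one byte for ASCII, two for accented Latin letters, three for Thai and most CJK characters, four for characters outside the Basic Multilingual Plane such as emoji.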

https://dzone.com/articles/an-insight-into-unicode-utf-8-and-their-usage | Understanding the difference between Unicode and UTF-8 for international websites. Every software developer needs to understand that Unicode is a character set, while UTF-8 is one way of encoding it: a compromise that can be as compact as ASCII. Unicode was a brave effort to create a single character set that includes every reasonable writing system on the planet. Make-believe scripts such as Klingon have been proposed too, although Klingon was not accepted into the standard.

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) - Joel on Software

Characters, Symbols and the Unicode Miracle - Computerphile // How did we get to UTF-8 as the web standard? Tom Scott capsulizes the "Unicode Miracle" in a few minutes.

Convert your text files from any encoding to any other. The screenshot shows Japanese, English and Thai text encoded as UTF-8. Converting to the legacy Thai code page would lose the Japanese characters, because that code page has no slots for them.
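The loss described here can be reproduced with a short Python sketch. It assumes the legacy Thai code page is Windows-874 (Python codec name `cp874`); the helper `convert_encoding` and the sample string are hypothetical and are not taken from the tool being described.

```python
def convert_encoding(data: bytes, source: str, target: str,
                     errors: str = "strict") -> bytes:
    """Decode bytes from the source encoding, then re-encode into the target."""
    return data.decode(source).encode(target, errors=errors)


# Hypothetical sample mixing English, Thai and Japanese text.
text = "Hello สวัสดี こんにちは"
utf8_bytes = text.encode("utf-8")

# A strict conversion refuses to drop characters the target cannot hold.
try:
    convert_encoding(utf8_bytes, "utf-8", "cp874")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# A lossy conversion substitutes "?" for every unrepresentable character.
lossy = convert_encoding(utf8_bytes, "utf-8", "cp874", errors="replace")
round_trip = lossy.decode("cp874")

print("strict conversion failed:", strict_failed)
print("after lossy round trip:", ascii(round_trip))
```

The English and Thai text survives the round trip, while each Japanese character comes back as a question mark. Converting in the other direction, from the Thai code page to UTF-8, loses nothing.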

