Calculating the Byte Count of a String
Recently, I ran into a tricky problem at work. I was responsible for developing a file upload feature that stores multiple versions of a file based on its filename. When a file is uploaded, the total number of bytes in the filename is counted and embedded in the name as an identifier.
At first I used string.length as the filename byte count, which produced incorrect identifiers, because length counts UTF-16 code units rather than bytes. After checking the wiki documentation, I came away with a new understanding of character encoding.
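To make the mismatch concrete, here is a minimal sketch comparing string.length with an encoded byte count. It assumes the filename is encoded as UTF-8 (the encoding is my assumption, not something the upload service is known to require), and the filenames and the utf8ByteCount helper are hypothetical names made up for illustration. It uses the standard TextEncoder API rather than any hand-written counting logic.

```javascript
// Hypothetical helper: number of bytes a string occupies when encoded as UTF-8.
// TextEncoder always encodes to UTF-8.
const utf8ByteCount = (str) => new TextEncoder().encode(str).length;

// Hypothetical filenames, chosen only to show the three interesting cases.
const asciiName = "report.txt";   // ASCII only: 1 byte per character
const chineseName = "报告.txt";    // CJK characters: 3 bytes each in UTF-8
const emojiName = "😀.txt";        // U+1F600: 2 UTF-16 code units, 4 UTF-8 bytes

console.log(asciiName.length, utf8ByteCount(asciiName));     // 10 10
console.log(chineseName.length, utf8ByteCount(chineseName)); // 6 10
console.log(emojiName.length, utf8ByteCount(emojiName));     // 6 8
```

Only the pure-ASCII name gives the same number both ways, which is why the bug is easy to miss when testing with English filenames.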
Unicode Code Points
The charCodeAt() method of String values returns an integer between 0 and 65535 representing the UTF-16 code unit at the given index.
The codePointAt() method of String values returns a non-negative integer that is the Unicode code point value of the character starting at the given index. Note that the index is still based on UTF-16 code units, not Unicode code points.
Unicode code points range from 0 to 1114111 (0x10FFFF). charCodeAt() always returns a value that is less than 65536, because the higher code points are represented by a pair of 16-bit surrogate pseudo-characters. Therefore, in order to get a full character with value greater than 65535, it is necessary to retrieve not only charCodeAt(i), but also charCodeAt(i + 1), or to use codePointAt(i) instead.
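The difference between the two methods is easiest to see on a character outside the 16-bit range. The sketch below uses a sample string of my own choosing ("a" followed by the emoji U+1F600) and only built-in String methods.

```javascript
// Sample string (illustrative): U+0061 followed by U+1F600.
const sample = "a😀";

// The emoji takes two UTF-16 code units, so length is 3, not 2.
console.log(sample.length); // 3

// charCodeAt sees the two surrogate halves separately.
console.log(sample.charCodeAt(1).toString(16)); // "d83d" (high surrogate)
console.log(sample.charCodeAt(2).toString(16)); // "de00" (low surrogate)

// codePointAt combines the pair when the index points at the high surrogate.
console.log(sample.codePointAt(1).toString(16)); // "1f600"

// If the index points at the low surrogate, only that half is returned.
console.log(sample.codePointAt(2).toString(16)); // "de00"

// Iterating a string with Array.from walks by code point,
// so each full character is visited exactly once.
const codePoints = Array.from(sample, (ch) => ch.codePointAt(0));
console.log(codePoints); // [ 97, 128512 ]
```

Note that indexing with codePointAt(i) in a plain for loop still requires skipping the low surrogate after an astral character. Iterating by code point avoids that bookkeeping.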
So we use the codePointAt(i) method to get the Unicode code point value at the given index.
const charCode = str