Regular Expressions: Unicode


The u flag is mandatory when working with Unicode strings, in particular when you might need to handle characters in the astral planes: the ones outside the Basic Multilingual Plane, which covers the first 65,536 code points (U+0000 to U+FFFF).

Emojis are one example, but not the only one.

If you don't add that flag, this simple regex that should match one character will not work, because JavaScript represents that emoji internally as 2 UTF-16 code units, a surrogate pair (see Unicode in JavaScript):

/^.$/.test('a') //✅
/^.$/.test('🐶') //❌
/^.$/u.test('🐶') //✅

So, always use the u flag.
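
To see why the flag changes the result, it can help to inspect how the string is stored. The following is a minimal sketch (the variable names are only illustrative) comparing the number of UTF-16 code units with the number of code points, and showing what the dot matches in each mode:

```javascript
const dog = '🐶'; // U+1F436, outside the Basic Multilingual Plane

// length counts UTF-16 code units, so one astral character reports 2
const unitCount = dog.length;

// Spreading a string iterates by code point, giving a single element
const codePointCount = [...dog].length;

// Without u the dot matches one code unit; with u it matches one code point
const withoutFlag = /^.$/.test(dog);
const withFlag = /^.$/u.test(dog);

// Without u, two dots are needed to span both halves of the surrogate pair
const twoDots = /^..$/.test(dog);

console.log({ unitCount, codePointCount, withoutFlag, withFlag, twoDots });
// { unitCount: 2, codePointCount: 1, withoutFlag: false, withFlag: true, twoDots: true }
```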

Unicode characters, just like ASCII characters, can be used in ranges:

/[a-z]/.test('a')  //✅
/[1-9]/.test('1')  //✅

/[🐶-🦊]/u.test('🐺')  //✅
/[🐶-🦊]/u.test('🐛')  //❌

JavaScript compares the code points, so 🐶 < 🐺 < 🦊 because \u{1F436} < \u{1F43A} < \u{1F98A}. Check the full Emoji list to get those codes and to find out the order (tip: the macOS Emoji picker shows some emojis in a mixed order, so don't count on it).
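
The code points can also be checked directly in JavaScript instead of looking them up. This is a rough sketch: toHex is a hypothetical helper written for this example, and 🐛 (U+1F41B) stands in for a character that falls outside the range.

```javascript
// Hypothetical helper: format the first code point of a string as U+XXXX
const toHex = (ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase();

const codes = ['🐶', '🐺', '🦊', '🐛'].map(toHex);
console.log(codes); // [ 'U+1F436', 'U+1F43A', 'U+1F98A', 'U+1F41B' ]

// The same range written with code point escapes, which require the u flag
const range = /[\u{1F436}-\u{1F98A}]/u;
const wolfInRange = range.test('🐺'); // true: U+1F43A is inside the range
const bugInRange = range.test('🐛');  // false: U+1F41B is below U+1F436

// Without u, the class is read as surrogate halves and the middle
// range (\uDC36-\uD83E) is out of order, so the pattern is rejected
let errorName = null;
try {
  new RegExp('[🐶-🦊]');
} catch (e) {
  errorName = e.name; // 'SyntaxError'
}

console.log({ wolfInRange, bugInRange, errorName });
```

The last part is one more reason to keep the u flag on: an emoji range without it does not just match incorrectly, it fails to compile.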
