Regular Expressions: Unicode



The u flag is essential when working with Unicode strings, in particular when you might need to handle characters in the astral planes, that is, the ones outside the Basic Multilingual Plane (the first 65,536 code points, U+0000 to U+FFFF).

Emojis are one example, but not the only one.

If you don't add that flag, this simple regex that should match one character will not work, because in JavaScript that emoji is represented internally by two UTF-16 code units, a surrogate pair (see Unicode in JavaScript):

/^.$/.test('a') //✅
/^.$/.test('🐶') //❌
/^.$/u.test('🐶') //✅

So, always use the u flag.
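
To see why the pattern without the flag fails, it can help to look at how JavaScript counts the emoji. The following is a minimal sketch, not part of the original lesson; `describeString` is a hypothetical helper name introduced only for illustration.

```javascript
// '🐶' is U+1F436, which lies outside the Basic Multilingual Plane.
const dog = '🐶'

// Hypothetical helper: summarize how JavaScript sees a string.
function describeString(str) {
  return {
    codeUnits: str.length,                   // number of UTF-16 code units
    codePoints: [...str].length,             // string iteration goes by code point
    first: str.codePointAt(0).toString(16),  // first code point, in hex
  }
}

console.log(describeString('a')) // { codeUnits: 1, codePoints: 1, first: '61' }
console.log(describeString(dog)) // { codeUnits: 2, codePoints: 1, first: '1f436' }

// Without u, the dot consumes a single code unit, so two dots are needed.
console.log(/^..$/.test(dog)) // true
// With u, the dot consumes a whole code point.
console.log(/^.$/u.test(dog)) // true
```

The two-dot match is shown only to make the surrogate pair visible; in real code the u flag is the better choice.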

Unicode characters, just like ordinary characters, can be used in ranges:

/[a-z]/.test('a')  //✅
/[1-9]/.test('1')  //✅

/[🐶-🦊]/u.test('🐺')  //✅
/[🐶-🦊]/u.test('🐛')  //❌

JavaScript compares code points, so 🐶 < 🐺 < 🦊 because \u{1F436} < \u{1F43A} < \u{1F98A}. Check the full Emoji list to get those codes and to find out the order (tip: the macOS Emoji picker has some emojis in a mixed order, so don't count on it).
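
The comparison a range performs can be reproduced by hand with `codePointAt`. This is a rough sketch under the assumption that each argument is a single character; `codePoint`, `inRange`, and `dogToFox` are hypothetical names introduced here, not part of the lesson.

```javascript
// Numeric code point of the first character in a string.
const codePoint = (ch) => ch.codePointAt(0)

// Hypothetical helper mirroring what a range like /[🐶-🦊]/u checks
// for a single character: from <= ch <= to, compared by code point.
function inRange(ch, from, to) {
  const cp = codePoint(ch)
  return codePoint(from) <= cp && cp <= codePoint(to)
}

console.log(codePoint('🐶').toString(16)) // 1f436
console.log(codePoint('🐺').toString(16)) // 1f43a
console.log(codePoint('🦊').toString(16)) // 1f98a

console.log(inRange('🐺', '🐶', '🦊')) // true
console.log(inRange('🐱', '🐶', '🦊')) // false (🐱 is U+1F431, below 🐶)

// The same range written with code point escapes, which require the u flag.
const dogToFox = /^[\u{1F436}-\u{1F98A}]$/u
console.log(dogToFox.test('🐺')) // true
console.log(dogToFox.test('🐱')) // false
```

Writing the range with `\u{...}` escapes can make the bounds easier to read in source files where emojis do not render well.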
