It finally happened to me. A user entered an emoji into an input field, and a production system broke.

I don’t want to get into the weeds of the particular system, except to say that it’s a strictly ordered queue that couldn’t move forward because an entry was being rejected by a database validation check.

Three pieces of background:

  1. MariaDB implents its JSON column data type by combining the regular LONGTEXT column type with a validation constraint that runs inserted or updated values through json_valid(). Column definition looks something like this: column_name longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL CHECK (json_valid('column_name')). I don’t how this differes from MySQL, if at all.
  2. The JSON RFC states that only a handful of Unicode characters inside strings have to be escaped, but all of them may be escaped.
    2.5.  Strings
    
    [...] All Unicode characters may be placed within the
    quotation marks except for the characters that must be escaped:
    quotation mark, reverse solidus, and the control characters (U+0000
    through U+001F).
    
    Any character may be escaped.
    
  3. PHP’s json_encode() escapes all Unicode characters by default. (This behavior can be disabled by passing the JSON_UNESCAPED_UNICODE option)

The issue presented itself as database inserts failing because the JSON validation was failing, but only if the emoji in question (😏) was escaped (\ud83d\ude0f). Running the data manually through json_valid() succeeded though. This seems to indicate that MariaDB was using different code paths for column validation and the query function.

We were already running a version of MariaDB that didn’t exhibit this issue (10.4.11) in another environment, so rather than making a bunch of code changes we added some tests to check for this issue and upgrade the production docker images with the new database server version. I checked the changelogs and saw no mention of this issue, though, so it’s possible this was fixed unintentionally and might resurface in a future build. I’ll look into filing a bug for it.