diff --git a/scratch/unicode.erl b/scratch/unicode.erl
new file mode 100644
index 0000000..bdcf348
--- /dev/null
+++ b/scratch/unicode.erl
@@ -0,0 +1,52 @@
+% @doc
+%
+%                  T R O N A L D   D U M P
+%
+%                     .-""""""""""""-.
+%                  .-'  _..------.._  '-.
+%                .'   .'  GOLDEN NFC '.   '.
+%               /    /  COMB-OVER MAP  \    \
+%              ;    ;  .-^^^^^^^^^^-.   ;    ;
+%              |    | /  THEY'RE     \  |    |
+%              |    | | NOT SENDING  |  |    |
+%              |    | |    ASCII     |  |    |
+%              ;    ; \_.--.  .--._./  ;    ;
+%               \    \    (o)(o)      /    /
+%                '.   '.    __       .'   .'
+%                  '-._  '._==_.'  _.-'
+%                      '-._____.-'
+%                         /|||\
+%                        / ||| \
+%                       /  |||  \
+%              .-------'   |||   '-------.
+%             /      THE BEST NORMALIZER   \
+%            /     VERY STABLE CODEPOINTS   \
+%           /_________________________________\
+% 
+%
+% When unicode sends its codepoints, they're not
+% sending their best. They're not sending ASCII.
+% They're not sending ASCII. They're sending integers
+% that have lots of problems, and they're bringing
+% those problems with us. They're bringing diacritics.
+% They're bringing non-idempotent lowercasing. They're
+% bringing graphemes that don't correspond bijectively
+% with printable characters. They're bringing RTL.
+% They're bringing invisible characters. They're
+% bringing characters that draw outside the character
+% boundary. They're bringing variable-width
+% whitespace. They're bringing control characters.
+% They're bringing emojis.
+%
+% And some, I assume, are good characters.
+%
+% `SrcStr' is a unicode NFC list, not an ordinary
+% string. you think a string is a list of codepoints.
+%
+% NOOOOO.
+%
+% See it's different, because that's why.
+%
+% This is the cost of diversity, folks.
+% @end
+