Appearance
Unicode
Modern terminals are UTF-8 terminals: applications write UTF-8 bytes, the terminal decodes them into Unicode code points, and the screen grid assigns each displayed grapheme a cell width. Encoding is now the easy part; width is where terminals disagree. East Asian ambiguous-width characters, emoji, combining marks, variation selectors, zero-width joiners, wide-character wrapping, and tab stops with mixed-width text all affect cursor alignment. Correct Unicode handling is essential for TUI applications because a one-cell disagreement between the app and the terminal misaligns every subsequent column on that line. The disagreements persist because the wcwidth() function predates emoji, terminals track different Unicode versions, and there is no single standard for grapheme-cluster width.
Terminal Unicode support has two layers. UTF-8 decoding is the baseline and is effectively solved: a modern terminal is expected to accept UTF-8 bytes and render Unicode text. Cell-width calculation is the hard part: after decoding, the terminal has to decide whether the displayed grapheme occupies 0, 1, or 2 grid cells, and the application has to make the same decision before moving the cursor.
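The 0, 1, or 2 decision can be sketched per code point with Python's standard unicodedata module. This is a deliberately naive illustration, not how any particular terminal works: the function names are made up for this example, control characters are ignored, and the result depends on the Unicode version bundled with the Python interpreter, which is itself an instance of the version-skew problem.

```python
import unicodedata


def codepoint_width(ch: str) -> int:
    """Naive cell width of one code point: 0, 1, or 2 (illustrative only)."""
    # Combining marks (Mn, Me) and format characters (Cf, e.g. ZWJ)
    # are treated as occupying no cell of their own.
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide (W) and Fullwidth (F) characters take two cells.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    # Everything else, including ambiguous-width characters, counts as one.
    return 1


def naive_width(text: str) -> int:
    """Sum of per-code-point widths; knows nothing about grapheme clusters."""
    return sum(codepoint_width(ch) for ch in text)
```

Because it sums code points independently, this sketch gets plain text, CJK text, and combining marks right, and is exactly the kind of calculation that goes wrong once several code points are meant to form one glyph.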
The recurring width problems are: East Asian Width (is a character 1 or 2 columns wide?), grapheme clustering (a flag emoji like U+1F1F3 U+1F1F4, the regional indicators N and O for Norway, should display as one glyph, not two), and variation selectors (U+FE0E requests text presentation, U+FE0F requests emoji presentation). UAX #11 provides the East_Asian_Width property, but the width of ambiguous-width characters varies by locale and terminal policy.
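A minimal sketch of how these three issues might show up in a width function follows. The function name and the ambiguous_wide flag are hypothetical, and the rules are simplified assumptions: any two adjacent regional indicators count as one 2-cell flag, and a variation selector overrides the width of whatever precedes it, whereas a real implementation would consult the Unicode emoji data to see which sequences are valid.

```python
import unicodedata

VS15 = "\ufe0e"  # text presentation selector
VS16 = "\ufe0f"  # emoji presentation selector


def is_regional_indicator(ch: str) -> bool:
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def display_width(text: str, ambiguous_wide: bool = False) -> int:
    """Simplified width estimate; ambiguous_wide models the terminal policy."""
    total = 0
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        nxt = text[i + 1] if i + 1 < n else ""
        # Grapheme clustering: a pair of regional indicators is one flag.
        if is_regional_indicator(ch) and nxt and is_regional_indicator(nxt):
            total += 2
            i += 2
            continue
        # Combining marks and format characters take no cell.
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            i += 1
            continue
        eaw = unicodedata.east_asian_width(ch)
        if eaw in ("W", "F"):
            width = 2
        elif eaw == "A":
            # Ambiguous width: the answer is a policy choice, not a fact.
            width = 2 if ambiguous_wide else 1
        else:
            width = 1
        # Variation selectors: assumed here to force wide or narrow.
        if nxt == VS16:
            width = 2
            i += 1
        elif nxt == VS15:
            width = 1
            i += 1
        total += width
        i += 1
    return total
```

The ambiguous_wide parameter makes the policy dependence explicit: the same string has two defensible widths, and the application has to guess which one the terminal uses.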
The most treacherous case is the zero-width joiner (ZWJ, U+200D). A ZWJ sequence like woman + ZWJ + laptop (U+1F469 U+200D U+1F4BB) should render as a single emoji glyph if the terminal's font supports it, but as separate characters if it does not. The terminal must either trust the font's ligature tables or maintain its own ZWJ sequence database, and that database has to be updated as new sequences are added in later Unicode releases.
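The database approach can be illustrated with a small sketch. KNOWN_ZWJ_SEQUENCES is a hypothetical one-entry stand-in for a real table (such as one generated from the Unicode emoji-zwj-sequences.txt data file), and the function assumes every component of the sequence is a single 2-cell emoji, which does not hold in general.

```python
ZWJ = "\u200d"

# Hypothetical stand-in for a full ZWJ sequence table.
WOMAN_TECHNOLOGIST = "\U0001F469" + ZWJ + "\U0001F4BB"
KNOWN_ZWJ_SEQUENCES = {WOMAN_TECHNOLOGIST}


def zwj_width(sequence: str, known_sequences: set) -> int:
    """Cell width of a ZWJ sequence under a given sequence table.

    Assumes each component is one emoji that is 2 cells wide.
    """
    if sequence in known_sequences:
        # Recognized: the whole sequence is one glyph in one wide cell pair.
        return 2
    # Unrecognized: each component is drawn separately; the ZWJ itself
    # is invisible and contributes nothing.
    parts = [part for part in sequence.split(ZWJ) if part]
    return 2 * len(parts)
```

With the table above, zwj_width(WOMAN_TECHNOLOGIST, KNOWN_ZWJ_SEQUENCES) gives 2, while the same string against an empty table gives 4. Two programs with tables from different Unicode versions will disagree in exactly this way.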
For developers, the practical test is simple: does the cursor end up in the right place after printing a string? If an application calculates "hello" + flag_emoji as 7 columns (5 for the letters, 2 for the flag) but the terminal advances the cursor by a different amount, for example because it treats the two regional indicators as separate characters, every subsequent character on that line will be offset. This breaks table alignment, progress bars, box drawing, and any TUI that relies on precise cursor positioning. The wcwidth() function and its many implementations are the battleground where these disagreements play out.
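One way to run this test is to print the string, send the Device Status Report request ESC [ 6 n, and read back the terminal's cursor position report, which has the form ESC [ row ; col R. The sketch below covers only the parsing and comparison step; the function names are invented for this example, and actually reading the reply requires putting the terminal into raw mode, which is not shown.

```python
import re

# Cursor position report sent by the terminal in reply to ESC [ 6 n.
CPR_PATTERN = re.compile(r"\x1b\[(\d+);(\d+)R")


def parse_cursor_report(response: str):
    """Return (row, col), both 1-based, from a cursor position report."""
    match = CPR_PATTERN.search(response)
    if match is None:
        raise ValueError("no cursor position report in response")
    return int(match.group(1)), int(match.group(2))


def width_mismatch(expected_width: int, start_col: int, response: str) -> int:
    """Cells the terminal advanced minus the cells the application expected.

    Zero means the two agree; any other value is the offset applied to
    everything printed afterwards on that line.
    """
    _, col = parse_cursor_report(response)
    return (col - start_col) - expected_width
```

For a string printed from column 1 with an expected width of 7, a reply of ESC [ 1 ; 8 R means agreement, while ESC [ 1 ; 10 R means the terminal used two more cells than the application assumed.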
Analysis 2026-05-17
Terminal Applications
Headless Backends
Parser correctness tested via Termless. A ✓ means the parser accepts the sequence, not that it renders correctly.