String Byte Length Counter

Paste any text and instantly see its character count and UTF-8 byte size as you type.

Pure ASCII text weighs 1 byte per character. Accents and emoji weigh more.

This counter runs entirely in your browser. Nothing you type is uploaded or stored.

How to use this tool

Type or paste your text into the box above. The counters update live with every keystroke and show four figures: the number of characters, the UTF-8 byte size, the UTF-16 code unit count, and the number of lines. Use the Copy text button to grab your text again, or Clear to start over. Because everything happens in the browser, you can safely measure private notes, API payloads, or config snippets without sending them anywhere.
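As a rough illustration of those four figures, here is a minimal sketch of how they could be computed in plain JavaScript. The function name measureText and the line-counting rule are assumptions made for this example, not necessarily what the tool itself does.

```javascript
// Hypothetical helper: computes the four figures the counters display.
function measureText(text) {
  return {
    // Spreading a string iterates by Unicode code point, not by UTF-16 unit.
    characters: [...text].length,
    // TextEncoder always encodes to UTF-8.
    utf8Bytes: new TextEncoder().encode(text).length,
    // String length is measured in UTF-16 code units.
    utf16Units: text.length,
    // Assumption: empty text has 0 lines; otherwise count line breaks plus one.
    lines: text === "" ? 0 : text.split(/\r\n|\r|\n/).length,
  };
}

// "héllo", a line break, then one emoji (U+1F600).
console.log(measureText("h\u00E9llo\n\u{1F600}"));
// { characters: 7, utf8Bytes: 11, utf16Units: 8, lines: 2 }
```

The three size figures disagree here because é costs 2 bytes in UTF-8 and the emoji costs 4 bytes and 2 UTF-16 units.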

How byte size is calculated

bytes = new TextEncoder().encode(text).length

A character is not always one byte. UTF-8 is a variable-width encoding: the standard ASCII range (plain English letters, digits, and common punctuation) uses 1 byte per character; accented Latin letters, along with Greek, Cyrillic, Hebrew, and Arabic, use 2 bytes; most remaining scripts, including Chinese, Japanese, Korean, and the Indic scripts, use 3 bytes; and characters outside the Basic Multilingual Plane, such as most emoji, use 4 bytes. The character count above reports Unicode code points, while the byte count is the UTF-8 size returned by TextEncoder, which is what goes over the wire and what a UTF-8 file or database column actually stores. The gap between the character count and the byte count is the ASCII vs UTF-8 weight difference.
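The variable-width rule can be sketched directly from the code point ranges that UTF-8 defines. The helper names utf8Width and utf8Breakdown are hypothetical and exist only for this example; the final line cross-checks the result against TextEncoder.

```javascript
// Hypothetical helper: UTF-8 width of one code point, based on UTF-8's ranges.
function utf8Width(codePoint) {
  if (codePoint <= 0x7f) return 1;   // ASCII
  if (codePoint <= 0x7ff) return 2;  // accented Latin, Greek, Cyrillic, Hebrew, Arabic
  if (codePoint <= 0xffff) return 3; // rest of the Basic Multilingual Plane, e.g. CJK
  return 4;                          // supplementary planes, e.g. most emoji
}

// Hypothetical helper: per-character byte cost for a whole string.
function utf8Breakdown(text) {
  return [...text].map((ch) => ({ char: ch, bytes: utf8Width(ch.codePointAt(0)) }));
}

const sample = "a\u00E9\u20AC\u{1F600}"; // a, é, €, and one emoji
const widths = utf8Breakdown(sample).map((entry) => entry.bytes);
console.log(widths); // [ 1, 2, 3, 4 ]

// The per-character widths add up to what TextEncoder reports.
const total = widths.reduce((sum, n) => sum + n, 0);
console.log(total === new TextEncoder().encode(sample).length); // true
```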

A real example

Take the string café written as the five code points c, a, f, e, and a combining acute accent (U+0301). In UTF-8 the four ASCII letters are 1 byte each and the combining accent is 2 bytes, so the text is 6 bytes. (If é is instead stored as the single precomposed code point U+00E9, the same word is 4 code points and 5 bytes.) Now compare the plain word cafe with no accent: 4 characters and 4 bytes. In the decomposed form the accent adds one code point and two bytes, which is exactly why measuring byte size matters when a field has a strict limit like 255 bytes.
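The same arithmetic can be checked in a few lines of JavaScript. This is an illustrative sketch: describe is a hypothetical helper name, and the escapes spell out which code points each string contains. The second string is produced with the standard normalize("NFC") method, which merges the letter and the combining accent into one code point.

```javascript
// Hypothetical helper: code point count and UTF-8 byte count for one string.
function describe(text) {
  return {
    codePoints: [...text].length,
    utf8Bytes: new TextEncoder().encode(text).length,
  };
}

// e followed by U+0301 COMBINING ACUTE ACCENT (the form used in the example).
const decomposed = "cafe\u0301";
// NFC normalization composes the pair into the single code point U+00E9.
const precomposed = decomposed.normalize("NFC");

console.log(describe(decomposed));  // { codePoints: 5, utf8Bytes: 6 }
console.log(describe(precomposed)); // { codePoints: 4, utf8Bytes: 5 }
console.log(describe("cafe"));      // { codePoints: 4, utf8Bytes: 4 }
```

Both accented strings render identically on screen, so normalizing text before measuring or storing it is one way to get consistent counts.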

Common questions

Why is the byte count higher than the character count?

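UTF-8 stores every non-ASCII character in more than one byte. Accented Latin letters take 2 bytes, many symbols and most non-European scripts take 3, and most emoji take 4 bytes per code point. An emoji built from several code points, such as a flag or a family, takes more. Plain ASCII text is the only case where bytes equal characters.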

What is the difference between characters and UTF-16 code units?

The character count here is Unicode code points. UTF-16 code units are what JavaScript's .length property counts: each character outside the Basic Multilingual Plane (most emoji, for example) is stored as a surrogate pair and counts as two units. The two values differ whenever your text contains such characters.
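A short sketch makes the difference concrete. The helper name countSurrogatePairs is an assumption for this example; it relies only on the fact that each such character adds two code units but one code point.

```javascript
const smiley = "\u{1F600}"; // one emoji, outside the Basic Multilingual Plane

console.log(smiley.length);      // 2 (UTF-16 code units)
console.log([...smiley].length); // 1 (Unicode code point)

// Hypothetical helper: how many characters in the text need a surrogate pair.
// Each one contributes 2 code units but only 1 code point, so the
// difference between the two counts is the number of such characters.
function countSurrogatePairs(text) {
  return text.length - [...text].length;
}

console.log(countSurrogatePairs("plain ASCII"));          // 0
console.log(countSurrogatePairs("a\u{1F600}b\u{1F680}")); // 2
```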

Does this tool send my text anywhere?

No. The counting uses the browser's built-in TextEncoder API and runs entirely client-side. Your text never leaves the page, so it is safe for private or sensitive content.

Which byte size should I use for a database limit?

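Use the UTF-8 byte count when the limit is defined in bytes, which is common for APIs, protocol fields, and some database column types. Some databases count characters instead (VARCHAR(n) in MySQL and PostgreSQL, for example), so check how your limit is defined; when it is in bytes and the storage encoding is UTF-8, the UTF-8 figure is the number that must stay under it.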
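For a byte-based limit, a check and a safe truncation could look like the sketch below. Both function names are hypothetical, and the 255-byte figure is only an example limit. The truncation walks the text one code point at a time so it never cuts a multi-byte character in half.

```javascript
// Hypothetical helper: does the text fit within a UTF-8 byte limit?
function fitsUtf8Limit(text, maxBytes) {
  return new TextEncoder().encode(text).length <= maxBytes;
}

// Hypothetical helper: longest prefix that fits within maxBytes of UTF-8,
// cutting only on code point boundaries.
function truncateToUtf8Bytes(text, maxBytes) {
  const encoder = new TextEncoder();
  let used = 0;
  let result = "";
  for (const ch of text) { // for...of iterates by code point
    const size = encoder.encode(ch).length;
    if (used + size > maxBytes) break;
    used += size;
    result += ch;
  }
  return result;
}

console.log(fitsUtf8Limit("a".repeat(255), 255));      // true
console.log(fitsUtf8Limit("\u00E9".repeat(128), 255)); // false (128 x 2 = 256 bytes)
console.log(truncateToUtf8Bytes("caf\u00E9", 4));      // "caf"
```

This sketch keeps code points intact but does not keep combining marks or multi-code-point emoji together with their base character; that would need grapheme-aware segmentation.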

Why track byte size at all?

Byte size affects API payload limits, storage costs, and byte-based column constraints. SMS segment counts also depend on encoding, although SMS uses GSM-7 or UCS-2 rather than UTF-8. Knowing the real UTF-8 weight prevents truncation errors that a character count alone would hide.