HTML Encoding: Learn How to Use Meta Charset

TL;DR: Declaring the correct HTML encoding ensures that the browser displays special characters properly.

1. Understanding HTML Character Encoding
2. ASCII: The Most Basic Charset
3. Your Best Option: UTF-8
4. Alternative HTML Encodings
5. HTML Encoding: Useful Tips

Understanding HTML Character Encoding

The need for character encoding arises from the huge selection of characters available. Apart from the usual Latin letters and Arabic numerals, there are also other alphabets, mathematical symbols and other special characters. However, the same stored bytes can be displayed as different characters depending on which HTML encoding a document declares.

Incorrectly interpreted text leads to a variety of issues:

Users can't read the text properly
Search engines can't find the data
Machines can't process the information

All the available characters are grouped into specific sets (also called charsets for short). By declaring the HTML encoding, you tell the browser which set to use, so it can turn the bytes of the document into the correct characters.

Note: the Japanese even have a special term for a poorly interpreted bunch of characters – mojibake (文字化け).
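
As a rough illustration of how such garbled text arises, the Python sketch below stores a word as UTF-8 and then reads the same bytes back with the wrong charset. The sample word and the choice of ISO-8859-1 as the wrong charset are assumptions made for the example.

```python
# Sample text containing one non-ASCII character (an assumption for this demo).
text = "café"

# Encode the text the way a UTF-8 document would store it.
raw = text.encode("utf-8")            # b'caf\xc3\xa9'

# Decode with the wrong charset: each byte of "é" becomes its own character.
garbled = raw.decode("iso-8859-1")

# Decode with the charset that was actually used.
correct = raw.decode("utf-8")

print(garbled)   # cafÃ©
print(correct)   # café
```

The bytes never change. Only the charset used to interpret them does, which is why declaring the encoding matters.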

ASCII: The Most Basic Charset

One of the earliest and simplest character encodings is called ASCII. Most modern charsets use it as a standard base.

ASCII stands for the American Standard Code for Information Interchange. It was developed from telegraph code in the early 1960s and contains 128 characters, 95 of which are printable (including the space):

Lowercase Latin letters
Uppercase Latin letters
Punctuation symbols
Numbers from 0 to 9

The 33 unprintable characters are also called control characters. These are invisible codes, such as the tab, line feed, and carriage return, that control how text is laid out or transmitted rather than representing a visible symbol.

However, the popularity of ASCII fell as the Internet grew more and more international. Supporting only unaccented Latin letters quickly proved insufficient.
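
The counts above can be checked with a short Python sketch. The helper name `is_ascii_encodable` and the sample words are hypothetical, chosen only for illustration.

```python
# All 128 ASCII code points, 0 through 127.
codes = range(128)

# Printable characters run from the space (32) through the tilde (126).
printable = [chr(c) for c in codes if 32 <= c <= 126]

# Control characters are 0-31 plus DEL (127).
control = [c for c in codes if c < 32 or c == 127]

print(len(printable), len(control))   # 95 33
print(ord("A"), ord("a"), ord("0"))   # 65 97 48


def is_ascii_encodable(text):
    """Return True if every character of text exists in ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(is_ascii_encodable("hello"))    # True
print(is_ascii_encodable("señor"))    # False
```

The last line shows the limitation in practice: a single accented letter is enough to make a word impossible to store as ASCII.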

Your Best Option: UTF-8

Unicode is the industry standard for consistent character encoding. It was first published in the early 1990s and defines several encodings, such as UTF-8, UTF-16, and UTF-32.

UTF-8 stands for Unicode Transformation Format 8-bit and has held the title of the most popular HTML character encoding since 2008. By 2019, more than 90 percent of all websites used UTF-8. The World Wide Web Consortium (W3C) also recommends it as the default HTML character encoding.

There are multiple compelling reasons to use UTF-8:

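It can represent every Unicode character, so it supports virtually any language.
It is backward compatible with ASCII: any valid ASCII text is also valid UTF-8.
It is the default encoding of XML.
For text that is mostly ASCII, such as HTML markup, it uses less space than UTF-16 or UTF-32.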
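
The compatibility and size points can be illustrated with a small Python sketch. The function name `byte_lengths` and the sample strings are assumptions made for the example. Note that the size advantage depends on the text: UTF-8 is the most compact for mostly-ASCII content, while UTF-16 can be smaller for text made up largely of East Asian characters.

```python
def byte_lengths(text):
    """Return the encoded size of text in three Unicode encodings."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # The -le variants skip the byte order mark so only the text is counted.
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# ASCII text produces identical bytes in ASCII and UTF-8.
ascii_same = "hello".encode("ascii") == "hello".encode("utf-8")
print(ascii_same)              # True

print(byte_lengths("hello"))   # {'utf-8': 5, 'utf-16': 10, 'utf-32': 20}
print(byte_lengths("文字"))     # {'utf-8': 6, 'utf-16': 4, 'utf-32': 8}
```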

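To declare UTF-8 as your document's character encoding, add a <meta> tag with the charset attribute set to UTF-8 inside the <head>, as early as possible (the HTML standard requires the declaration to fall within the first 1024 bytes of the document):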

<meta charset="UTF-8">
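
To check whether a page actually carries such a declaration, a sketch along the following lines may help. It is a deliberately simplified stand-in for the encoding detection that browsers perform, not a reimplementation of it: the regular expression, the `declared_charset` helper, and the sample page are all assumptions made for illustration. It looks only at the first 1024 bytes, the range within which the HTML standard expects the declaration to appear.

```python
import re
from typing import Optional

# Simplified pattern: a <meta ...> tag that contains charset=VALUE,
# with or without quotes around the value.
META_CHARSET = re.compile(
    rb'<meta[^>]*?charset\s*=\s*["\']?([A-Za-z0-9_\-]+)',
    re.IGNORECASE,
)


def declared_charset(html_bytes: bytes) -> Optional[str]:
    """Return the lowercased charset declared in the first 1024 bytes, if any."""
    match = META_CHARSET.search(html_bytes[:1024])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


# Hypothetical sample page used only for this demo.
page = (
    b'<!DOCTYPE html><html><head><meta charset="UTF-8">'
    b"<title>Demo</title></head><body></body></html>"
)

print(declared_charset(page))                  # utf-8
print(declared_charset(b" " * 2000 + page))    # None (declaration comes too late)
```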

Alternative HTML Encodings

You can find many alternative encodings in the Encoding Living Standard maintained by the Web Hypertext Application Technology Working Group (WHATWG). However, we'd strongly advise you to stay with UTF-8, as other charsets contain a smaller selection of characters, and that might cause issues in displaying your website.
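
The smaller selection of characters is easy to see in practice. In the Python sketch below, the `can_encode` helper and the two sample characters are assumptions chosen for the example. The codec names are the ones Python uses for ISO-8859-1, ISO-8859-2, and Windows-1252.

```python
def can_encode(text, encoding):
    """Return True if every character of text exists in the given charset."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Each legacy charset covers only some of the sample characters.
for char in ("€", "Ł"):
    for encoding in ("iso-8859-1", "iso-8859-2", "cp1252", "utf-8"):
        print(char, encoding, can_encode(char, encoding))
```

Only UTF-8 can store both characters, which is the practical argument for not switching to a legacy charset.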

HTML Encoding: Useful Tips

The proper display of your characters depends not only on the charset but also on the chosen font: not every font has a glyph for every character. If the font you chose does not have the symbol you need, the browser will either fall back to another font or display a placeholder (e.g., an empty box or a question mark).
Don't forget to save the document itself in the encoding you declare: a UTF-8 declaration does not help if your editor writes the file in a different encoding.
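
When a script generates the file rather than an editor, the same rule applies. Here is a minimal Python sketch, assuming a temporary directory and a hypothetical `index.html` file name, that writes a page as UTF-8 explicitly instead of relying on the platform default.

```python
import tempfile
from pathlib import Path

# Hypothetical page content used only for this demo.
html = (
    "<!DOCTYPE html>\n"
    "<html>\n"
    "<head>\n"
    '<meta charset="UTF-8">\n'
    "<title>Café</title>\n"
    "</head>\n"
    "<body><p>文字化け</p></body>\n"
    "</html>\n"
)

path = Path(tempfile.mkdtemp()) / "index.html"

# Name the encoding explicitly so it matches the <meta> declaration.
# newline="" keeps line endings exactly as written on every platform.
with open(path, "w", encoding="utf-8", newline="") as f:
    f.write(html)

# Read the raw bytes back and confirm they decode as UTF-8.
saved = path.read_bytes()
print(saved.decode("utf-8") == html)   # True
```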
