Optimizing URL Length: How Long Can a URL Be and Why It Matters?

Have you ever wondered about the maximum length of a website address? The length of URLs is not really something you need to stress over. In practice, if you are not using accented or non-Latin characters in the web address, the chances of hitting the limit are slim.

This topic rarely comes up in discussion, although it may have some impact on UX and SEO. If you are interested in this or are considering using a writing system other than Latin in your permalinks, this article will explain how it affects their length.

Is There a Limit on URL Length?

What Is the Longest URL Possible?

Neither the HTTP specification (RFC 2616) nor the URI specification (RFC 3986) defines a maximum URL length, but there are practical limits that depend on the user's browser and the server configuration.

Firefox, Chrome, Safari, and other modern browsers generally support very long URLs. However, URLs listed in sitemaps submitted to search engines are limited to 2,048 characters. This is the length that search engine crawlers can be expected to process reliably.

John Mueller has said that URL length is not a ranking factor for Google, and that Google does crawl and analyze URLs with more than 1,000 characters. That does not mean using such long URLs is a good idea.

To sum up, URL length limits come from practical constraints in browsers, servers, and crawlers rather than from the specifications themselves. While modern browsers support much longer addresses, staying under 2,000 characters ensures wide backward compatibility.
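If you want to check your own addresses against these thresholds, a minimal Python sketch could look like the one below. The function names and the two constants are assumptions made for illustration. The numbers come from this article and are conventions, not limits defined by any specification.

```python
from urllib.parse import quote

# Thresholds taken from the article; conventions, not spec-defined limits.
SITEMAP_LIMIT = 2048  # treated as a strict upper bound here, to be safe
COMPAT_LIMIT = 2000   # conservative ceiling for older browsers and tools


def encoded_length(url: str) -> int:
    """Return the URL length after non-ASCII characters are percent-encoded."""
    # Leave URL delimiters and existing %XX escapes untouched.
    return len(quote(url, safe=":/?#[]@!$&'()*+,;=%"))


def check_url(url: str) -> dict:
    """Report the encoded length and whether it fits both thresholds."""
    length = encoded_length(url)
    return {
        "length": length,
        "fits_sitemap": length < SITEMAP_LIMIT,
        "fits_compat": length <= COMPAT_LIMIT,
    }


if __name__ == "__main__":
    print(check_url("https://example.com/cafe-menu"))
```

The check measures the encoded form on purpose, because that is what is actually sent over the network.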

Why Are Accented or Non-Latin URLs Longer Than They Appear?

A URL that contains non-Latin characters is longer than one that contains only ASCII letters. The difference comes from the fact that each ASCII character, including the basic unaccented Latin letters, takes only one byte. Non-ASCII characters, however, require two to four bytes each when encoded as UTF-8 for transmission.

Additionally, non-ASCII characters must be converted into a standardized format that all web browsers and servers can interpret. This conversion, called percent-encoding, replaces each byte of the character's UTF-8 representation with "%" followed by its two-digit hexadecimal value. For instance, the Japanese character "に" takes three bytes in UTF-8 and becomes "%E3%81%AB".
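You can reproduce this conversion yourself. The short Python sketch below uses the standard library's urllib.parse.quote() on the character "に" (U+306B), which is what "%E3%81%AB" decodes to. The variable names are only illustrative.

```python
from urllib.parse import quote, unquote

char = "に"  # U+306B, the character behind %E3%81%AB

# Step 1: the character is encoded as UTF-8, which yields three bytes.
utf8_bytes = char.encode("utf-8")
hex_bytes = [f"{b:02X}" for b in utf8_bytes]

# Step 2: every byte is written as "%" plus its two hex digits.
encoded = quote(char)

# Decoding reverses the process and restores the original character.
decoded = unquote(encoded)

print(hex_bytes)  # ['E3', '81', 'AB']
print(encoded)    # %E3%81%AB
print(decoded == char)
```

One visible character therefore turns into nine characters in the encoded address.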

As a result, URLs containing non-ASCII text grow considerably once they are percent-encoded. Take a look at the simple URL examples below. The first one is merely 23 bytes long and contains only regular ASCII characters. The equivalent URL in Greek is only 5 letters longer, but it is nearly twice as long in bytes!

Sample URLs with ASCII characters.
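To see how quickly the size grows, you can measure a URL in three ways: visible characters, UTF-8 bytes, and percent-encoded characters. The two URLs in the sketch below are hypothetical and are not the samples referred to above. The url_sizes() helper is likewise an assumption made for illustration.

```python
import unicodedata
from urllib.parse import quote


def url_sizes(url: str) -> dict:
    """Measure a URL as visible characters, UTF-8 bytes, and encoded length."""
    # Normalize to NFC so each accented letter is a single code point.
    url = unicodedata.normalize("NFC", url)
    return {
        "characters": len(url),
        "utf8_bytes": len(url.encode("utf-8")),
        "percent_encoded": len(quote(url, safe=":/")),
    }


# Hypothetical example URLs: the same word in Latin and in Greek letters.
latin = url_sizes("https://example.com/kalimera")
greek = url_sizes("https://example.com/καλημέρα")

print(latin)  # 28 characters, 28 bytes, 28 when percent-encoded
print(greek)  # 28 characters, 36 bytes, 68 when percent-encoded
```

Both addresses look equally long on screen, yet the Greek one more than doubles in size once it is percent-encoded.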

Why Does URL Length Matter?

While there is no definitive limit, URLs that are too long can cause issues. If a URL exceeds around 2,000 characters, it may fail to load or may cause errors in some browsers.

Additionally, very long URLs are less readable and shareable for users. It becomes difficult to distinguish one long URL from another when copied or typed out. This negatively impacts the usability and user experience of a website.

Furthermore, shorter and more concise URLs tend to work better in search results. Long URLs do not display well on search engine results pages and may be truncated. Searchers are also less likely to click on them.

Tips for Optimizing URL Length

If you intend to incorporate special characters or non-Latin letters in your URLs, it is worth considering the implications.

While keeping URLs ASCII-only helps with compatibility, keep in mind that changing the URL structure too drastically can affect SEO. Try to maintain the core keywords and structure of the original URL to minimize any negative impact on your website's ranking.

Handling Non-ASCII Characters in URLs

Transliteration is one of the most straightforward ways to reduce the length of URLs and improve their readability. It involves replacing non-Latin characters with their closest Latin equivalents.

Transliterate URLs

Using the familiar Latin characters that make up the bulk of the English language allows URLs to be more readable at a glance for many internet users.

Permalink Manager, for example, offers this option, but you will need an additional custom code snippet to transliterate URLs automatically.
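To illustrate the idea of transliteration itself, here is a language-neutral sketch in Python. It is not the Permalink Manager snippet mentioned above. The GREEK_TO_LATIN table and the transliterate_slug() function are hypothetical, and the mapping is a simplified, incomplete example rather than an official romanization standard.

```python
import re
import unicodedata

# Illustrative Greek-to-Latin table (an assumption, not an official standard).
GREEK_TO_LATIN = {
    "α": "a", "β": "v", "γ": "g", "δ": "d", "ε": "e", "ζ": "z", "η": "i",
    "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m", "ν": "n", "ξ": "x",
    "ο": "o", "π": "p", "ρ": "r", "σ": "s", "ς": "s", "τ": "t", "υ": "y",
    "φ": "f", "χ": "ch", "ψ": "ps", "ω": "o",
}


def transliterate_slug(text: str) -> str:
    """Turn a Greek title into an ASCII-only, hyphen-separated slug."""
    # Split accented letters into base letter + combining mark, then drop marks.
    decomposed = unicodedata.normalize("NFD", text.lower())
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Replace every letter found in the table; leave other characters as-is.
    latin = "".join(GREEK_TO_LATIN.get(c, c) for c in base)
    # Collapse anything outside a-z and 0-9 (including unmapped letters) to "-".
    return re.sub(r"[^a-z0-9]+", "-", latin).strip("-")


print(transliterate_slug("Καλημέρα κόσμε"))  # kalimera-kosme
```

Characters that are missing from the table are simply replaced with hyphens, so a production setup would need a complete mapping for each supported script.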

How WordPress Removes Diacritics and Accents From Slugs

When working with a writing system based on the Latin alphabet, like German, French, Spanish, or Polish, things get simpler. If you create a new post, WordPress will automatically remove any diacritical marks from the slug through its built-in remove_accents() function.

For example, if you publish a post titled "Café Menu" with the "é" character, WordPress will automatically convert it and use "e" instead inside the slug. This way, the final URL ("example.com/cafe-menu") will be fully ASCII compatible.
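The effect can be approximated outside WordPress as well. The Python sketch below is not a port of remove_accents(), which is a PHP function with its own rules. It only mimics the result for simple cases by using Unicode decomposition, and the ascii_slug() name is an assumption made for illustration.

```python
import re
import unicodedata


def ascii_slug(title: str) -> str:
    """Approximate an accent-free slug for a Latin-alphabet title."""
    # Decompose letters such as "é" into "e" plus a combining accent.
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop the combining marks, keeping only the base letters.
    without_marks = "".join(
        c for c in decomposed if not unicodedata.combining(c)
    )
    # Lowercase, then replace runs of other characters with one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", without_marks.lower()).strip("-")


print(ascii_slug("Café Menu"))     # cafe-menu
print(ascii_slug("Crème Brûlée"))  # creme-brulee
```

Keep in mind that letters without a Unicode decomposition, such as "ł" or "ß", are not converted by this approach and end up being dropped, so it is less complete than a dedicated lookup table.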

Translate The URLs to English

While WordPress automatically removes diacritics from post titles and URLs, you may still want more control over URL slugs. The Simple Slug Translate plugin provides an easy way to automatically translate slugs to English.

Translate URLs

There is no need for advanced technical skills or custom coding, as the plugin handles the translation process seamlessly without disrupting the original content. When you add a new post or edit an existing one, the slug is automatically converted into English. This applies to pages, categories, and taxonomies as well.

by Maciej Bis
