Have you ever wondered about the maximum length of a website address? The length of URLs is not really something you need to stress over. In practice, if you are not using accented or non-Latin characters in the web address, the chances of hitting the limit are slim.
This topic rarely comes up in discussion, although it may have some impact on UX and SEO. If you are interested in this or are considering using a writing system other than Latin in your permalinks, this article will explain how it affects their length.
Is There A Limit On The Length Of URLs?
What Is the Longest URL Possible?
Neither the HTTP specification (RFC 2616) nor the URI specification (RFC 3986) defines a maximum URL length, but there are practical limits that depend on the user's browser and the server configuration.
Firefox, Chrome, Safari and other modern browsers generally support very long URLs. However, URLs listed in sitemaps submitted to search engines have a maximum length of 2,048 characters, so that figure is a reasonable ceiling for any address you want crawlers to process reliably.
To sum up, URL length limits come from browser, server, and tooling implementations rather than from the specifications themselves. While modern browsers support much longer addresses, staying under 2,000 characters ensures wide backward compatibility.
Chasing short URLs for their own sake makes no sense, because shortening a URL does not automatically rank your website higher in search results. John Mueller once made the point that URL length is not a ranking factor for Google. He also mentioned that Google can crawl URLs even if they are longer than 1,000 characters.
Why Are Accented or Non-Latin URLs Longer Than They Appear?
A URL that contains accented or non-Latin characters is longer than one that contains only ASCII letters. The difference comes from the fact that each ASCII character, a set that includes the basic Latin letters, takes only one byte to write. Non-ASCII characters, however, take two to four bytes each when encoded as UTF-8 for transmission.
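The byte counts can be checked with a few lines of Python. This is a small sketch, assuming UTF-8 as the encoding; the helper name `utf8_size` and the sample characters are chosen here purely for illustration.

```python
def utf8_size(char: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(char.encode("utf-8"))

# Sample characters: ASCII letter, accented Latin letter, Greek letter,
# Japanese hiragana, and an emoji.
samples = ["a", "\u00e9", "φ", "に", "\U0001F642"]

for char in samples:
    print(char, utf8_size(char))
# a 1
# é 2
# φ 2
# に 3
# 🙂 4
```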
Additionally, non-ASCII characters must be converted into a standardized format that all web browsers can interpret. This conversion process, called percent-encoding, replaces each byte of the character's UTF-8 encoding with its two-digit hexadecimal value prefixed with "%". For instance, the Japanese character "に" would become "%E3%81%AB".
As a result, URLs containing non-ASCII text undergo this percent-encoding, which increases their overall length. Take a look at the simple URL examples below. The first one is merely 23 bytes long and contains only regular ASCII characters. The equivalent URL in Greek is only 5 letters longer, but it is nearly twice as long in bytes!
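The same effect can be reproduced with Python's standard `urllib.parse.quote`. This is a minimal sketch: the two URLs are hypothetical stand-ins chosen for illustration (they are not the examples referred to above), and the helper name `url_lengths` is an assumption of this sketch.

```python
from typing import Tuple
from urllib.parse import quote

def url_lengths(url: str) -> Tuple[int, int, int]:
    """Return (visible characters, UTF-8 bytes, length after percent-encoding)."""
    # ":" and "/" are kept as-is so only the non-ASCII path characters get encoded.
    encoded = quote(url, safe=":/")
    return len(url), len(url.encode("utf-8")), len(encoded)

# A single Japanese character expands to three "%XX" escapes, one per UTF-8 byte.
print(quote("に"))  # %E3%81%AB

# Hypothetical URLs: an ASCII slug and a Greek slug ("φως" means "light").
print(url_lengths("https://example.com/light"))  # (25, 25, 25)
print(url_lengths("https://example.com/φως"))    # (23, 26, 38)
```

The Greek slug looks shorter on screen, yet each of its three letters turns into six characters once percent-encoded.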
Why Does URL Length Matter?
Technically, there is no exact rule about how long a URL can be. But once it grows beyond around 2,000 characters, some browsers, especially outdated ones, might not be able to process it correctly.
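A quick way to see whether an address stays under that rough threshold is to measure it after percent-encoding, since that is the form that actually gets sent. The sketch below is only illustrative: the 2,000-character ceiling is the approximate figure quoted above, not a value defined by any standard, and the function names are hypothetical.

```python
from urllib.parse import quote

# Assumed ceiling, taken from the approximate figure quoted in this article.
BROWSER_SAFE_LIMIT = 2000

def encoded_length(url: str) -> int:
    """Length of the URL after percent-encoding its non-ASCII characters."""
    # URL delimiters and existing "%" escapes are kept, so a URL that is
    # already encoded is not encoded a second time.
    return len(quote(url, safe=":/?#[]@!$&'()*+,;=%"))

def is_within_limit(url: str, limit: int = BROWSER_SAFE_LIMIT) -> bool:
    """True if the encoded URL is no longer than the given limit."""
    return encoded_length(url) <= limit

# 300 Japanese characters look short, but each one becomes 9 encoded characters.
long_url = "https://example.com/" + "に" * 300
print(len(long_url), encoded_length(long_url), is_within_limit(long_url))
# 320 2720 False
```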
Really long URLs are not just a hassle for browsers. They can be confusing for visitors, too. A lengthy link is harder to read and share, and it becomes difficult to distinguish one link from another when it is copied or typed out. This can negatively impact the usability and general user experience of your website.
Even though search engines might not care much about the length anymore, people still do. In search results, long links can get shortened and look confusing, which makes visitors less likely to choose your site.
Tips for Optimizing URL Length
Using standard English letters usually keeps a URL short and simple. But if you feel like adding special symbols, like emojis or non-English characters, the URL might stretch out more than you expect.
Handling Non-ASCII Characters in URLs
Transliteration is one of the most straightforward ways to reduce the length of URLs and enhance their readability. Transliterating URLs involves replacing non-Latin characters with their closest Latin equivalents.
Using the familiar Latin characters that make up the bulk of the English language allows URLs to be more readable at a glance for many internet users.
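As a rough illustration of the idea, the Python sketch below transliterates a Greek title into an ASCII slug. The mapping table is a deliberately small assumption made for this example (lowercase, unaccented Greek letters only); real transliteration schemes and plugins cover far more cases, and the function name `transliterate_slug` is hypothetical.

```python
import re

# Illustrative, incomplete Greek-to-Latin table (an assumption of this sketch).
GREEK_TO_LATIN = {
    "α": "a", "β": "v", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "i", "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s", "τ": "t", "υ": "y", "φ": "f", "χ": "ch", "ψ": "ps",
    "ω": "o",
}

def transliterate_slug(title: str) -> str:
    """Replace known Greek letters with Latin ones and build a hyphenated slug."""
    latin = "".join(GREEK_TO_LATIN.get(ch, ch) for ch in title.lower())
    # Anything that is still not an ASCII letter or digit (spaces, punctuation,
    # characters missing from the table) collapses into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", latin).strip("-")

print(transliterate_slug("το φως"))  # to-fos
```

The percent-encoded form of "φως" alone takes 18 characters, while its transliteration "fos" takes 3.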
How Does WordPress Remove Diacritics And Accents From Slugs?
When working with a writing system based on the Latin alphabet, like German, French, Spanish, or Polish, things get simpler. If you create a new post, WordPress will automatically remove any diacritical marks from the slug through its built-in remove_accents() function.
For example, if you publish a post titled "Café Menu" with the "é" character, WordPress will automatically convert it and use "e" instead inside the slug. This way, the final URL ("example.com/cafe-menu") will be fully ASCII compatible.
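WordPress's `remove_accents()` is a PHP function that relies on its own character mapping. The Python sketch below is not that implementation; it only approximates the idea with Unicode decomposition, and the names `strip_accents` and `slugify` are assumptions made for this example.

```python
import re
import unicodedata

def strip_accents(text: str) -> str:
    """Drop combining marks after splitting letters from their accents (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def slugify(title: str) -> str:
    """Lowercase, strip accents, and join the remaining words with hyphens."""
    ascii_title = strip_accents(title).lower()
    return re.sub(r"[^a-z0-9]+", "-", ascii_title).strip("-")

print(slugify("Café Menu"))  # cafe-menu
```

Letters that have no accent to strip, such as "ł" or "ß", do not decompose this way, so a dedicated mapping table is still needed for them.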
Translate The URLs to English
While WordPress automatically removes diacritics from post titles and URLs, you may still want more control over URL slugs. The Simple Slug Translate plugin provides an easy way to automatically translate slugs to English.
There is no need for advanced technical skills or custom coding, as the plugin handles the translation process seamlessly without disrupting the original content. When you add a new post or edit an existing one, the slug is automatically converted into English. This applies to pages, categories, and taxonomies as well.