ONLINE SEARCHER: Information Discovery, Technology, Strategies

Top-Level Domain Name Explosions: Teapot Tempest
By Wallace Koehler
May/June 2016 Issue

Once upon a time, oh about 20 years ago or so, interpreting and perhaps understanding top-level domain (TLD) names was a fairly simple thing to do. This is not quite so true today.

The number of TLDs has mushroomed in recent years, and growth is expected to continue. Over the years, I have documented these changes in the pages of Searcher magazine and Information Today, Inc.’s NewsBreaks [4–9], as has Greg Notess in ONLINE magazine [10, 11].

A TLD is the symbol set of a URL to the immediate right of the final “expressed” dot. For example, the TLD in the URL infotoday.com is .com. TLDs come in essentially two formats: generic (gTLD) and country code (ccTLD). The Internet Assigned Numbers Authority (IANA) provides an up-to-date list of TLDs (data.iana.org/TLD/tlds-alpha-by-domain.txt).
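
The definition above can be made concrete with a minimal Python sketch. It assumes a bare hostname rather than a full URL, and it checks the result against a small hand-typed sample laid out like the IANA list (comment lines starting with #, then one uppercase TLD per line). The sample and the function names extract_tld and parse_tld_list are illustrative only; they are not part of any official tool.

```python
def extract_tld(hostname: str) -> str:
    """Return the label to the right of the final dot, lowercased, with a leading dot."""
    # A fully qualified name may end with a trailing root dot; ignore it.
    labels = hostname.rstrip(".").split(".")
    return "." + labels[-1].lower()


def parse_tld_list(text: str) -> set:
    """Parse text laid out like the IANA list: '#' comment lines, then one TLD per line."""
    tlds = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            tlds.add("." + line.lower())
    return tlds


# Hand-typed sample for illustration only; the real list is far longer.
SAMPLE_LIST = """# illustrative sample, not the real file
AERO
COM
NET
TV
"""

known = parse_tld_list(SAMPLE_LIST)
tld = extract_tld("infotoday.com")
print(tld, tld in known)  # .com True
```

To check against the live list, the same parser could be fed the text of the IANA file once it has been downloaded.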

Each ccTLD consists of the two-character (alpha-2) ISO (International Organization for Standardization) 3166 code (iso.org/iso/country_codes). These can be for the following:

  • Existing countries (.ca for Canada, .fr for France, .gb for the United Kingdom of Great Britain and Northern Ireland, and .us for the United States)
  • Countries that no longer exist (.su for the USSR or .yu for Yugoslavia)
  • Alternative abbreviations (.uk for the United Kingdom)
  • Territories and national subdivisions (.bv for Bouvet Island or .im for the Isle of Man)

In addition, some supranational groups such as the European Union are also represented (.eu). There are some variations; for example, sponsored top-level domains (sTLD) are a subclass of gTLDs. As early as 2001, Barbara Quint and I suggested .lib as an sTLD [7, 9], one yet to come to fruition. Sigh.
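
Because ccTLDs are two-character codes, a rough first-pass classifier can be sketched from the form of the label alone. This is a simplification under stated assumptions: it treats every two-letter ASCII label as a country code (which also catches .eu) and everything else as generic, and it makes no attempt to separate sTLDs from other gTLDs. The function name classify_tld is hypothetical.

```python
def classify_tld(tld: str) -> str:
    """Guess a TLD's type from its form alone (assumption: two ASCII letters means ccTLD)."""
    label = tld.lstrip(".").lower()
    if len(label) == 2 and label.isascii() and label.isalpha():
        return "ccTLD"
    # Everything else is lumped together as generic, sponsored or not.
    return "gTLD"


for example in (".ca", ".uk", ".eu", ".com", ".museum"):
    print(example, classify_tld(example))
```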

GROWTH SPURT

Not only has the number of TLDs increased, so have the character sets (languages) that may be used. These Internationalized Domain Names (IDN; icann.org/resources/pages/idn-2012-02-25-en) include Arabic, Chinese, Cyrillic, Hebrew, Latin (with accent and diacritical marks), Japanese, and Tamil. The IANA explains the process to expand the IDN list at iana.org/help/idn-repository-procedure. Additional IDN character sets depend on the Domain Name System (DNS) handling those Unicode characters in their Punycode (ASCII-compatible) form.
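
As a small illustration of the Punycode step, the sketch below uses the "idna" codec that ships with Python's standard library to convert a Unicode name to its ASCII-compatible form and back. The domain name is a made-up example, and the built-in codec follows the older IDNA 2003 rules, so it should be read as a demonstration of the idea rather than of what any particular registry does.

```python
# Python's built-in "idna" codec converts each non-ASCII label of a domain
# name to its ASCII-compatible "xn--" form and back again.
unicode_name = "b\u00fccher.example"  # "bücher.example", an illustrative name

ascii_name = unicode_name.encode("idna").decode("ascii")
print(ascii_name)  # xn--bcher-kva.example

round_trip = ascii_name.encode("ascii").decode("idna")
print(round_trip == unicode_name)  # True
```

The "xn--" prefix marks a label as Punycode-encoded; the DNS itself only ever sees the ASCII form.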

One important factor to remember: Each country is responsible for its ccTLD, so individual countries set the rules for its use. Some manage their own registrars and limit registrations to websites from or about the country, as does Tunisia. Many maintain the registrar but have a more liberal policy for registrations. Others, especially those with ccTLDs that convey meaning other than national identity, find that farming out registrar duties and registrations has financial benefits. The poster child for this is .tv for Tuvalu. Many television stations have registered using .tv, although they usually resolve to a .com. Does anyone really think the URL shortener bit.ly originates in Libya? These latter ccTLDs I once termed “ccTLDs of convenience” [9]. Though such ccTLDs may be convenient, they are very often also redundant.

There are a number of reserved names for international governmental organizations (IGOs) and certain nongovernmental organizations (NGOs). These include the original .gov, .mil, and .edu TLDs as well as .int, but also variations of .redcross and .olympic, among others. Note too that gTLDs have moved way beyond the original three-character format to multiple characters. Many of the reserved domain names are reserved at both the top level (after the last dot) and the second level, or 2LD. For example, for the URL infotoday.com, the .com is the TLD and infotoday, the 2LD. For the definitive and up-to-date list, see icann.org/sites/default/files/packages/reserved-names/ReservedNames.xml.

FLYING INTO THE STORM

The proliferation of sTLDs will likely make life very interesting for the searcher community. For example, the .aero sTLD was created for and restricted to the aeronautical industry in 2001. By 2006, the registry rules were relaxed so that private pilots could register at .aero as well (domainregistry.de/pilot.html). A quick check of delta.aero, clearly a member of the aeronautical community, resolves to Delta Airlines’ delta.com, again a redundancy.

The .aero sTLD was among the first created of what could grow into a seemingly unlimited sTLD set. These sTLDs range from .museum (for museums) and .travel (for the travel industry) to .xxx and .sex (for the adult entertainment industry). Other gTLDs have a quasi-sTLD function in that they address a specific constituency, one example being .name. When my grandson was born 8 years ago, I registered his full name in .name, a great gift that has never been used.

There has been, and probably will continue to be, a proliferation of IDNs, sTLDs, and focused gTLDs on the top-level domain landscape. On the one hand, this proliferation forces the searcher community to be ever more vigilant and aware of changes to top-level domains, some utilizing scripts with which many of us may have but a limited familiarity.

On the other hand, does the TLD proliferation matter? The answer is, “Perhaps not.” According to W3Techs, about half of all websites use the .com TLD. Next in line is .net, at less than 5%. The first of the “relatively newer” gTLDs, .info, has 1.1% of domain space. The most frequently used of the sTLDs, .mobi (a TLD for mobile devices), occupies 0.1% of domain space (w3techs.com/technologies/overview/top_level_domain/all). Of course, 0.1% of the estimated 1 billion websites in late 2015 [3] is still a pretty big number. Each website, I should note, consists of one to many individual webpages, which for a variety of reasons are impossible to enumerate.
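
To put those shares in absolute terms, here is a back-of-the-envelope Python sketch. It assumes, as the paragraph above does, that the W3Techs percentages can be applied directly to the estimate of 1 billion websites; the figure of 50.0 for .com stands in for "about half," and the function name estimated_sites is illustrative.

```python
# Shares quoted in the text, applied to the estimated 1 billion websites
# in late 2015. ".com" is given only as "about half", so 50.0 is approximate.
TOTAL_SITES = 1_000_000_000
shares_percent = {".com": 50.0, ".info": 1.1, ".mobi": 0.1}


def estimated_sites(percent: float, total: int = TOTAL_SITES) -> int:
    """Convert a percentage share into an approximate number of websites."""
    return round(total * percent / 100)


for tld, pct in shares_percent.items():
    print(f"{tld}: about {estimated_sites(pct):,} sites")
```

Under these assumptions, the 0.1% share held by .mobi works out to roughly 1 million websites.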

Moreover, if the findings of Halvorson et al. in their investigation of the .xxx sTLD can be generalized to other sTLDs, sTLDs may have less impact for the searcher community than one might otherwise think [2]. The .xxx sTLD, as well as .sex and .sexy, was created to offer specific web space for the adult entertainment industry. More than 80% of registrations on the sTLD are for “parked sites” and fail to resolve. Of all .xxx registrations, more than 92% are for “defensive purposes” and do not represent different content.

Analyzing the .biz gTLD, a TLD to compete with and provide additional TLD space for commercial entities, Halvorson and another set of coauthors found significant overlap and redundancy with the more venerable .com gTLD [1]. Most .biz registrations were of a defensive nature to protect domain names registered in .com and other TLDs.

TEMPEST IN A TEAPOT

Does TLD proliferation matter? Is it a tempest in a teapot or meaningful growth? Although it may currently not matter much to information professionals, it could require searchers to look more broadly, incorporating additional TLDs into a search strategy. This proliferation of TLDs does matter for the owners of existing websites and for the holders of a myriad of copyrights and trademarks.

To conclude that sTLDs and ccTLDs of convenience may not have a significant impact for the searcher community today does not mean they will not be important later. Just as website owners must engage in defensive registrations of sTLDs and ccTLDs, web searchers would be wise to be equally defensive when delving into the web.


References

[1] Halvorson, Tristan, Janos Szurdi, Gregor Maier, Mark Felegyhazi, et al., “The BIZ Top-Level Domain: Ten Years Later”; icir.org/vern/papers/dot-biz.pam12.pdf.

[2] Halvorson, Tristan, Kirill Levchenko, Stefan Savage, and Geoffrey M. Voelker, “XXXtortion? Inferring Registration Intent in the .XXX TLD,” International World Wide Web Conference Committee (IW3C2), Seoul, South Korea, April 7–11, 2014; dx.doi.org/10.1145/2566486.2567995.

[3] Internet Live Stats; internetlivestats.com/total-number-of-websites.

[4] Koehler, Wallace, “Unraveling the Issues, Actors, and Alphabet Soup of the Great Domain Name Debates,” Searcher, Vol. 7, No. 5, May 1999, pp. 16, 18, 20–26.

[5] Koehler, Wallace, “I Think ICANN: Climbing the Internet Regulation Mountain,” Searcher, Vol. 8, No. 3, March 2000, pp. 49–53.

[6] Koehler, Wallace, “ICANN and the New ‘Magnificent Seven,’” Searcher, Vol. 9, No. 2, February 2001, pp. 56–58; infotoday.com/searcher/feb01/koehler.htm.

[7] Koehler, Wallace, “Dot-Lib for Libraries—Can It Happen? Ask ICANN,” Searcher, Vol. 9, No. 4, April 2001, pp. 66–67; infotoday.com/searcher/apr01/koehler.htm.

[8] Koehler, Wallace, “A Call to Action: What Every Searcher Should Know—And Do—About Domain Names, Standards, and Metadata,” Searcher, Vol. 10, No. 9, October 2002, pp. 22–27.

[9] Koehler, Wallace, “New Domain Name Rules and Perhaps Dot-Lib for Libraries—Redux,” July 11, 2011; newsbreaks.infotoday.com/NewsBreaks/New-Domain-Name-Rules-and-Perhaps-DotLib-for-Libraries-Redux-76516.asp

[10] Notess, Greg R., On the Net: “Internationalization and Expansion of Web Addresses,” ONLINE, Vol. 35, No. 6, November/December 2011, pp. 44–46.

[11] Notess, Greg R., On the Net: “The Top-Level Domain Game,” ONLINE, Vol. 26, No. 1, January/February 2002, pp. 63–65; infotoday.com/online/jan02/OnTheNet.htm.



Wallace Koehler is professor emeritus at Valdosta State University.

 
