Google vs. Web Standards – Part 1

Posted: February 26, 2006

According to Google’s Company Overview:

“Google’s mission is to organize the world’s information and make it universally accessible and useful.”

Google is arguably the most widely used search engine ever. According to Nielsen NetRatings figures published by Search Engine Watch, Google accounted for 46.3% of search engine traffic in November of 2005, and it seems safe to assume that this figure is growing by the day. One of the reasons Google has become such a big player in the search engine market is that it really takes into consideration what each Web page has to offer. It does this using advanced computations that take into account more than just the content of the page: Google also considers the structure of each document along with the markup used. It looks at the big picture and works very hard to determine the validity and reliability of a particular Web site. The link structure of the Web itself is taken into consideration, and each page is compared to those it links to and those that link to it. This approach to searching the Web has proven enormously successful and has brought Google to the forefront of search.

All of this is great for the end user, who is provided with an extraordinary service free of charge, for the most part. But what about those with a disability that makes searching the Web a completely different experience than it is for the majority? The accessibility of Google should be closely analyzed and quickly adjusted so that the company lives up to its mission statement. A corporation as gigantic and widely used as Google should be at the forefront of accessibility and standardization, when in reality it is quite far from it.

Google’s Home Page

Google’s home page is quite far from ideal when it comes to the semantics of the Web. It is based on a tabular layout with bloated code. Rumor has it that the page looks as it does because the people behind Google didn’t know much HTML when Google first came to be. The question now is, why has the code not been updated? Besides making Google more accessible, updating the code would let them benefit greatly from bandwidth cost savings.

As of February 2003, Google experienced 250 million searches per day. Their home page weighs in at 3,318 bytes; removing the tabular layout and bloated code would reduce the page to somewhere around 2,394 bytes. Do not forget that a stylesheet would need to be written and would also add to the total bandwidth transferred for the visit. What is important to remember is that the stylesheet is cached and, more often than not, only downloaded once. Multiply the ~924 bytes saved by 250 million visits per day and you get roughly 231 GB of bandwidth saved every day, if each search began with a home page load. This is only taking Google’s home page into consideration. The same idea applies to a search results page, which, for example, has an average size of 19,830 bytes. Part 2 of Google vs. Web Standards will look more closely into this issue.
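The arithmetic behind that estimate can be sketched in a few lines of Python. The byte counts and the search volume are the figures quoted above; treating every search as exactly one home page load, and ignoring the one-time cost of the cached stylesheet, are simplifying assumptions, and the function name is purely illustrative.

```python
# Figures quoted in the article; one home page load per search is an assumption.
CURRENT_HOME_PAGE_BYTES = 3_318
REBUILT_HOME_PAGE_BYTES = 2_394   # the article's estimate without the table markup
SEARCHES_PER_DAY = 250_000_000    # February 2003 figure


def daily_bytes_saved(current_bytes: int, rebuilt_bytes: int, loads_per_day: int) -> int:
    """Bytes saved per day if every load serves the smaller page."""
    return (current_bytes - rebuilt_bytes) * loads_per_day


saved = daily_bytes_saved(CURRENT_HOME_PAGE_BYTES, REBUILT_HOME_PAGE_BYTES, SEARCHES_PER_DAY)
# Report the raw byte count and the same value in decimal gigabytes.
print(f"{saved:,} bytes per day (about {saved / 10**9:.0f} GB)")
```

Swapping in the results page size would only need different arguments, although the rebuilt size of that page is not estimated here.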

Google’s Validity According to the W3C

If you take a look at Google’s home page validity results using the W3C’s Validator, you’ll find 48 errors on that small page alone. This number skyrockets to 239 errors when looking at a results page for “Web Standards”. This is a direct result of Google’s negligence when it comes to Web Standards. Google’s Web pages do not even declare a DOCTYPE, which is the basis for determining validity. As a result, the validator can only attempt validation against an HTML 4.01 Transitional DOCTYPE. Neglecting to include a DOCTYPE can also wreak havoc on accessibility. Looking more closely at the validation results from the W3C, the code looks as though it could have been written in Microsoft’s FrontPage 2003 or something of the like. A tabular layout with bloated, non-semantic markup really works against what Google is trying to offer. There is meaningless CSS scattered throughout the document and it is a real disaster to view the source.
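Checking for a missing DOCTYPE is simple enough to automate. The following is a minimal Python sketch, not how the W3C Validator itself works: the function name and the regular expression are assumptions made for illustration, and only the HTML 4.01 Transitional declaration string is taken from the HTML 4.01 specification.

```python
import re

# A DOCTYPE must come before any markup; leading whitespace and comments are allowed.
_DOCTYPE = re.compile(
    r"\s*(?:<!--(?:(?!-->).)*-->\s*)*<!DOCTYPE\s+html\b",
    re.IGNORECASE | re.DOTALL,
)

# The declaration for HTML 4.01 Transitional, as given in the HTML 4.01 specification.
HTML401_TRANSITIONAL = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd">'
)


def declares_doctype(html: str) -> bool:
    """Return True if the document starts with a DOCTYPE declaration."""
    return _DOCTYPE.match(html) is not None


# A page that opens straight into <html>, as described above, fails the check.
print(declares_doctype("<html><head><title>Google</title></head></html>"))
print(declares_doctype(HTML401_TRANSITIONAL + "<html></html>"))
```

A check like this says nothing about whether the rest of the page is valid; it only catches the missing declaration.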

You could point out that the actual search results are no longer tabular, which is a vast improvement on how search results were displayed years ago. This is a positive step for Google, but they have a long journey ahead of them if they wish to conform to standards and have a semantic site equally accessible to everyone.

Google’s CSS

Along with Google’s page markup, their CSS is not much to admire either. Meaningless identifiers are the major culprit here. If you look at Google’s CSS you’ll quickly notice that it is all embedded, which should not be the case to begin with. Upon further inspection you find that you honestly have no idea which parts of the document this code affects until you look at the markup and the styles side by side or compare them back and forth. Why would a corporation with such a high level of intelligence code the easiest part of their service in such a way?

Embedded Styles

All of Google’s CSS is embedded. You can view the included CSS below:

Home Page:

Because Google’s home page is based on a tabular layout and bloated code, there isn’t too much CSS to embed, but for one reason or another they threw this in:

<style><!--
body,td,a,p,.h{font-family:arial,sans-serif;}
.h{font-size: 20px;}
.q{color:#0000cc;}
//-->
</style>

Search Results:

The search results use a bit more CSS, but as you can see, it isn’t very clean and it is quite hard to determine which styles are controlling specific parts of the document.

<style><!--
body,td,div,.p,a{font-family:arial,sans-serif }
div,td{color:#000}
.f{color:#6f6f6f}
.fl:link{color:#77c}
a:link,.w,a.w:link,.w a:link{color:#00c}
a:visited,.fl:visited{color:#551a8b}
a:active,.fl:active{color:#f00}
.t a:link,.t a:active,.t a:visited,.t{color:#000}
.t{background-color:#e5ecf9}
.k{background-color:#36c}
.j{width:34em}
.h{color:#36c}
.i,.i:link{color:#a90a08}
.a,.a:link{color:#008000}
.z{display:none}
div.n{margin-top:1ex}
.n a{font-size:10pt;color:#000}
.n .i{font-size:10pt;font-weight:bold}
.q:visited,.q:link,.q:active,.q{color:#00c;}
.b{font-size:12pt;color:#00c;font-weight:bold}
.ch{cursor:pointer;cursor:hand}
.sem{display:inline;margin:0;font-size:100%;font-weight:inherit}
.e{margin-top:.75em;margin-bottom:.75em}
.g{margin-top:1em;margin-bottom:1em}
.sm{display:block;margin-top:0px;margin-bottom:0px;margin-left:40px}
-->
</style>

Even as a beginner using CSS you should take a step back when faced with these styles. Creating meaningful identifiers is standard practice, and Google should really take an hour out of their day to update what they have going on. When examining this code it is virtually impossible to determine what goes where and why it is there.
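Spotting identifiers like these does not have to be done by eye. As a rough sketch, the Python below lists every class name of one or two characters in a stylesheet; the two-character threshold, the function name, and the regular expressions are assumptions chosen for illustration, and the sample input is the home page CSS shown earlier.

```python
import re

# Class selectors start with a dot followed by a letter or underscore.
_CLASS_SELECTOR = re.compile(r"\.([A-Za-z_][\w-]*)")
# Declaration blocks are dropped so that property values are never scanned.
_DECLARATION_BLOCK = re.compile(r"\{[^}]*\}")

HOME_PAGE_CSS = """
body,td,a,p,.h{font-family:arial,sans-serif;}
.h{font-size: 20px;}
.q{color:#0000cc;}
"""


def cryptic_class_names(css: str, max_length: int = 2) -> list[str]:
    """Return the sorted, de-duplicated class names no longer than max_length."""
    selectors_only = _DECLARATION_BLOCK.sub("", css)
    names = _CLASS_SELECTOR.findall(selectors_only)
    return sorted({name for name in names if len(name) <= max_length})


print(cryptic_class_names(HOME_PAGE_CSS))
```

Length is only a crude stand-in for meaning, so a short name is a hint to look closer rather than proof of a problem.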

Google’s Page Creator

Google has recently opened up yet another new service, titled Page Creator. This service allows Gmail account holders to have their own Web page using Google’s proprietary WYSIWYG editor and Google Web space. It is a given that any neophyte to Web page creation should be shown a WYSIWYG editor in this sort of situation. The editor would prove quite useful to someone new to Web site creation, who could have a project up and running in minutes. The major problem with this editor, as with most WYSIWYG editors, is the output. While Google has tried to include CSS as much as possible, the code is far from standardized, and is also quite bloated. Naturally a service like this won’t be used to create many Web pages of great worth on the Internet, but it shows that while Google may be working to embrace CSS in a more effective manner, they have a lot of work to do.

A Major Problem with Page Creator

Yes, Page Creator uses CSS quite extensively, but if you take a look at a site developed by Page Creator, you’ll notice a big comment block in the CSS stating:

-- -- -- -- -- -- --
Browser Fixes
-- -- -- -- -- -- --

This file uses CSS filtering methods to fix various
layout bugs.

Each of the following three imported files is a
separate, browser-specific CSS file that keeps all
hacks out of the main style sheet.

Over time, as supporting these browsers no longer
remains a priority, cleaning up the hacks is as
easy as deleting the @import statement below, or
simply no longer linking this file from the HTML.

This is a disastrous move. To begin with, if someone were using Page Creator to develop their page, how on Earth would they know to check for browser compatibility and then think to go back and remove these CSS hacks once ‘the issue has been resolved’? This sets Page Creator back a big step, and hopefully this forced inclusion of CSS hacks is promptly removed. CSS hacks were a big hit for a short time when CSS was really making moves into the mainstream. Since then, CSS hacks have been deemed an unintelligent development tactic. There are ways around using CSS hacks, and they should be given major priority.

As for the rest of Page Creator’s comments, they’re generally quite good once you remove all of the CSS hack mumbo jumbo. Section headers are commented and other details are given as well. There seems to be a lot of redundant CSS in each ‘template’ they offer, but for the most part it is organized well.

Keep an eye out for a future article taking an in-depth look at Google’s Page Creator once it has become completely open to the public. Since the debut, registration has been on and off according to demand.

Where to go from here?

That is a tough question to answer. Google is at the forefront of the Web, and one of the fastest growing companies in history. What can anyone do about how they operate? I would like to think that a company such as Google would actually listen to what their users have to say. They seem to operate differently than most huge corporations, even if it may seem as though they’re taking over the world one step at a time. Those who feel strongly about the issue should get together and really devote some time to educating Google about their pitfalls, and work to correct the situation. Google is an ever-expanding institution. Why should they be closed to suggestions from the people who made them who they are?

Comments

  1. Google is definitely one of the largest search engines out there. I am a firm believer that great sites should promote great sites. A search engine is a tool for listing the best and most powerful sites first. Google should have their standards set really high to help give back and educate their users on quality websites. The task of updating their homepage would take so little effort that I’m shocked they didn’t do it when their employee count went above 10.

  2. I agree and would absolutely love to see Google put themselves to the test and raise their standards in everything up to that “Google level of efficiency.”

    Anyways, now that you’ve done a successful job in pointing out some of Google’s flaws, I would absolutely love to see some of your prospective solutions in Part 2 of this article.

  3. Thanks for taking the time to read through this article.

    @Michael — I absolutely hope to come up with some really plausible solutions to help with the situation discussed in this article. It is my hope that the second part of the article will expose some effective ways to improve an already exceptional service in hopes of making it that much more accessible, usable, and efficient.

  4. I hope that when Part 2 of this article is finished, we can link it to Google personnel. I would be really excited if Monday By Noon would help Google change the style they use and promote healthy code.

  5. Wow, after hitting view source on their homepage, you can already see something strange before even looking at the code. 17 lines of code, with a huge horizontal scrollbar = scary stuff. And the amount of validation errors on the homepage pales in comparison to the main news page. Over 1300 errors there –
    http://validator.w3.org/check?verbose=1&uri=http%3A//news.google.ca/nwshp%3Fhl%3Den%26tab%3Dgn%26q%3D

    The validation error page has been loading for many minutes now. I have never seen anything like that before.

    I find it quite ridiculous that an organization such as Google has not done something about this already. Maybe it is something that should be brought to the attention of a larger body such as WaSP, the Web Standards Project http://www.webstandards.org/ .

    An article I found bookmarked relates pretty well to this discussion. Almost three years old, although it seems to be just as relevant today as it was in ’03. Maybe after this series is over, you can start on the Yahoo version.

    http://www.stopdesign.com/log/2003/04/09/yahoo_rebuilt.html

  6. @astridas: I agree, I think that’s a great idea. Hopefully Part 2 will be more of a conversation piece because I’d love to get as much feedback regarding possible solutions for Google as possible.

    @Mike: I too found it surprising to see that such a more or less simple layout was coded in such a way. I’m not trying to claim superiority by any means, just bringing up what I’ve noticed, you know? I’ve updated the stylesheet for the site so that long URLs are basically truncated, and I’m going to look into a more effective solution for long URLs such as that.

    Thanks to everyone for the great posts — keep them coming!

  7. Just wanted to mention: Google Page Creator is ok until you change the font of text… it puts FONT tags in the markup. The second I saw this I was done with Google Page Creator. I sent them an e-mail requesting that they please NOT use the XHTML doctype if they don’t plan on making the code valid and I never received a response.

  8. @Christian: Yes, for the most part Page Creator is definitely a step forward in the WYSIWYG method of creating pages. It does do a pretty good job for the most part, but you are correct: it is the simple things like throwing in font tags that bring down its value. Thanks for taking the time to comment!

  9. Dude, I think you’re being a bit of a [****] here. Yes, Google isn’t fully standards-compliant, but this is probably due to the fact that about 10% of the browsers on the internet *are*. Google is a public service, and as such it does need to be accessible to the blind, but also to users on IE4 and IE5.

    Accessibility is *not* synonymous with CSS. You can kludge up a site pretty well with HTML/CSS, just as you can with tables. However, load up Lynx and check out Google if you think it’s really that bad. In pure text-only mode (which screen readers see), the site works just fine.

  10. @Paul: Firstly, no need for name calling. I had stated in the article that it is simply an observation of mine and nothing more. I’m not trying to claim superiority here or anything of the like. Your comment as a whole seems as though you have a strong opinion and knowledge of the subject, but using such language degrades the validity from the start, which is unfortunate.

    I don’t feel that accessibility has much to do directly with CSS at all. I understand that standards compliance is not quite common ground yet, but that is where things are going. As far as 10% of browsers being standards compliant, I believe that is far off. The most widely used browser may not be standards compliant, but its next major release seems to be working towards that.

    It is great that Google seems to work fine using Lynx, but if you were a first time blind visitor to Google, not knowing what it was, would the purpose of the site be obvious? While it may seem to work fine, it still fails multiple times when validating its accessibility.

    Please see the following link:
    Cynthia Says Report on Google.com

    My point of this article once again was not to proclaim superiority by any means. I’m glad you felt compelled to comment on the article, but in future instances please refrain from such language. Thanks for posting.

  11. […] It’s a shame that Google stuck with their usual practices by ignoring standard and valid markup. It seems a bit hypocritical to me that a search service that concentrates on accessibility features is poorly marked up. At least the search page isn’t using a tabular layout, but font tags? Google can do better than that. […]
