Google vs. Web Standards – Part 1

According to Google’s Company Overview:

“Google’s mission is to organize the world’s information and make it universally accessible and useful.”

Google is arguably the most widely used search engine ever. According to Nielsen NetRatings figures published by SearchEngine Watch, Google accounted for 46.3% of search engine traffic in November 2005, and it seems safe to assume that the figure is still growing. One of the reasons Google has become such a big player in the search engine market is that it really takes into consideration what each Web page has to offer. It does this with sophisticated computations that take into account more than just the content of the page. Google considers the structure of each document along with the markup used. It looks at the big picture and works very hard to determine the validity and reliability of a particular Web site. The link structure of the Web itself is taken into consideration: each page is weighed against the pages it links to and the pages that link to it. This approach to searching the Web has been enormously successful and has brought Google to the front of the search market.

All of this is great for the end user, who is provided with an extraordinary service free of charge, for the most part. But what about those with a disability that makes searching the Web a completely different experience than it is for the majority? The accessibility of Google should be closely analyzed and quickly adjusted so that the service reflects the company's mission statement. A corporation as gigantic and widely used as Google should be at the forefront of accessibility and standardization, when in reality it is quite far from it.

Google’s Home Page

Google’s home page is quite far from ideal when it comes to the semantics of the Web. It is based on a tabular layout with bloated code. Rumor has it that the page looks as it does because the people behind Google didn’t know HTML when Google first came to be. The question now is, why has the code not been updated? Besides making Google more accessible, updating the code would bring substantial savings in bandwidth costs.

As of February 2003, Google handled 250 million searches per day. Their home page weighs in at 3,318 bytes; removing the tabular layout and bloated code would reduce the page to somewhere around 2,394 bytes. Do not forget that a stylesheet would need to be written and would also add to the total bandwidth transferred for the visit. What is important to remember is that the stylesheet is cached and, more often than not, only downloaded once. Multiply the ~924 bytes saved by 250 million visits per day (treating each search as one home page visit) and the saving comes to roughly 231 GB of bandwidth per day (924 × 250,000,000 = 231,000,000,000 bytes). This is only taking Google’s home page into consideration. The same idea can be applied to a search results page, which for example will have an average size of 19,830 bytes. Part 2 of Google vs. Web Standards will look more closely into this issue.

Google’s Validity According to the W3C

If you run Google’s home page through the W3C’s Validator, you’ll find 48 errors on that small page alone. This number skyrockets to 239 errors when looking at a results page for “Web Standards”. This is a direct result of Google’s neglect of Web Standards. Google’s Web pages do not even declare a DOCTYPE, which is the basis for determining validity. What results is an attempted validation against an HTML 4.01 Transitional DOCTYPE. Neglecting to include a DOCTYPE can wreak havoc on accessibility. Looking more closely at the validation results from the W3C, the code looks as though it could have been written in Microsoft’s FrontPage 2003 or something of the like. A tabular layout with bloated, non-semantic markup really works against what Google is trying to offer. There is meaningless CSS scattered throughout the document, and the source is a real disaster to view.
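For comparison, here is a minimal sketch of a search page that declares a DOCTYPE and uses no layout tables. It is written against HTML 4.01 Strict purely as an illustration; the heading text, form action, and field names are assumptions, not Google's actual markup.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Search</title>
</head>
<body>
<h1>Search</h1>
<!-- Hypothetical form: the action and field names are placeholders -->
<form action="/search" method="get">
<p>
<label for="q">Search the Web</label>
<input type="text" name="q" id="q">
<input type="submit" value="Search">
</p>
</form>
</body>
</html>
```

With the DOCTYPE in place, the validator no longer has to guess which document type to check against, and the label gives assistive technology a name for the search field.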

It is fair to point out that the actual search results are no longer tabular, which is a vast improvement on how search results were displayed years ago. This is a positive step for Google, but they have a long journey ahead of them if they wish to conform to standards and have a semantic site equally accessible to everyone.

Google’s CSS

Along with Google’s page markup, their CSS is not much to admire either. Meaningless identifiers are the major culprit here. If you look at Google’s CSS you’ll quickly notice that it is all embedded, which should not be the case to begin with. Upon further inspection, you honestly have no idea which parts of the document the code is affecting until you look at the styles and the markup side by side or compare them back and forth. Why would a corporation with such a high level of intelligence code the easiest part of their service in such a way?

Embedded Styles

All of Google’s CSS is embedded. You can view the included CSS below:

Home Page:

Because Google’s home page is based on a tabular layout and bloated code, there isn’t too much CSS to embed, but for one reason or another they threw this in:

<style><!--
body,td,a,p,.h{font-family:arial,sans-serif;}
.h{font-size: 20px;}
.q{color:#0000cc;}
//-->
</style>
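
As a rough sketch of the alternative, the same three rules could live in an external stylesheet that the browser caches after the first visit. The file name home.css is hypothetical, and the selectors are left exactly as Google wrote them:

```html
<!-- In the page head; "home.css" is a hypothetical file name -->
<link rel="stylesheet" type="text/css" href="home.css">
```

```css
/* home.css: the embedded rules above, moved out of the page unchanged */
body, td, a, p, .h { font-family: arial, sans-serif; }
.h { font-size: 20px; }
.q { color: #0000cc; }
```

The page itself then carries only the one-line link, and the rules are downloaded once rather than with every request.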

Search Results:

The search results use a bit more CSS, but as you can see, it isn’t very clean and it is quite hard to determine which styles are controlling specific parts of the document.

<style><!--
body,td,div,.p,a{font-family:arial,sans-serif }
div,td{color:#000}
.f{color:#6f6f6f}
.fl:link{color:#77c}
a:link,.w,a.w:link,.w a:link{color:#00c}
a:visited,.fl:visited{color:#551a8b}
a:active,.fl:active{color:#f00}
.t a:link,.t a:active,.t a:visited,.t{color:#000}
.t{background-color:#e5ecf9}
.k{background-color:#36c}
.j{width:34em}
.h{color:#36c}
.i,.i:link{color:#a90a08}
.a,.a:link{color:#008000}
.z{display:none}
div.n{margin-top:1ex}
.n a{font-size:10pt;color:#000}
.n .i{font-size:10pt;font-weight:bold}
.q:visited,.q:link,.q:active,.q{color:#00c;}
.b{font-size:12pt;color:#00c;font-weight:bold}
.ch{cursor:pointer;cursor:hand}
.sem{display:inline;margin:0;font-size:100%;font-weight:inherit}
.e{margin-top:.75em;margin-bottom:.75em}
.g{margin-top:1em;margin-bottom:1em}
.sm{display:block;margin-top:0px;margin-bottom:0px;margin-left:40px}
-->
</style>

Even as a beginner using CSS, you should take a step back when faced with these styles. Creating meaningful identifiers is standard practice, and Google should really take an hour out of their day to update what they have going on. When examining this code it is virtually impossible to determine what goes where and why it is there.
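
To illustrate what meaningful identifiers might look like, here is a sketch that renames a handful of the rules above. The declarations are copied from Google's stylesheet, but every new class name and every stated role is a guess made for the sake of the example, since the original names give no indication of what they style.

```css
/* Hypothetical renaming: the roles below are assumptions, not Google's own */

/* was .t: light blue background, assumed to be the results summary bar */
.results-summary { background-color: #e5ecf9; color: #000; }

/* was .g: vertical spacing, assumed to wrap a single search result */
.result { margin-top: 1em; margin-bottom: 1em; }

/* was .a: green text, assumed to be the URL shown with each result */
.result-url { color: #008000; }

/* was .f: gray text, assumed to be secondary details */
.result-meta { color: #6f6f6f; }

/* was .z: content that is not displayed */
.hidden { display: none; }
```

The longer names cost a few extra bytes, which is one more argument for serving the stylesheet as a separate, cacheable file.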

Google’s Page Creator

Google has recently opened up yet another new service, titled Page Creator. This service allows Gmail account holders to have their own Web page using Google’s proprietary WYSIWYG editor and Google Web space. It is a given that any neophyte to Web page creation should be shown a WYSIWYG editor in this sort of situation. The editor would prove quite useful to someone new to Web site creation and would have their project up and running in minutes. The major problem with this editor, as with most WYSIWYG editors, is the output. While Google has tried to include CSS as much as possible, the code is far from standardized and is also quite bloated. Naturally a service like this won’t be used to create many Web pages of lasting worth on the Internet, but it shows that while Google may be working to embrace CSS in a more effective manner, they have a lot of work to do.

A Major Problem with Page Creator

Yes, Page Creator uses CSS quite extensively, but if you take a look at a site developed by Page Creator, you’ll notice a big comment block in the CSS stating:

-- -- -- -- -- -- --
Browser Fixes
-- -- -- -- -- -- --

This file uses CSS filtering methods to fix various
layout bugs.

Each of the following three imported files is a
separate, browser-specific CSS file that keeps all
hacks out of the main style sheet.

Over time, as supporting these browsers no longer
remains a priority, cleaning up the hacks is as
easy as deleting the @import statement below, or
simply no longer linking this file from the HTML.

This is a disastrous move. To begin with, if someone were using Page Creator to develop their page, how on Earth would they know to check for browser compatibility and then think to go back and remove these CSS hacks once ‘the issue has been resolved’? This sets Page Creator a big step back, and hopefully this forced inclusion of CSS hacks is promptly removed. CSS hacks were a big hit for a short time when CSS was really making moves into the mainstream. Since then, CSS hacks have come to be seen as an unintelligent development tactic. There are ways around using CSS hacks, and those should be given major priority.
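
One commonly cited way around parser-exploiting hacks, at least for Internet Explorer, is a conditional comment that loads a separate fix file only in the browser that needs it. The sketch below is an illustration of that technique, not something Page Creator is known to produce; both file names are hypothetical.

```html
<link rel="stylesheet" type="text/css" href="main.css">
<!-- Only Internet Explorer 6 and earlier act on the block below;
     other browsers treat it as an ordinary comment -->
<!--[if lte IE 6]>
<link rel="stylesheet" type="text/css" href="ie6-fixes.css">
<![endif]-->
```

The main stylesheet stays free of hacks, and the fix file is targeted by the browser's own version check rather than by a parsing bug.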

As for the rest of Page Creator’s comments, they’re generally quite good once you remove all of the CSS hack mumbo jumbo. Section headers are commented and other details are given as well. There seems to be a lot of redundant CSS in each ‘template’ they offer, but for the most part it is organized well.

Keep an eye out for a future article taking an in-depth look at Google’s Page Creator once it has become completely open to the public. Since the debut, registration has been on and off according to demand.

Where to go from here?

That is a tough question to answer. Google is at the forefront of the Web, and one of the fastest growing companies in history. What can anyone do about how they operate? I would like to think that a company such as Google would actually listen to what their users have to say. They seem to operate differently than most huge corporations, even if it may seem as though they’re taking over the world one step at a time. Those who feel strongly about the issue should get together and really devote some time to educating Google about these pitfalls, and work to correct the situation. Google is an ever-expanding institution. Why should they be closed to suggestions from the people who made them who they are?