As web agents (including browsers) become more diverse, there is an increasing need to distinguish between their types. The User-Agent header can be used for this task, but it requires the server to know in advance every possible agent and its type. This is not possible, as both the diversity and the quantity of agents are growing too quickly for any single registry to track.
The HTTP specification defines the Accept header as the list of media types an agent is willing to receive, and this can be used to determine the type of agent. For example:
• HTML browsers should include "text/html" within the Accept header,
• XHTML browsers include "application/xhtml+xml",
• RDF browsers include "application/rdf+xml",
• XSLT agents include "application/xml",
• PDF agents include "application/pdf",
• Office suites include "application/x-ms-application" or "application/vnd.oasis.opendocument".
This allows the server to better redirect the agent to an appropriate resource.
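As a rough illustration of the idea above, here is a minimal Python sketch of how a server might map the media types in an Accept header to agent types. The function name `agent_types` and the `AGENT_TYPES` table are hypothetical names made up for this example, and the table only covers the exact media types listed above; a real service would need its own mapping.

```python
# Hypothetical lookup table built from the media types listed above.
AGENT_TYPES = {
    "text/html": "HTML browser",
    "application/xhtml+xml": "XHTML browser",
    "application/rdf+xml": "RDF browser",
    "application/xml": "XSLT agent",
    "application/pdf": "PDF agent",
    "application/x-ms-application": "office suite",
}


def agent_types(accept_header):
    """Return the agent types implied by an Accept header, in header order."""
    kinds = []
    for item in accept_header.split(","):
        # Drop parameters such as ";q=0.9" and normalise case.
        media_type = item.split(";")[0].strip().lower()
        kind = AGENT_TYPES.get(media_type)
        if kind and kind not in kinds:
            kinds.append(kind)
    return kinds


if __name__ == "__main__":
    print(agent_types("text/html, application/rdf+xml;q=0.5"))
    print(agent_types("*/*"))  # a bare wildcard reveals nothing about the agent
```

An empty result means the header gave no usable hint, so the server would have to fall back to some default resource.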
Obviously, if a service serves only HTML browsers, it does not need to know the type of agent, as was the case in the Web 1.0 days when everything on the Web was HTML. However, as HTTP becomes a more popular protocol for non-HTML communication, distinguishing between types of agents is becoming important.
This works very well in theory, but because the Web was built with only HTML browsers in mind, most browsers don't properly implement the HTTP specification (because they don't have to). Even worse, most agents that are not HTML browsers either don't include an Accept header at all or send */*, which says nothing about the type of agent. Below are some of the default Accept headers from popular user agents on the web.
FF3.5 is an HTML and XHTML browser first, XML/XSLT agent second
IE8 is a media viewer (apparently)
image/gif, image/jpeg, image/pjpeg, image/pjpeg, application/x-shockwave-flash, */*
IE8+office is a media viewer and office suite
image/gif, image/jpeg, image/pjpeg, application/x-ms-application,
Chrome3 is an XHTML and XML/XSLT agent first, HTML browser second, and text viewer third.
Safari3 is an XHTML and XML/XSLT agent first, HTML browser second, and text viewer third.
Opera10 is an HTML and XHTML browser first, XML/XSLT agent second.
text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1
The MSN bot is an HTML browser, text viewer, XML client and application archiver.
text/html, text/plain, text/xml, application/*, Model/vnd.dwf, drawing/x-dwf
Google search bot is a jack of all agents, master of none
Yahoo search bot is a jack of all agents, master of none
AppleSyndication is a jack of all agents, master of none
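The "first, second, third" rankings above come from the q (quality) values in each header, where a media type with no q parameter counts as q=1. The following Python sketch shows one way to derive such a ranking. The name `parse_accept` is hypothetical, and treating a malformed q value as 0 is an assumption of this example, not something the HTTP specification requires.

```python
def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q.

    Entries with equal q keep their original header order.
    """
    entries = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue  # skip empty items, e.g. after a trailing comma
        q = 1.0  # a missing q parameter means q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: rank malformed weights last
        entries.append((media_type, q))
    # sorted() is stable, so ties keep header order.
    return sorted(entries, key=lambda entry: -entry[1])


if __name__ == "__main__":
    opera10 = ("text/html, application/xml;q=0.9, application/xhtml+xml, "
               "image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1")
    for media_type, q in parse_accept(opera10):
        print(q, media_type)
```

Run against the Opera10 header, this puts text/html and application/xhtml+xml in the top group, application/xml next, and the */* wildcard last, which matches the description given for Opera10 above.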