By Eric Farrar on October 27th, 2008
I am back from a week in sunny San Jose at the AjaxWorld 2008 conference. The conference is a bit of a hodge-podge of Enterprise RIA, and Web 2.0, with a dash of iPhone development for flavour. I have included some of my thoughts after leaving the conference, as well as some contrasts to last year’s conference:
No mention of Mash-ups
Last year, several of the talks (especially the keynotes) focused on enterprise mash-ups. By contrast, this year I hardly heard the word uttered (despite it being one of the themes of the conference). I must admit that I was quite interested in the concept of mash-ups after last year’s conference, but have since become more skeptical. I think that plain old Excel will continue to reign as the great mash-up platform for some years to come.
Struggle between server-centric and client-centric architectures
Among those who gave talks arguing for a platform solution to web development, there was a very clear divide between those who favoured a client-centric approach and those who favoured a server-centric approach. In reality, I would say that neither is clearly superior to the other in every case. As I have oft heard Glenn say (quoting a professor of his), “There are no right answers, only tradeoffs.” In a future post, I will go into a bit more detail on what I think those tradeoffs are.
Choices abound at every turn
There is no shortage of choices in RIA development, and “To plugin, or not to plugin” will be among the first. In the plugin arena there is a host of relatively new heavyweight contenders including Microsoft Silverlight, Adobe AIR, and JavaFX (although it was hardly mentioned at the conference). In the non-plugin camp, many speakers talked about the proliferation of JavaScript libraries available (such as jQuery, Prototype, and script.aculo.us). Similar to my previous point, there are advantages and tradeoffs associated with each of them, and in my opinion, there was no obvious choice among them.
The database is being increasingly ignored and abstracted
I was a little surprised at how little mention databases received in the talks. Even the keynotes by Microsoft and Oracle made only passing references to the database. In reality, for most websites (especially enterprise ones) the database will be a very central piece of the application. Coupled with this is the trend towards object-relational mappers to wrap the database. Although from a database purist point of view ORMs seem to discard years of database research and development, there are very compelling reasons to use them (especially in the web space). Again, I will be going into more detail on this in a future post.
Posted in: Conferences
By Eric Farrar on October 10th, 2008
While browsing around the internet, your browser makes many requests for many different types of content. The content that is returned may be an image, a sound file, a SWF file, or hundreds of other things. The browser relies on the web server that’s returning the data to tell it what it’s receiving, and how it should be interpreted and displayed.
This meta information is called the document’s MIME type (Multipurpose Internet Mail Extensions). The standard was first developed for e-mail (hence the word ‘Mail’ in the acronym), but has since been extended to HTTP. With browsers, however, it is often called the Content Type. This is because the Content-Type HTTP header is the vehicle used to convey the MIME type to the browser. In Firefox, if you right-click on this page and choose ‘View Page Info’, you will see that the Content-Type of this page is text/html.
This explains why in our first example, we had to specifically set the Content-Type to let the browser know what we were sending it. This may seem odd if you have worked with other web servers such as Apache. You may have used Apache to host lots of types of content and never had to set any MIME types. This is because Apache is doing all the MIME work for you. If you look in your Apache configuration directory, you will find a file called mime.types that contains entries that look like this:
image/bmp bmp
image/gif gif
image/jpeg jpeg jpg jpe
...
This is Apache’s master list to convert a URL’s extension into a MIME type. If you want to change a mapping you can either edit this file directly, or use the mod_mime module. All Apache is doing under the covers is extracting the extension, and looking it up in mime.types. The MIME table in Microsoft IIS works in a similar way. Looking up things in a table is just what databases do well, so let’s build an automatic Content-Type setter for SQL Anywhere web services.
First, we need a table to hold the extensions dictionary:
CREATE TABLE MIMEType (
    "Extension" CHAR(255) NOT NULL PRIMARY KEY,
    "Type" CHAR(255) NOT NULL
);
Second, we need to make a procedure that extracts the extension from the URL, looks it up in the table, and sets the Content-Type header:
CREATE PROCEDURE AutoSetMIMEType()
BEGIN
    DECLARE loc INTEGER;
    DECLARE ext CHAR(255);
    DECLARE mimetype CHAR(255);

    -- Find the location of the last period in the URI
    SET loc = LOCATE(HTTP_Header('@HttpURI'), '.', -1);

    -- If no period exists, default to 'text/plain'
    IF loc > 0 THEN
        -- Extract the extension and convert to lower case
        SET ext = LOWER(SUBSTR(HTTP_Header('@HttpURI'), loc + 1));
        -- If the extension exists, use the corresponding type
        -- Otherwise default to 'text/plain'
        IF EXISTS(SELECT Extension FROM MIMEType WHERE Extension = ext) THEN
            SELECT Type INTO mimetype FROM MIMEType WHERE Extension = ext;
            CALL sa_set_http_header('Content-Type', mimetype);
        ELSE
            CALL sa_set_http_header('Content-Type', 'text/plain');
        END IF;
    ELSE
        CALL sa_set_http_header('Content-Type', 'text/plain');
    END IF;
END;
Now all that is left to do is add Extension/Type pairs to the table like this:
INSERT INTO MIMEType VALUES('bmp', 'image/bmp');
INSERT INTO MIMEType VALUES('gif', 'image/gif');
INSERT INTO MIMEType VALUES('jpeg', 'image/jpeg');
INSERT INTO MIMEType VALUES('jpg', 'image/jpeg');
INSERT INTO MIMEType VALUES('jpe', 'image/jpeg');
...
I have compiled a list of about 600 common MIME Types (based off mime.types) that you can cut and paste into Interactive SQL. You can get that SQL file here.
To use the procedure, all we have to do is add
CALL AutoSetMIMEType();
somewhere in our web service handler, and the MIME type will be automatically set from the extension, just like it is in Apache and IIS.
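If you want to sanity-check the lookup rules without a database running, the same logic is easy to model in a few lines of Python. This is only a sketch of what the procedure does, not SQL Anywhere code: the dictionary stands in for the MIMEType table and holds just the sample rows shown above, and the function name content_type_for is made up for the illustration.

```python
# Illustrative model of the AutoSetMIMEType lookup; not SQL Anywhere code.
# The dictionary stands in for the MIMEType table and mirrors the sample rows.
MIME_TYPES = {
    "bmp": "image/bmp",
    "gif": "image/gif",
    "jpeg": "image/jpeg",
    "jpg": "image/jpeg",
    "jpe": "image/jpeg",
}

DEFAULT_TYPE = "text/plain"


def content_type_for(uri: str) -> str:
    """Return the Content-Type for a URI, based on the text after its last period."""
    loc = uri.rfind(".")  # -1 when the URI contains no period
    if loc == -1:
        return DEFAULT_TYPE
    ext = uri[loc + 1:].lower()  # extension, lower-cased as in the procedure
    return MIME_TYPES.get(ext, DEFAULT_TYPE)


if __name__ == "__main__":
    for uri in ("/images/logo.GIF", "/docs/readme", "/archive.xyz"):
        print(uri, "->", content_type_for(uri))
```

Note that, like the procedure, the sketch falls back to text/plain both when there is no period at all and when the extension is not in the table.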
As a side note, there are actually some more sophisticated ways to determine the MIME type. One example is the mod_mime_magic module for Apache, which looks at the first few bytes of the content and makes a guess at what the type is (for example, does it look like a BMP header? an XML header? and so on). While useful, it is recommended that this only be used as a “second line of defense”, and not as the primary strategy.
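To give a feel for what "looking at the first few bytes" means, here is a rough Python sketch of content sniffing used as a fallback. The signature list is a small hand-picked sample of well-known file headers, not the rule set that mod_mime_magic actually ships with, and the function name sniff_type is hypothetical.

```python
from typing import Optional

# A few well-known leading byte sequences. This is a hand-picked sample for
# illustration only, not the rules used by mod_mime_magic.
SIGNATURES = [
    (b"BM", "image/bmp"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"<?xml", "text/xml"),
]


def sniff_type(content: bytes, type_from_extension: Optional[str] = None) -> Optional[str]:
    """Prefer the extension-based type; only guess from the bytes as a fallback."""
    if type_from_extension is not None:
        return type_from_extension
    for magic, mime in SIGNATURES:
        if content.startswith(magic):
            return mime
    return None  # no guess possible


if __name__ == "__main__":
    print(sniff_type(b"GIF89a\x01\x00\x01\x00"))
    print(sniff_type(b"plain old text"))
```

The extension-based answer always wins when there is one, which keeps the sniffing in its "second line of defense" role.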
Posted in: Practical
By Eric Farrar on September 26th, 2008
I just found out that my abstract for AjaxWorld 2008 in San Jose has been accepted. I will be speaking on Tuesday, October 22nd at 9:20 about object-relational mappers (ORMs) and the database. The talk is titled It’s 11 p.m., do you know where your queries are?:
Object-relational mappers such as Hibernate, LINQ, and Rails’ ActiveRecord can greatly simplify creating database-backed web applications. These tools do such a good job of abstracting the database that it is possible to create very complex web applications without ever considering the database at all! Unfortunately, ignorance is not always bliss. When naively programmed, seemingly trivial application code can cause inefficient queries at best, and scalability and concurrency nightmares at worst. This talk will show that while ORMs provide tremendous power and speed, it is still very important to know what is going on underneath the comfortable abstraction layer the ORM provides. Examples will include identifying client-side joins, concurrency problems, and scalability issues.
Posted in: Conferences
By Eric Farrar on September 23rd, 2008
In the first part of this post we discussed how the SQL Anywhere HTTP Server routes web requests. In brief, when a web request enters the web server, the web server will route the request to the service that most specifically identifies it. This post explains how you can use the remaining part of the URL that does not match. In short, if we have defined a service,
CREATE SERVICE "alice/files"
    AUTHORIZATION OFF
    USER DBA
    TYPE 'RAW'
    URL OFF
    AS CALL get_file();
and we try to access the document http://www.host.com/alice/files/business/overview, how does our service use the remainder of the URL (the 'business/overview' part in this case)?
The answer lies in the URL parameter of the service definition. The URL parameter can take three values, and the default value is URL OFF. With URL OFF, the service will not be used unless the URL identically matches the service name. So in this case, not only can we not use the remaining part of the URL, but the service simply won’t work.
To allow access to the remaining part of the URL, we have two options. The first is to set URL ON. When URL is ON, a variable called :url is automatically created that contains the remaining part of the URL. Let’s say we altered the service to be:
ALTER SERVICE "alice/files"
    AUTHORIZATION OFF
    USER DBA
    TYPE 'RAW'
    URL ON
    AS CALL get_file(:url);
Now we have access to the :url variable that contains 'business/overview', and we can pass that to the stored procedure that handles this service. The drawback to URL ON is that even if the remaining URL contains many elements, it will still only be captured as a single string. This is where URL ELEMENTS is useful.
URL ELEMENTS works differently. Instead of creating a single variable called :url that contains the whole string, it creates many variables each representing a part of the URL. These variables are called :url1, :url2, …, :url10. It will create as many variables as there are parts to the URL up to a maximum of 10.
Suppose your application uses RESTful services and the URLs to browse your product catalog all are of the form:
/[DEPARTMENT]/[PRODUCT]/[SIZE]
A sample URL for this application might be:
www.host.com/Catalog/Mens/Polo_Shirt/Large
If you defined a service called Catalog and set URL ON, the :url variable would contain the entire string 'Mens/Polo_Shirt/Large'. By using URL ELEMENTS, you instead end up with three strings:
:url1 = 'Mens'
:url2 = 'Polo_Shirt'
:url3 = 'Large'
These strings can then be passed into the stored procedure that handles the service as shown below.
CREATE SERVICE "Catalog"
    AUTHORIZATION OFF
    USER DBA
    TYPE 'RAW'
    URL ELEMENTS
    AS CALL get_price(:url1, :url2, :url3);
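To compare the three URL settings side by side, here is a small Python model of which variables a service would see for a given request. It is a sketch based only on the behaviour described in this post, not SQL Anywhere code. The function name url_variables is hypothetical, and two details are assumptions: an empty remainder under URL ON is modelled as an empty string, and since this post does not cover what the server does with an eleventh element, the sketch simply stops at ten.

```python
from typing import Dict, Optional

MAX_ELEMENTS = 10  # :url1 .. :url10, as described above


def url_variables(service: str, path: str, mode: str) -> Optional[Dict[str, str]]:
    """Model the variables a service sees; None means the service is not used."""
    service = service.strip("/")
    path = path.strip("/")

    if path == service:
        remainder = ""
    elif path.startswith(service + "/"):
        remainder = path[len(service) + 1:]
    else:
        return None  # the request does not belong to this service at all

    if mode == "OFF":
        # URL OFF: only an identical match is accepted.
        return {} if remainder == "" else None
    if mode == "ON":
        # URL ON: the whole remainder arrives as a single string.
        return {"url": remainder}
    if mode == "ELEMENTS":
        # URL ELEMENTS: one variable per part (truncation at ten is an assumption).
        parts = [p for p in remainder.split("/") if p][:MAX_ELEMENTS]
        return {"url%d" % i: part for i, part in enumerate(parts, start=1)}
    raise ValueError("mode must be OFF, ON or ELEMENTS")


if __name__ == "__main__":
    request = "/Catalog/Mens/Polo_Shirt/Large"
    for mode in ("OFF", "ON", "ELEMENTS"):
        print(mode, url_variables("Catalog", request, mode))
```

Running it against the catalog URL shows the contrast: OFF rejects the request, ON hands over one string, and ELEMENTS hands over three.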
Posted in: Practical
By Eric Farrar on September 16th, 2008
I am at the ZendCon 2008 PHP show this whole week. I will be at the Sybase iAnywhere booth all day on Tuesday and Wednesday. Drop by to pick up a Sybase iAnywhere Remote Control Car and a Web Edition CD, and to have a chat with me.
On Thursday, I will be giving a talk entitled, Taking It All Offline:
Enterprise applications developed using PHP are getting better every day. They are continually becoming more secure, better performing, and more scalable. However, all of these applications can only be used when a network connection is available. This requirement prevents PHP applications from working in an occasionally-connected model. New browser plugin technologies such as Gears allow applications to run offline, but require the entire application be written in JavaScript (allowing little-or-no PHP code reuse). This talk will examine how SQL Anywhere can help solve this problem, and take your current PHP application offline by locally hosting, managing, serving, and synchronizing your PHP application and data with your current database.
Posted in: Conferences
By Eric Farrar on September 12th, 2008
When a request enters a web server, the web server must go through a process to determine where the request should be sent, and what program should handle it. Basically, the web server must translate the logical URL into an absolute path. This process is called routing.
If you are using Apache, the default behavior is to route the request by appending the URL to the DocumentRoot defined in the Apache configuration file. For example, if the DocumentRoot is /usr/web, the following requests will be routed as:
http://www.host.com/index.htm -> /usr/web/index.htm
http://www.host.com/alice/index.htm -> /usr/web/alice/index.htm
http://www.host.com/alice/files/index.htm -> /usr/web/alice/files/index.htm
http://www.host.com/alice/history/index.htm -> /usr/web/alice/history/index.htm
In this case, the routing is very simple since all URLs will be routed to the same place. But what if Alice’s files and history are in different places? The requests will have to be routed differently. In Apache, you would typically use the mod_alias module to accomplish this. Mod_alias allows you to define routing rules that will effectively make different DocumentRoots for each request. If you added the following to your Apache configuration file:
Alias /alice/files /home/alice/projects/files
Alias /alice/history /home/alice/history
Alias /alice /home/alice/www
The resultant routes would be:
http://www.host.com/index.htm -> /usr/web/index.htm
http://www.host.com/alice/index.htm -> /home/alice/www/index.htm
http://www.host.com/alice/files/index.htm -> /home/alice/projects/files/index.htm
http://www.host.com/alice/history/index.htm -> /home/alice/history/index.htm
When using mod_alias, it is important to make sure that the most specific aliases are defined first. Mod_alias works by stepping through the alias list and using the first rule that matches. If we had defined the /alice alias before the /alice/files alias, all requests for http://www.host.com/alice/files would actually have been routed to /home/alice/www/files.
If no aliases match, Apache will use the default DocumentRoot to create the absolute path.
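The effect of alias order is easy to see in a toy model. The Python sketch below mimics first-match routing using the aliases from this post. It is a simplification, not how Apache is implemented: it only matches on whole path segments and ignores everything else mod_alias can do. The function name route is made up for the example.

```python
from typing import List, Tuple

DOCUMENT_ROOT = "/usr/web"

Alias = Tuple[str, str]  # (URL prefix, file system target)


def route(path: str, aliases: List[Alias]) -> str:
    """Return the file system path using the FIRST alias that matches."""
    for prefix, target in aliases:
        # Match only on whole path segments, e.g. /alice but not /alice2.
        if path == prefix or path.startswith(prefix + "/"):
            return target + path[len(prefix):]
    # No alias matched: fall back to the DocumentRoot.
    return DOCUMENT_ROOT + path


# Most specific aliases first, as in the configuration above.
GOOD_ORDER = [
    ("/alice/files", "/home/alice/projects/files"),
    ("/alice/history", "/home/alice/history"),
    ("/alice", "/home/alice/www"),
]

# The same aliases with /alice moved to the front.
BAD_ORDER = [GOOD_ORDER[2], GOOD_ORDER[0], GOOD_ORDER[1]]

if __name__ == "__main__":
    print(route("/alice/files/index.htm", GOOD_ORDER))
    print(route("/alice/files/index.htm", BAD_ORDER))
    print(route("/index.htm", GOOD_ORDER))
```

With the /alice alias moved to the front, the request for /alice/files/index.htm is captured by it and ends up under /home/alice/www/files, which is exactly the mistake described above.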
For another example, let’s take a look at the routing of the Google App Engine. On that server, routing is determined by the handlers section of the app.yaml configuration file. To do similar routing to what we did in Apache, the file would look like:
handlers:
- url: /alice/files
  static_dir: home/alice/projects/files
- url: /alice/history
  static_dir: home/alice/history
- url: /alice
  static_dir: home/alice/www
- url: /*
  script: home.py
Similar to Apache, the Google App Engine evaluates the routes in order and will use the first one that matches.
So how does routing work in SQL Anywhere? In SQL Anywhere, every service that is created acts as an alias. So the equivalent routing in SQL Anywhere would be created like this:
CREATE SERVICE "root"
AS … ;

CREATE SERVICE "alice"
AS … ;

CREATE SERVICE "alice/files"
AS … ;

CREATE SERVICE "alice/history"
AS … ;
The root service is the same as the default DocumentRoot in Apache, or the url: /* rule in the Google App Engine. It will be used to route when no other services match. This is why I said in my previous post that the root service will respond to anything. However, unlike both Apache and the Google App Engine, SQL Anywhere will automatically choose the most specific service, so the order of creation does not matter.
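One way to picture "most specific service wins" is as a longest-prefix match over the service names. The Python sketch below models that idea with the four services from this post. How SQL Anywhere implements the choice internally is not something I am describing here, so treat this as an illustration of the observable behaviour only. The function name pick_service is hypothetical.

```python
from typing import Optional, Set

# The services created above; a set has no order, which is the point.
SERVICES = {"root", "alice", "alice/files", "alice/history"}


def pick_service(path: str, services: Set[str]) -> Optional[str]:
    """Return the most specific matching service, falling back to 'root'."""
    segments = [s for s in path.split("/") if s]
    # Try the longest candidate name first, then progressively shorter ones.
    for n in range(len(segments), 0, -1):
        candidate = "/".join(segments[:n])
        if candidate in services:
            return candidate
    # Nothing more specific matched, so root responds (if it is defined).
    return "root" if "root" in services else None


if __name__ == "__main__":
    for path in ("/index.htm", "/alice/index.htm", "/alice/files/index.htm"):
        print(path, "->", pick_service(path, SERVICES))
```

Because the longest name is tried first, it makes no difference in which order the services were created.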
If you are wondering how we use the rest of the URL in the services we have just defined (that is, how http://www.host.com/alice/files will become /home/alice/projects/files), stay tuned for part 2.
Posted in: Practical
By Eric Farrar on September 8th, 2008
Now that we have everything defined, we can start to look at what is actually going on behind the browser. To do this, we will need some software. First off, we will need a browser (and that choice has become more interesting in the last week with the appearance of Google Chrome), but for now any browser should do.
We also will need a web server to respond to the browser’s requests, and a database or file system to actually hold the content. SQL Anywhere has a built-in HTTP server, it is a database server, and it has access to the file system. As a result, for now at least, all we will need to produce a simple web page is SQL Anywhere. The free SQL Anywhere Web Edition will work well for this example, and can be downloaded here. I have also added a permanent download button to the sidebar on the right.
The Hello World program is a classic, so it only seems fitting that we should start with it. First, we need to create a SQL Anywhere database.
Now that the database file has been created, we need to start the database and HTTP server. The HTTP Server is not started by default when the database is started, so we need to add the -xs switch to enable it. There are lots of options for the HTTP server, but for now we have only specified that we want to start it on port 8080. You can change this to use any free port.
dbeng11 hello.db -xs http(port=8080)
After the database has started, start up DBISQL. We will use DBISQL to execute the SQL statements we need to create our page.
dbisql -c "eng=hello;uid=dba;pwd=sql"
The first thing we will define is a service. A service can be thought of as an endpoint that responds to HTTP requests. Subsequent posts will go into a lot more detail on how exactly the requests are routed. For now, simply know that a service named root will respond to anything. You can ignore most of the other parameters for now. All you need to know is that we have defined an HTTP endpoint that responds by returning the results of calling the hello_world stored procedure.
CREATE SERVICE "root"
    TYPE 'RAW'
    AUTHORIZATION OFF
    USER DBA
    URL OFF
    AS CALL hello_world();
Next, then, we had better create that stored procedure! The stored procedure only does two things:
- Sets an HTTP header called Content-Type (more on this later)
- Returns a block of HTML text representing our Hello, World! page
CREATE PROCEDURE hello_world()
BEGIN
    CALL sa_set_http_header('Content-Type', 'text/html');
    SELECT
    '<HTML>
    <HEAD>
    <TITLE>Hello, World!</TITLE>
    </HEAD>
    <BODY>
    <H1>Hello, World!</H1>
    </BODY>
    </HTML>';
END;
Done! Start up your browser and point it at http://localhost:8080.

Posted in: Practical
By Eric Farrar on August 28th, 2008
In my last blog entry, I dealt with defining the ‘browser’ part of my blog’s title. It only seems fitting that I also talk about the other half of the title and answer the question, “What is it we are hoping to see?” The answer: When a browser displays data, where does that data live, and how does it get there?
For a lot of websites, the answers may be quite simple. The data actually appears to come from a web server that handles HTTP requests. The web server, in turn, invokes a server-side scripting language such as PHP or ASP.net to talk to a database server. The data permanently lives in that database server. The database server and web server are either hosted internally, or hosted at an ISP. This simple setup describes how the majority of web sites operate today. So what more is there to look at?
Lots. Developments in the last year have added a whole range of possibilities to this simple picture. One example of this is the variety of new hosting options available. Previously, hosting a website at an ISP would often involve sharing both machine and database with other users. It was very rare that you could have root SSH access to your machine. Now, with cloud-based solutions such as Amazon EC2 that host virtual machine images, a perfectly legitimate answer to the question, “Where is your data living?” may be, “Dunno…somewhere in the world…it is in the cloud somewhere.”
Even if you don’t know where the data is actually living, you could argue that from the browser’s perspective that data still appears to come from the same place. Have there been any developments recently that change this? Absolutely.
The one that pops most readily to mind is Gears, an open source project focused on adding extra functionality to browsers. Although many ‘gears’ exist now, the project’s initial offerings were focused on storing data locally in the browser, introducing the possibility of offline web applications. In addition to creating offline applications, developers soon discovered that even online applications could benefit from having a quick, local data store. Currently both MySpace and WordPress use Gears, not for offline access, but to enhance their online experience.
The question of where the data lives and how it gets there is bound to get more interesting as time goes on. Let’s peer behind the browser, and take a look…
Posted in: Technology
By Eric Farrar on August 21st, 2008
If I am going to write a blog titled Peering Behind the Browser, I had better start by defining what exactly it is I mean by a browser. So perhaps the title begs the simple question, “What is a browser?” The search for the answer, we will see, is actually not so simple.
The knee-jerk reaction to the question might be, “Browsers are programs that display web pages like Firefox, Internet Explorer, and Safari.” This brings to mind a similar interaction in Plato’s Theaetetus. In that dialogue, Socrates asked Theaetetus to define ‘knowledge’. Theaetetus replied that knowledge is things that could be learned like geometry, cobbling, and trades. Socrates criticized this answer, pointing out that although those are examples of knowledge, they don’t actually describe what knowledge is. Similarly, the programs listed above are examples of browsers, but fail to explain what one is.
So let’s broaden our definition and say it is a program that remotely accesses data through a network, and renders HTML, JavaScript, and CSS for display to the user. This certainly describes the vast majority of things that a browser does, but does it explain them all? Most people would agree that when they watch a video on YouTube, they are watching it in a browser. However, the underlying technology that YouTube is using is Adobe Flash. If you use a stand-alone Flash player to watch that same movie, does that make the stand-alone player a browser?
This actually becomes far more interesting with the RIA technologies that have appeared in the last year. Adobe AIR allows exactly these types of stand-alone applications. The eBay Desktop is a perfect example. It uses AIR to provide a desktop-like experience for browsing your eBay account, but displays the same information available at www.ebay.com. Does this make eBay Desktop a browser? To muddy the waters more, AIR applications can be coded in HTML and CSS!
The last data point we will review is JavaFX, an RIA technology produced by Sun. It can allow running Java Applets to be dragged and dropped between programs such as Firefox and Internet Explorer, and the desktop. So if the applet was a browser application while in Firefox, is it still a browser application after it is dragged out and running on the desktop?
I don’t actually intend to answer the question, so in a sense I have done no better than Theaetetus did. But in this blog we will be considering all of these technologies as browsers, and we will be peering behind them all to see what lies underneath.
Posted in: Technology