On Friday 11 June I attended the first ACM Symposium on Cloud Computing, an ACM-sponsored workshop co-located with the ACM SIGMOD conference held in Indianapolis. General chair Joe Hellerstein of UC Berkeley, along with PC co-chairs Surajit Chaudhuri of Microsoft Research and Mendel Rosenblum of Stanford University put together an interesting, thought-provoking program. In this article I want to convey my impressions from one of the SOCC keynote talks, entitled “Inside the Cloud: Salesforce.com’s Multi-tenant architecture” presented by Rob Woolen of Salesforce.com. My notes follow; Rob’s talk basically provided an overview (and update) of an already-published whitepaper  that describes Salesforce.com’s multi-tenant architecture.
Salesforce.com Cloud Services
- Salesforce.com has approximately 72,500 customers, 150K custom applications, 15K+ web sites offering CRM, service and support, and other services that include:
- content management
- mobility infrastructure
- BPM (workflows, approvals)
- Multi-tenant programming language (Apex) with Java-like syntax
- SOQL query language – Salesforce.com query language with object-oriented flavour and support for object instantiation, but SQL-like
- API access to data and metadata
- full-text search
Salesforce.com’s site architecture
Each “user” of Salesforce.com is mapped to an “organization” (ie. a “tenant”) and hence to an “instance” of the Salesforce software stack. That stack includes 8-way Oracle RAC servers running on Linux/RedHat, Resin application servers, Lucene full-text search servers, Java application servers, load balancing servers, and a SAN disk array. Salesforce.com currently has 16 instances in production, supporting 680K “objects” (tables) in their global (Oracle) schema, managed by 8 DBAs.
Service isolation with multi-tenancy
Salesforce.com applications are written in Apex, a customized interpreted procedural language “similar in many respects to Java” . Apex is an interpreted language; at compilation the interpreter rejects “expensive” SOQL statements and resource-intensive programs to protect an instance from rogue processes. At run time the interpreter enforces dynamic limits, prevents infinite loops, and limits heap size, stack depth, and the total number of database records retrieved. Apex also augments SOQL queries to ensure that queries select information only by organization, and contains both static and dynamic validation routines to ensure fine-grained access control.
Multi-tenant query processing in Salesforce.com
Salesforce.com offers applications a “flex” schema – actually implemented as Oracle partitioned tables, with 500
VARCHAR columns. Hence any column can contain data of any type. Additional tables in the schema contain metadata about each organization’s schema, which is used by the Apex interpreter when compiling queries. The flexibility offered by this scheme has obvious issues with respect to strong typing, semantics, and performance (since an index over a
VARCHAR column of mixed data types will fail to yield correct orderings). The former two issues are managed by Apex itself. Query performance issues are overcome by using what Salesforce.com call “accelerator tables”.
Accelerator tables are de-normalized, strongly typed Oracle tables, with appropriate indexes, that replicate the data in the “flex” schema for a particular organization. Once created, any updates to the database are synchronously propagated to any accelerator tables that are extant by the Apex interpreter.
Accelerator tables can be instantiated on-the-fly, in cases where, for example, the Apex interpreter determines that a SOQL query over the flex data will take too long. The metadata for the flex data also contains statistics on data distributions, maintained by the Salesforce.com infrastructure independently from the statistics management performed by the Oracle server. Apex uses these statistics to determine estimates of operation costs, and hence whether or not to create (or destroy) accelerator tables as necessary. With this information, the Apex interpreter can subsequently change the generated SQL query from referencing the flex data to one or more accelerator tables.
The flex schema offers one additional piece of management flexibility: it permits “logical” schema changes for an application without requiring a (real)
ALTER TABLE on the server that can impact (a) server availability and (b) application compatibility. Salesforce.com permits “schema versions” so that applications can seamlessly evolve from one schema to the next.
Salesforce.com’s infrastructure offers an interesting approach to how to build multi-tenant infrastructure software on a software stack composed of both commodity RDBMS servers (Oracle) and specialty toolkits (the Apex language and its interpreter). In particular it is the choice of tradeoffs that I find the most interesting; Salesforce.com has embraced traditional relational database technology for its persistent storage, rather than eschewing it. At the same time, however, Salesforce.com rigorously controls application behaviour through enforcing access via their own tool sets, in order to provide some measure of response time guarantees.