Apache’s Hadoop is getting a lot of traction amongst academics and commercial data warehouse vendors, and it may come to pass, relatively quickly, that aspects of Hadoop’s parallel-processing capabilities will be integrated with traditional relational database systems. Yahoo!’s Pig may help that integration along.
So where is all this headed? In the short term, the churn in the marketplace should drive a much faster pace of innovation than traditional database vendors provided over the last decade. The technical advantages of Hadoop are not intrinsically hard to replicate in a relational database engine; the main challenge will be to manage the expectations of database users when playing tricks like trading off data integrity for availability on certain subsets of the database. Greenplum and Aster will undoubtedly push to stay one step ahead of the bigger database companies, and it would not surprise me to see product announcements on this topic from the more established database vendors within the year.
Thanks to my colleague Bruce Hay for sending this my way.