Complaining about relational databases is a staple theme of programmer blogs. Why are so many programmers irritated and frustrated with relational databases? Why do the perceived intricacies of SQL and the “object-relational impedance mismatch” launch so many rants? Why are DBAs more hated than managers? I have some ideas.
Programmers Conflate RDBMSs with SQL
SQL is the standard language for describing, querying, and manipulating data residing in a Relational Database Management System (RDBMS). The deficiencies of SQL, real or imagined, have little to do with relational theory, data integrity, or database management. The SQL language has been criticized by the most distinguished proponents of relational databases. Once you learn relational theory and how (and why) to normalize, SQL’s problems will matter a lot less.
RDBMSs Are Old, Like COBOL and Fortran
Most programmers don’t find databases interesting or fun, so they dismiss databases as “legacy” technology that doesn’t know when it’s time to die. RDBMSs are approximately as old as the C language, and younger than Lisp and Forth, which are trendy again. RDBMSs are still around for two big reasons: There is no serious competing data management technology, and a huge amount of data is committed to RDBMSs. If there was an alternative technology that solved the problems RDBMSs solve so well and offered additional advantages database-based applications would be slowly migrating. Instead we have MySQL and SQLite, low-end open source implementations of 35-year-old technology.
There Are Too Many Incompatible Versions Of SQL
Although SQL is an ANSI standard every vendor has extended the language and implemented parts of the language differently. Oracle’s dialect has some serious deviations from the standard, but then again Oracle and a lot of their installed base pre-date the ANSI standard. The differences between SQL dialects are frequently exaggerated by programmers. In real life converting from one RDBMS to another is such a major undertaking that resolving differences in SQL dialects is only a small part of the problem. Writing a database library that can support multiple RDBMSs is obviously complicated by multiple implementations, but this is a problem with a lot of languages and tools, not specifically an RDBMS or SQL problem.
Who Needs Theory — Just Give Me The Data
Because most of what programmers do has no theoretical basis or system of proof, programmers back away from anything that looks like hard math. Failure to understand and appreciate relational theory has launched countless bad databases (and thousands of witless articles condemning SQL). The idea that there is a right way to design a normalized database–and that the wrong ways are provably wrong–is alien to most programmers. Database management doesn’t come down to a matter of opinion like OOP class hierarchies or coding style.
Most professional programmers know how it feels to see an amateur, unfamiliar with Knuth or any programming book containing equations, implement their own sort routine. That’s how people who understand relational theory feel when they see a badly-designed database. Relational theory and RDBMSs are old and well-established now, so it’s hard not to think a lot of programmers are willfully ignorant.
Relational Databases Don’t Play Nice With My Objects
When OOP took over as the dominant programming paradigm all of the older techniques and technologies were dismissed as bad and unenlightened. In the case of RDBMSs this scorn has a technical name: Object-Relational Impedance Mismatch. RDBMSs can be used as backing store for objects, but mapping object relationships, hierarchies, and references to and from a database is tedious and ugly. A generic code layer to do this (called an Object-Relational Mapping or ORM) is complicated enough, but the incompatibilities of RDBMS vendors and versions (see above) make ORMs really hard to get right. Relational databases are based on sets, not on objects, and relations in a database are based on matching values, not on pointers. There’s definitely an apples vs. oranges problem, but neither technology is necessarily wrong or bad. Object databases may eventually offer a workable solution to the object-relational mismatch, but without a theoretical basis comparable to relational theory it will be hard to make a case for moving valuable data out of RDBMSs.
Some of the most entertaining complaints about the “impedance mismatch” are written by Ruby on Rails advocates–entertaining because by design Rails can’t play nice with legacy databases. The right opinion to have in that community is that legacy databases should be replaced, DBAs should be bypassed by programmers installing MySQL on their own servers, and data integrity should be enforced in every application that uses the data.
Relational Databases Don’t Play Nice With Anything
SQL is a declarative language. Most programmers work with imperative or functional languages, where the code describes how to do something. The SQL SELECT statement may look like a command telling the database which rows (records) to retrieve, but really it’s a description of the relations in the database that satisfy the conditions in the statement. In addition to the imperative/declarative disconnect the data types commonly supported in RDBMSs don’t always map cleanly to the host or calling language. This problem has been around for a long time–OOP just exposed it to a new generation of programmers. One partial solution is embedded SQL, where SQL statements are part of the host language, the types are mapped behind the scenes, and imperative control structures (like loops) can hide RDBMS features like cursors. When tempted to kick RDBMSs out of the playground remember they were probably there first.
Database Administrators (DBAs) Cramp My Style
Companies commit a lot of important and valuable data to RDBMSs, and DBAs are charged with making sure that data is secure, backed up, and protected from programmers deploying broken and untested code. Whatever programmers think of themselves, losing data is just as disastrous as–and frequently more permanent than–a crashing program. Maintaining data integrity is what RDBMSs do, and DBAs are supposed to make sure the RDBMS is allowed to do that. Every DBA I’ve worked with has a lot of stories about programmers losing or corrupting big chunks of valuable data, but programmers mainly complain that DBAs won’t let them do whatever they want. DBAs are called “high priests” and derided as petty tyrants. Some DBAs are incompetent or difficult, but that’s true of programmers, too. A good DBA can design correct, robust databases and help programmers access databases efficiently and safely.
Before flying off the handle about DBAs, consider that the database probably has to support multiple applications, simultaneously, preventing each application from corrupting data or stomping on the other applications. This is a pretty big problem and RDBMSs have offered a workable, scalable solution for a long time.
Recommended Database and SQL Books
- Database in Depth: Relational Theory for Practitioners, C. J. Date
- An Introduction to Database Systems, Eighth Edition, C. J. Date
- Practical Issues in Database Management: A Reference for the Thinking Practitioner, Fabian Pascal
- The Database Relational Model: A Retrospective Review and Analysis : A Historical Account and Assessment of E. F. Codd’s Contribution to the Field of Database Technology, C. J. Date
- Joe Celko’s SQL for Smarties: Advanced SQL Programming Third Edition
- The Art of SQL, Stephane Faroult and Peter Robson