[Note from Pinal]: This is a 94th episode of Notes from the Field series. When I read the title of this article – I was extremely intrigued with it – Rewriting Database History! When I was a kid, history was my favorite subject and till today when I have to deal with history, I always jump first to read and enjoy it. When I see this article from Kevin, I was so delighted. I started to read it immediately, and I did not stop reading it till it was finished. It was 20 minutes well spent. The journey starts from ORM and end at the ORM, however, when it comes to full circle there are so many things in between. Trust me Kevin can be a great historical novel writer.
In this episode of the Notes from the Field series database expert Kevin Hazzard explains the interesting subject of rewriting database history. Read the experience of Kevin in his own words.
Way back in June of 2006, my friend Ted Neward wrote a very detailed piece called The Vietnam of Computer Science. Juxtaposing America’s involvement in Southeast Asia in the last century with the battles we fight in modern, business-driven enterprises, that article examined the unwinnable war that Object-Relational Mapping (ORM) tools have become. The title of the article is probably a bit politically incorrect these days given that Vietnam’s world standing and economy have improved vastly over the last decade. Moreover, lots of database and application developers working today don’t even remember the war in Vietnam so the allegory may fail for them. But for those of us who grew up in the 1960s and 1970s, the writing evokes plenty of painful memories that paint ORMs as technological quagmires.
ORM tools have evolved quite a bit since 2006 but they’re still generally pretty awful. The metadata-driven code generators buried within tools like Hibernate and Microsoft’s
For most developers who have shunned traditional ORMs, I often hear them speak about how unwieldy, brittle or inflexible they found the tools to be. The relational abstractions that they provide is sometimes too weak, making common business operations difficult to express in languages like C# and Java. To solve that problem, the abstraction may become leaky, exposing details that create dependencies which complicate deployments or lock you into vendor or version-specific features.
For database people, ORM aversion is typically related to the naiveté of machine-generated queries. Chattiness is a big problem with full-service ORMs, for example. Mature ORMs do a better job today emitting correlated sub-queries or common table expressions than they did in the past. But that often depends on good mappings which depend in turn on good metadata. Unfortunately, perfect database metadata is quite rare. Anyone who watches the tracing of a database server suffering under the garrulous assault of an ORM-driven application understands just how little the ORM really knows about the database.
For example, I worked in a shop long ago that used a UniVerse database to manage large caches of medical claims data. As a multi-valued database much like today’s NoSQL databases, UniVerse stores data with all the data that it references in a single document. So a patient’s demographic information can be stored in line with the related medical claims, diagnostic history and lab results. For that system, there was only one index on the database which was used to find individual, semi-structured blobs of patient records. The rules engine which sat atop the database always needed all of a single patient’s information whenever it ran a rule so storing the entire document describing a patient as a single database entity was both practical and highly efficient.
In those days, UniVerse was an IBM product that competed against their own DB2 relational database. For some reason, IBM was keen on getting our company to move from UniVerse to DB2 so we invited them in for a proof of concept to build a relational model of our medical database in DB2. When they were done, they loaded several terabytes of data into the new relational database and aimed the rules engine there. After weeks of tweaking, rewriting and hand-wringing, they never could make the relational database outperform UniVerse. Moreover, it wasn’t even a close race. The UniVerse database running on much older hardware could fetch whole patient documents in a few milliseconds. The best we could get DB2 to do using its normalized implementation was measured in hundreds of milliseconds. The difference between five milliseconds and half a second may not sound like a major problem, but when you’re handling millions of queries per day and trying to satisfy a three second, end-to-end Service Level Agreement with your Point of Sale providers, those delays add up very quickly.
Of course, document-oriented databases like Mongo and multi-valued databases like UniVerse are not always viable solutions to bridging the gulf between on-disk and in-memory data
In that sense, document-oriented databases and ORMs are both attempts to undo history. The former solution means to eliminate the mismatch between applications and databases altogether by making the data storage model match the common data utilization pattern. The problem is that by coercing the database to adapt to the application’s model, many of the features that evolved in transactional systems simply aren’t available. So-called eventual consistency scares developers who have come to depend on databases that offer transactions which are Atomic, Consistent, Isolated and Durable (ACID). This is most unfortunate because had the ENIAC developers had unlimited in-memory storage available to them, I believe that in-memory data handling semantics would have evolved to be ACID-compliant just as they are in modern relational databases. Document-oriented databases redo history in a way that doesn’t honor so many of the good choices that were made in real history.
Rather than attempting to undo history per se, ORMs try to hide history from us by making it seem that the underlying database’s storage and query semantics exist inside the application’s space. While this is a better way to undo history because it preserves the best parts of it, this is perhaps the hardest problem in all of computer science today. The reasons that it is so difficult are myriad. Incomplete metadata is often the culprit. The cardinality between entities, for example, is nearly impossible to get right simply by inspecting a database’s available foreign keys. Moreover, the relationships between entities in separate databases or in separate but related data integration processes often has no metadata to tie them together. The Master Data Management (MDM) structures and processes used to manage code assignments can also have a huge impact on an ORM’s query generation. Languages like C# and Java lack even the most basic Design by Contract (DbC) features so it’s no wonder that their handling of database constraints is also quite weak. For example, imagine a simple database constraint that limits an integer value to multiples of ten. Now imagine an ORM trying to convey and enforce that constraint in a language like C#. You get the picture. This is a very difficult, effectively impossible, problem to solve.
Nowadays, I rely on so-called micro-ORMs and auto-mapper tools more than traditional ORMs. Micro-ORMs are an admission that full-service ORMs are often too complex or that they unnecessarily constrain my use of data. Using less ambitious micro-ORMs like Dapper gives me the benefits I desire without all the weight and ceremony of a full-featured ORM. Lastly, when document-oriented storage seems warranted, I evaluate the criteria objectively and openly admit when NoSQL is the correct choice. However, this is less common than avid NoSQL proponents would have you believe. Relational databases are still the best choices for storage and data handling in the vast majority of cases, mostly because of their maturity, rich feature sets, high reliability and great support.
Nine years later and my friend Ted is still mostly correct. ORMs are an unwinnable war. But they’ve taught us a lot and there are other options for solving what is perhaps the hardest problem out there today: undoing history and redoing it as we think it might have been done under different circumstances. No matter which way the market goes, the technical equivalent of sober diplomacy should triumph over conflict.
If you want to get started with SQL Server with the help of experts, read more over at Fix Your SQL Server.
Reference: Pinal Dave (https://blog.sqlauthority.com)