SQL remains a useful foundation for building tools to analyze data

A lot of editorial content about data, and tools built for data analysis, includes a Pavlovian association between “big data” and “modern” computing. Relational databases, which organize data in a form built to accommodate analysis with Structured Query Language (SQL) tools, are treated as dated, somehow behind the times.

But much of this content fails to inform the people reading it about just how these “modern” computing systems actually work. For better or worse, relational databases, which provide a structure (perhaps backbone would be a better word) for information, become indispensable at some point in the process of analyzing electronic information (data).

As a rule of thumb, the best editorial content on this subject incorporates some respect for SQL. David Chappell of Chappell & Associates has written a white paper for Microsoft, titled Introducing DocumentDB A NoSQL Database for Microsoft Azure, which follows this route. Chappell writes: “To support applications with lots of users and lots of data, DocumentDB is designed to scale: a single database can be spread across many different machines . . . . DocumentDB also provides a query language based on SQL, along with the ability to run JavaScript code directly in the database as stored procedures and triggers with atomic transactions.”

From Chappell’s description it should be clear DocumentDB has been built to replicate some of the core planks of Relational Database Management Systems (RDBMS) best practices. These certainly include SQL tools, along with stored procedures and triggers. Enterprise consumers of RDBMS and/or NoSQL collections of data will approve of the end of Chappell’s sentence: “atomic transactions”. This phrase provides these readers with an important assurance: DocumentDB has been built with ACID (Atomicity, Consistency, Isolation and Durability) transaction processing in mind. ACID-compliant data communications are the floor supporting today’s commercial quality electronic transactions. Without an ACID-compliant structure on both sides of a commerce transaction, businesses are not likely to exchange information. The negative ramifications of such a condition are great, so “modern” best practices have been built with ACID compliance as a given.
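For readers who want to see what “atomic” means in practice, here is a minimal sketch. It does not use DocumentDB; it uses the SQLite database bundled with Python, and the table, the account names and the transfer function are all invented for illustration. The point is only the all-or-nothing behavior: either both updates are saved, or neither is.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one all-or-nothing unit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Raising here undoes BOTH updates above.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


# Hypothetical data: a buyer with 100 units and a seller with none.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("buyer", 100), ("seller", 0)]
)
conn.commit()

ok = transfer(conn, "buyer", "seller", 60)      # succeeds and is saved
failed = transfer(conn, "buyer", "seller", 60)  # would overdraw, so it is undone
```

After the second call the balances are the same as they were after the first one. The seller never receives money the buyer did not have, which is the assurance the phrase “atomic transactions” is meant to give.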

Unfortunately, non-relational database systems are challenged to demonstrate ACID compliance. This fact is not lost on Chappell. The white paper he has written for Microsoft balances big data and NoSQL concepts with SQL and RDBMS concepts in a coherent presentation. In my opinion, other technical writers would benefit from his approach. I suspect Chappell’s success is a direct result of his technical understanding of how these systems actually work.

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2015 All Rights Reserved


A Microsoft Perspective on NoSQL and Document Databases

In November 2011, Julie Lerman wrote a post for Microsoft’s MSDN Magazine on Document Databases. The title of her post is What the Heck Are Document Databases? Her post may provide business sponsors of NoSQL database projects with useful information about the notion of NoSQL, and is therefore recommended reading.

What prompts me to recommend this post for business stakeholders in NoSQL projects (aka Gartner’s “Citizen Developers”) is the comparative lack of abstraction characterizing Lerman’s presentation. She quickly identifies document databases as one of several types of NoSQL databases (she also presents “key-value pair” databases and points to Azure Table Storage as an example). Here’s a great example of the simplicity of Lerman’s presentation of the notion of NoSQL: “The term is used to encompass data storage mechanisms that aren’t relational and therefore don’t require using SQL for accessing their data.”

For some business readers even this short definition may be challenging. Just what does she mean when she presents her notion of “data storage mechanisms that aren’t relational”? For the audience I have targeted, it would perhaps have been helpful to add a sentence simply illustrating how rows and columns in tables, which are, de facto, “relational” components (or structure), actually offer users a method of storing information. Kind of like “I know where you are, therefore, dear data, you have been stored SOMEWHERE”.
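Here is the kind of illustration I have in mind, as a rough sketch. The customer, the orders and the field names are all made up, and SQLite (bundled with Python) stands in for a relational database. The same information is stored twice: once as rows and columns in two related tables, and once as a single document of the sort a document database would hold.

```python
import json
import sqlite3

# Relational: two tables of rows and columns, related through customer_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'Boston')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(10, 1, "laptop"), (11, 1, "mouse")]
)

# SQL finds the data by table, column, and the relationship between tables.
rows = conn.execute(
    "SELECT c.name, o.item FROM customers c "
    "JOIN orders o ON o.customer_id = c.id ORDER BY o.id"
).fetchall()

# Document: the customer and the orders travel together in one record.
document = {
    "_id": "customer-1",  # hypothetical document identifier
    "name": "Ada",
    "city": "Boston",
    "orders": [{"item": "laptop"}, {"item": "mouse"}],
}
items_from_document = [order["item"] for order in document["orders"]]
document_text = json.dumps(document)  # the text a document database would store
```

In the relational version the structure lives in the tables, so the database always knows where each piece of data sits. In the document version the structure lives inside each document, and nothing forces the next document to look the same.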

But the business user is likely not Lerman’s intended audience. This post appears in Microsoft’s MSDN (Microsoft Developer Network) Magazine, so the intended audience, I would assume, is coders working with Microsoft tools (.NET, C#) via Visual Studio. Nevertheless, sections of the post (like the ones I’ve quoted above) are certainly worth a read by the audience I have in mind, as well.

Here’s more useful information. As I wrote last week, the definition of NoSQL, “Not Only Structured Query Language”, is a useful text string to keep in mind when grappling with hype about “radically different” approaches to managing data, or “getting rid of” relational databases. Back in November 2011, when Lerman published her post, she drilled down into defining the NoSQL acronym, too, by pointing her readers to a post by Brad Holt of the CouchDB project. The title of Holt’s post is Addressing the NoSQL Criticism, which he handles by noting “First, NoSQL is horrible name. It implies that there’s something wrong with SQL and it needs to be replaced with a newer and better technology. If you have structured data that needs to be queried, you should probably use a database that enforces a schema and implements Structured Query Language. I’ve heard people start redefining NoSQL as “not only SQL”. This is a much better definition and doesn’t antagonize those who use existing SQL databases. An SQL database isn’t always the right tool for the job and NoSQL databases give us some other options.” (This quote is excerpted, in its entirety, from Brad Holt’s post. I’ve provided a link here to the complete post and encourage readers to read it in its entirety.)

So if you need a good understanding of the Document Database type of NoSQL structure, I recommend reading Lerman’s and Holt’s posts.

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2014 All Rights Reserved


Who’s losing sleep over NoSQL?

One of the biggest challenges facing product marketing within any business is successfully identifying a market segment. I would argue more businesses fail because they either:

  1. don’t understand their market niche
  2. or can’t articulate a message intelligible to their market niche

The next step is to put together a portrait of an ideal prospect within this segment. Over time, if a business is lucky enough to succeed, this portrait will likely change (perhaps scale is a better word). After all, early adopters will spread the word to more established prospects. The latter are more conservative, and proceed at a different pace, based upon different triggers.

The 3 steps I’ve just identified are no less a mandatory path forward for early stage ISVs than they are for restaurants, convenience stores, or any other early stage business.

But a lot of the marketing collateral produced by early stage ISVs offering NoSQL products and solutions, in my opinion, doesn’t signal a successful traverse of this path. In an interview published on December 12, 2014, Bob Wiederhold, CEO of Couchbase, presents the first and second phases of what he refers to as “NoSQL database adoption” by businesses. Wiederhold’s comments are recorded in an article titled Why 2015 will be big for NoSQL databases: Couchbase CEO.

My issue is with Wiederhold’s depiction of the first adopters of NoSQL Databases: “Phase one started in 2008-ish, when you first started to see commercial NoSQL products being available. Phase one is all about grassroots developer adoption. Developers would go home one weekend, and they’ll have heard about NoSQL, they download the free software, install it, start to use it, like it, and bring it into their companies”.

But it’s not likely these developers would have brought the software to their companies unless somebody was losing sleep over some problem. Nobody wants to waste time trying something new simply because it’s new. No insomnia, no burning need to get a good night’s rest. What I needed to hear about was just what was causing these early adopters to lose sleep.

I’m familiar with the group of developers Wiederhold portrays in the above quote. I’ve referred to them differently for other software products I’ve marketed. These people are the evangelists who spread the word about a new way of doing something. They are the champions. Any adoption campaign has to target this type of person.

But what’s missing is a portrait of the tough, mission-critical problem driving these people to make their effort with a new and largely unknown piece of software.

It’s incumbent on Couchbase and its peers to do a better job, in their marketing communications and public relations efforts, of depicting the type of organization with a desperate need for a NoSQL solution.

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2014 All Rights Reserved


The job of classifying large amounts of text data becomes easier with JSON

The final cloud-like computing theme contributing to the unfortunate fog around the notion of “big data” is JSON. In my opinion, enterprise consumers of big data solutions built with NoSQL databases aren’t going to be able to connect the dots from the presentation on the JSON open-source project homepage.

More intelligible information about JSON for the non-programmer can be found on the web site of the Apache CouchDB project. “CouchDB is a database that completely embraces the web. Store your data with JSON documents. Access your documents and query your indexes with your web browser, via HTTP” (quoted from the first sentence of editorial content published on the site). Querying indexes with your web browser, hmmm . . . might this have something to do with Chrome’s Omnibox? In fact, as any reader following the link just provided will note, it does.

With this flexibility in mind, enterprise computing consumers may have more of a rationale for calling for databases that store data as JSON, which lend themselves to analytics built with NoSQL tools. If the process of collecting data on some aspect of a business process can be reduced to little more than punching some keywords into Chrome’s Omnibox (a version of which is now available for Firefox and Internet Explorer), then Lines of Business (LoBs) can count on their personnel getting to the data they need, when they need it, from any device (mobile, desktop, laptop), without the need for any proprietary solution.

Pretty cool. The cool factor increases when one reads more about the CouchDB project. JSON represents an alternative to XML, which requires substantially more verbosity (meaning more characters and lines of markup) to express the same data. Lots of extra markup contributes to a slower web, where pages can take forever to load. So the comparatively lighter weight promised by using JSON to exchange data makes a lot of sense. The intent of JSON and XML is the same, namely to provide a method of data exchange.
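A small sketch may help make the comparison concrete. The record below is invented, and the XML layout (one element per field inside a customer element) is just one reasonable way to write it, so treat the result as an illustration rather than a benchmark. The code writes the same record in both formats using Python’s standard library and reads each one back.

```python
import json
import xml.etree.ElementTree as ET

# A made-up record; every value is text so both formats can carry it as-is.
record = {"name": "Ada", "city": "Boston", "active": "yes"}

# JSON: each field name appears once.
json_text = json.dumps(record)

# XML: each field name appears twice, in an opening and a closing tag.
root = ET.Element("customer")
for key, value in record.items():
    ET.SubElement(root, key).text = value
xml_text = ET.tostring(root, encoding="unicode")

# Both formats give back the same record when parsed.
from_json = json.loads(json_text)
from_xml = {child.tag: child.text for child in ET.fromstring(xml_text)}

print(len(json_text), json_text)
print(len(xml_text), xml_text)
```

For this record the XML text comes out longer than the JSON text, mostly because every field name is repeated in a closing tag. How large the gap is on real data depends on how the XML is designed.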

Data stored as JSON takes the form of “JSON documents”. Here’s an example of what IBM is doing with JSON: Search JSON documents with Informix.

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2014 All Rights Reserved