23 Jan

SQL remains a useful foundation for building tools to analyze data

A lot of editorial content about data, and about tools built for data analysis, draws a Pavlov-like association between “big data” and “modern” computing. Relational databases, which structure data to accommodate analysis with Structured Query Language (SQL) tools, are treated as a dated approach, somehow behind the times.

But much of this content fails to inform readers about just how these “modern” computing systems actually work. For better or worse, relational databases, which provide a structure (perhaps “backbone” would be a better word) for information, are indispensable at some point in the process of analyzing electronic information (data).

As a rule of thumb, the best editorial content written on topics relevant to this subject incorporates some respect for SQL. David Chappell of Chappell & Associates has written a white paper for Microsoft, titled Introducing DocumentDB: A NoSQL Database for Microsoft Azure (http://azure.microsoft.com/en-us/documentation/articles/documentdb-whitepaper-chappell/), which follows this route. Chappell writes: “To support applications with lots of users and lots of data, DocumentDB is designed to scale: a single database can be spread across many different machines . . . . DocumentDB also provides a query language based on SQL, along with the ability to run JavaScript code directly in the database as stored procedures and triggers with atomic transactions.”

From Chappell’s description it should be clear DocumentDB has been built to replicate some of the core planks of Relational Database Management System (RDBMS) best practices, certainly including SQL tools, stored procedures, and triggers. Enterprise consumers of RDBMS and/or NoSQL collections of data will approve of the end of Chappell’s sentence: “atomic transactions”. This phrase provides these readers with an important assurance: DocumentDB has been built with ACID (Atomicity, Consistency, Isolation, and Durability) transaction processing in mind. ACID-compliant transaction processing is the floor supporting today’s commercial-quality electronic transactions. Without an ACID-compliant structure on both sides of a commercial transaction, businesses are not likely to exchange information. The negative ramifications of such a condition are great, so “modern” best practices have been built with ACID compliance as a given.
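For readers who want to see what “atomic” means in practice, here is a minimal sketch in Python, using the standard library’s sqlite3 module purely as a stand-in for any ACID-compliant store (the accounts table and the transfer amounts are invented for illustration):

```python
import sqlite3

# Hypothetical two-party transfer: both updates must succeed, or neither may.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('buyer', 100), ('seller', 0)")
conn.commit()

try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("UPDATE accounts SET balance = balance - 40 WHERE name = 'buyer'")
        conn.execute("UPDATE accounts SET balance = balance + 40 WHERE name = 'seller'")
except sqlite3.Error:
    pass  # the rollback has already restored the previous consistent state

print(dict(conn.execute("SELECT name, balance FROM accounts")))
# {'buyer': 60, 'seller': 40}
```

Either both legs of the transfer land, or neither does. That all-or-nothing guarantee, held durably on both sides of a commercial exchange, is the assurance the phrase “atomic transactions” supplies to enterprise readers.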

Unfortunately, non-relational database systems are challenged to demonstrate ACID compliance. This fact is not lost on Chappell. The white paper he has written for Microsoft balances big data and NoSQL concepts against SQL and RDBMS concepts in a coherent presentation. In my opinion other technical writers would benefit from his approach. I suspect Chappell’s success is a direct result of his technical understanding of how these systems actually work.

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2015 All Rights Reserved

19 Dec

Success Stories and Case Studies do serve a purpose for enterprise technology consumers

If ISVs with offerings targeted to enterprise computing markets needed any more indication of the importance of case studies and success stories, they likely got what they needed in an article written by Elizabeth Dwoskin, published on December 16, 2014 on the Wall Street Journal web site.

The title of Dwoskin’s article is The Joys and Hype of Software Called Hadoop (http://www.wsj.com/articles/the-joys-and-hype-of-software-called-hadoop-1418777627?mod=LS1). The reason her article should alert any ISVs still in the dark as to why they absolutely require a marketing communications effort capable of producing success stories and case studies can be found in the following quote:

  • “Yet companies that have tried to use Hadoop have met with frustration. Bank of New York Mellon used it to locate glitches in a trading system. It worked well enough on a small scale, but it slowed to a crawl when many employees tried to access it at once, and few of the company’s 13,000 information-technology workers had the expertise to troubleshoot it. David Gleason, the bank’s chief data officer at the time, said that while he was a proponent of Hadoop, ‘it wasn’t ready for prime time.'” (quoted in its entirety from Dwoskin’s article in the WSJ; I have provided a link to the entire article above, and encourage readers to spend some time on it)

This comment from a large enterprise consumer — BNY Mellon — which can be read as less than positive, can (and likely will) do a lot to encourage peers to look much more closely at Hadoop before moving forward with an implementation.

Bottom line: enterprise businesses do not like to proceed where their peers have hit obstacles like the one Gleason recounts in his comment. Peer comparison is, arguably, a very important activity for enterprise business consumers. So ISVs working with Hadoop on big data offerings, or with NoSQL databases and related analytics, need to make the effort to queue up positive comments about consumer experiences with their products.

I recently wrote a set of posts to this blog on big data, NoSQL, and JSON, and must admit to experiencing some difficulty finding the case studies and success stories I needed to gain a perspective on just how enterprise consumers have been using products presented as solutions for these computing trends. Hortonworks (http://www.hortonworks.com), on the other hand, is an exception. So I would encourage any readers after the same type of testimonial content about customer experience with products to visit Hortonworks on the web.

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2014 All Rights Reserved

17 Dec

Google Debuts Cloud Dataflow at Google I/O 2014

At the end of the 2.5-hour-plus webcast of the Keynote presentation from Google I/O 2014 (https://www.google.com/events/io#wtLJPvx7-ys) can be found the debut of Google Cloud Dataflow, the replacement for Google MapReduce. Readers unfamiliar with MapReduce, but avidly interested in the big data enterprise computing trend, need to understand MapReduce as the programming model at the foundation of today’s Apache Hadoop project. Without MapReduce, the Apache Hadoop project would not exist. So Google MapReduce is worth some study, as is Cloud Dataflow.
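For readers who have never looked under the hood, a toy sketch of the MapReduce pattern may help. To be clear, this is not Google’s or Hadoop’s code, just the three-phase shape of the model, run in sequence on one machine rather than distributed across a cluster:

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (key, value) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    """Shuffle: group all values emitted under the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: collapse each key's values into a single result."""
    return {key: sum(values) for key, values in grouped.items()}

docs = ["big data big clusters", "big data needs structure"]
print(reduce_phase(shuffle_phase(map_phase(docs))))
# {'big': 3, 'data': 2, 'clusters': 1, 'needs': 1, 'structure': 1}
```

In a real deployment the map and reduce phases run as parallel tasks across a cluster of commodity servers, which is precisely the clustering story at the core of the big data trend.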

But wait, there’s more. As Urs Hölzle, Senior Vice President, Technical Infrastructure, introduces Google Cloud Dataflow, his audience is also informed about Google’s role in the creation of another of today’s biggest enterprise data analytics approaches — NoSQL (“Not only SQL”). He casually informs his audience (the segue is a simple “by the way”) that Google invented NoSQL.

I hope readers will get a feel for where I’m headed with these comments about Google’s historical role in the creation of two of the very big trends in enterprise computing in late 2014. I’m perplexed as to why Google would, literally, bury this presentation at the very end of the Keynote. Why would Google prefer to cover its pioneering role in these very hot computing trends with a thick fog? Few business decision-makers, if any, will be likely to pierce this veil of obscurity as they search for best-in-class methods of incorporating clusters of servers in a parallel-processing role (in other words, “big data”) to better address the task of analyzing text data scraped from corporate web pages (“NoSQL”).

On the other hand, I’m also impressed by the potential upside Google can realize by removing this fog. Are they likely to move in this direction? I think they are, based upon some of the information they reported to the U.S. SEC in their most recent 10-Q filing, for Q3 2014. Year-over-year, the “Other Revenues” segment of Google’s revenue stream grew by roughly 50%, from $1,230 (in millions) in 2013 to $1,841 in 2014. Any/all revenue Google realizes from Google Cloud and its related components (which, by the way, include Cloud Dataflow) is included in this “Other Revenues” segment of the report. For the nine months ending September 30, 2014, the same revenue segment increased from $3,325 in 2013 to $4,991 in 2014. Pretty impressive stuff, and not likely to diminish with a revamped market message powering “Google at Work”, and Amit Singh (late of Oracle) at the head of the effort.
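The growth claims are easy to check from the figures above (assuming, as the 10-Q presents them, amounts in millions of USD):

```python
# Year-over-year growth in Google's "Other Revenues" (USD millions).
q3_2013, q3_2014 = 1_230, 1_841
nine_mo_2013, nine_mo_2014 = 3_325, 4_991

print(f"Q3 growth: {q3_2014 / q3_2013 - 1:.1%}")                    # 49.7%
print(f"Nine-month growth: {nine_mo_2014 / nine_mo_2013 - 1:.1%}")  # 50.1%
```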

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2014 All Rights Reserved

15 Dec

Who’s losing sleep over NoSQL?

One of the biggest challenges facing product marketing within any business is successfully identifying a market segment. I would argue more businesses fail because they either:

  1. don’t understand their market niche, or
  2. can’t articulate a message intelligible to their market niche.

The next step is to put together a portrait of an ideal prospect within this segment. Over time, if a business is lucky enough to succeed, this portrait will likely change (perhaps “scale” is a better word). After all, early adopters will spread the word to more established prospects. The latter are more conservative, and proceed at a different pace, based upon different triggers.

The three steps I’ve just identified (identify the segment, articulate a message it finds intelligible, and portray the ideal prospect within it) are no less a mandatory path forward for early stage ISVs than they are for restaurants, convenience stores, or any other early stage business.

But a lot of the marketing collateral produced by early stage ISVs offering NoSQL products and solutions, in my opinion, doesn’t signal a successful traverse of this path. In an interview published on December 12, 2014, Bob Wiederhold, CEO of Couchbase, presents the first and second phases of what he refers to as “NoSQL database adoption” by businesses. Wiederhold’s comments are recorded in an article titled Why 2015 will be big for NoSQL databases: Couchbase CEO (http://www.zdnet.com/article/why-2015-will-be-big-for-nosql-databases-couchbase-ceo/).

My issue is with Wiederhold’s depiction of the first adopters of NoSQL databases: “Phase one started in 2008-ish, when you first started to see commercial NoSQL products being available. Phase one is all about grassroots developer adoption. Developers would go home one weekend, and they’ll have heard about NoSQL, they download the free software, install it, start to use it, like it, and bring it into their companies”.

But it’s not likely these developers would have brought the software to their companies unless somebody was losing sleep over some problem. Nobody wants to waste time trying something new simply because it’s new. No insomnia, no burning need to get a good night’s rest, no adoption. What I needed to hear about was just what was causing these early adopters to lose sleep.

I’m familiar with the group of developers Wiederhold portrays in the above quote. I’ve referred to them differently for other software products I’ve marketed. These people are the evangelists who spread the word about a new way of doing something. They are the champions. Any adoption campaign has to target this type of person.

But what’s missing is a portrait of the tough, mission-critical problem driving these people to make their effort with a new, and largely unknown piece of software.

It’s incumbent on Couchbase and its peers to do a better job, in their marketing communications and public relations efforts, of depicting the type of organization with a desperate need for a NoSQL solution.

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2014 All Rights Reserved

11 Dec

The NoSQL notion suffers from some of the same ambiguity plaguing the notion of big data

Readers interested in finding out what NoSQL is all about will benefit from simply developing some familiarity with the definition of the acronym: NoSQL stands for “not only SQL”. I found this definition very helpful, because it corrected my first misunderstanding of the notion. I had thought NoSQL referred to a set of software tools designed to work with text and document databases lacking the columnar table structure their Structured Query Language (SQL) siblings thrive upon.

But my understanding was wrong, which, unfortunately for businesses championing a NoSQL approach, may also be the case for much of the enterprise user segment of the computing market for NoSQL analytics and the tools required to deliver them. MongoDB (http://www.mongodb.com/nosql-explained) is an example of a database built to conform to NoSQL.

But, as the cliché goes, “the best of intentions” can go astray, as is the case, in my opinion, for the MongoDB definition. The average consumer of enterprise computing solutions built to work with social media conversations culled from lots of web pages, probably a chief marketing officer for a popular consumer brand, isn’t likely to be able to understand how “Document databases pair each key with a complex data structure known as a document. Documents can contain many different key-value pairs, or key-array pairs, or even nested documents” (quoted from the MongoDB web page presentation).
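A concrete example may translate that definition better than the prose does. Here is a minimal sketch of such a document in Python; the field names are invented, and the pymongo calls shown in the comments assume a MongoDB server is actually available:

```python
# One "document": key-value pairs, a key-array pair, and a nested document.
customer = {
    "name": "Acme Retail",                  # key-value pair
    "channels": ["twitter", "facebook"],    # key-array pair
    "account": {                            # nested document
        "manager": "J. Doe",
        "region": "Northeast",
    },
}

# With the pymongo driver, storing and querying such a document looks
# roughly like this (requires a MongoDB server on localhost):
#
#   from pymongo import MongoClient
#   collection = MongoClient()["crm"]["customers"]
#   collection.insert_one(customer)
#   collection.find_one({"account.region": "Northeast"})
```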

Further, characterizing the choice facing the enterprise consumer as either “RDBMS” or “non-RDBMS” isn’t going to be helpful if the literal definition of the NoSQL acronym is applied. As MapR points out on its web site, an optimum approach to implementing NoSQL analytics is to combine SQL tools and JSON-based text query tools to digest the same data, which, admittedly, may be incorporated into a MongoDB database, but may have come, originally, from an RDBMS.
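A small sketch can make the “not only SQL” point concrete: JSON documents and plain SQL digesting the same data. The example below uses Python’s sqlite3 module and assumes a SQLite build with the JSON1 functions (standard in recent releases); the table and field names are invented:

```python
import json
import sqlite3

# Rows exported from an RDBMS, stored as JSON documents in a SQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (body TEXT)")

rows_from_rdbms = [
    {"customer": "Acme", "mentions": 42},
    {"customer": "Globex", "mentions": 7},
]
conn.executemany(
    "INSERT INTO docs VALUES (?)",
    [(json.dumps(row),) for row in rows_from_rdbms],
)

# SQL and the document model coexist: filter on a field inside the JSON.
for (body,) in conn.execute(
    "SELECT body FROM docs WHERE json_extract(body, '$.mentions') > 10"
):
    print(body)  # {"customer": "Acme", "mentions": 42}
```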

What’s even more surprising about the page on the MongoDB website is the light it sheds on a programming effort by a much larger, and much more mature, ISV, namely Microsoft: “Graph stores are used to store information about networks, such as social connections. Graph stores include Neo4J and HyperGraphDB”. Hmmm . . . Now “Office Graph”, the foundation of “Delve”, makes a lot more sense.
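For readers unfamiliar with the graph-store category, a minimal sketch of the idea follows. Real graph stores such as Neo4J add persistence, indexing, and traversal languages; the names and connections below are invented:

```python
# A toy graph store: nodes are people, edges are social connections.
connections = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol"},
}

def mutual_connections(a, b):
    """People connected to both a and b -- the sort of traversal an
    Office Graph-style feature runs to surface relevant colleagues."""
    return connections[a] & connections[b]

print(sorted(mutual_connections("alice", "dave")))  # ['bob', 'carol']
```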

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2014 All Rights Reserved

9 Dec

Hadoop attracts support from Microsoft and Intel

The Apache Hadoop project (http://hadoop.apache.org/#What+Is+Apache+Hadoop%3F) “develops open-source software for reliable, scalable, distributed computing” (quoted from the “What is Apache Hadoop?” section of the site). So it makes sense for Microsoft and Intel to enthusiastically support the project. Microsoft is deeply committed to its cloud IaaS effort, Azure (http://www.azure.com), and one of the prime revenue generators for Intel is its Data Center business (http://www.intel.com/content/www/us/en/search.html?keyword=data%20center). Azure and Intel’s Data Center business are both all about lots and lots of computer servers. The former consumes servers, while the latter provides the CPUs driving them.

As I wrote in the previous post to this blog, it’s likely a majority of the enterprise consumer segment of the tech reader community maintains a questionable understanding of the notion of “big data”. But, when the notion is correctly understood, it should not be a stretch for readers to see why the Apache Hadoop project (or its OpenStack (http://www.openstack.org) competitor) is positioned at the very core of this technology trend.

Microsoft and Intel are not the only mature ISVs looking to benefit from big data. IBM and EMC are two other champions with solutions on the market to add value for enterprises looking to implement Hadoop.

Intel ostensibly understands the ambiguity of the notion of “big data”, and the imperative of providing the enterprise business consumer with a clearer understanding of just what this buzzword is really all about. A section of the Intel web site, titled Big Data, What It Is, Why You Should Care, and How Companies Gain Competitive Advantage (http://www.intel.com/content/www/us/en/big-data/big-data-101-animation.html), is an attempt to provide this information.

But Intel’s effort to educate the consumer, in my opinion, sinks into the same swamp as a lot of the hype before it, and never delivers on its promise. The amount of data may be growing exponentially, as the opening of Intel’s short animation on the topic contends, but a lot of mature ISVs (Oracle, IBM, Microsoft, etc.) offer relational database management systems, designed for pricey big-server hardware, which are capable of providing a columnar structure for that data.

Even when “unstructured data” is mentioned, the argument is shaky. There are solutions for enterprise consumers, like Microsoft SharePoint (specifically, the Term Store service), which are designed to provide a method of effectively pouring text data into an RDBMS such as SQL Server: the terms are added to SQL Server and used to tag the text strings identified in unstructured data.
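A simplified sketch of that tagging pattern appears below. To be clear, this is not the actual SharePoint Term Store or SQL Server schema, just the shape of the approach, with invented table and term names:

```python
import sqlite3

# A managed-terms table stands in for SharePoint's Term Store; a tags table
# ties terms to documents, so unstructured text becomes queryable in SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE terms (term TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO terms VALUES (?)",
                 [("invoice",), ("contract",), ("shipment",)])
conn.execute("CREATE TABLE tags (doc_id INTEGER, term TEXT)")

def tag_document(doc_id, text):
    """Record a tag for every managed term that appears in the text."""
    for (term,) in conn.execute("SELECT term FROM terms").fetchall():
        if term in text.lower():
            conn.execute("INSERT INTO tags VALUES (?, ?)", (doc_id, term))

tag_document(1, "Re: the shipment delayed by the missing invoice")
print(conn.execute("SELECT * FROM tags").fetchall())
# e.g. [(1, 'invoice'), (1, 'shipment')]
```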

I am not arguing for the sole use of traditional RDBMSs, with SQL tools, to manage a data universe experiencing exponential growth. Rather, I think big data proponents (and Hadoop champions) need to perform a closer study of the real benefits of clustering servers, and then articulate that message for their enterprise computing audience.

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2014 All Rights Reserved

4 Nov

Microsoft looks at the inevitability of big data

Jason Zander, a Corporate Vice President at Microsoft, opened his segment of the Keynote presentation for Microsoft’s TechEd Europe 2014 (http://channel9.msdn.com/Events/TechEd/Europe/2014/KEY01) with a compelling argument for the inevitability of big data. Zander presented some numbers indicating the global population of smart devices has now surpassed the entire global human population. The number of apps supporting these devices and their users has also grown in geometric proportion. The result is truly big data — an enormous amount of information about each and every touchpoint for devices, users, and even data itself as they interact.

Zander’s rhetorical argument is yet one more articulation of one of the core planks of Microsoft’s 2014 communications brand — productivity. To sum up this theme, readers are asked simply to consider the impact of the “hundreds and hundreds of petabytes of data we already have” on what this writer refers to as the “dawn” of “information opacity”, aka the Samuel Taylor Coleridge phenomenon (“Water, water, every where, / Nor any drop to drink”).

Zander points to the cloud, with Microsoft’s Azure as a leading example, as the only method of processing all of the data produced by the global interaction of users and smart devices. It’s worth noting his mention of telemetry. There will be more said about this category of data, and its relation to the concept of an Internet of Things (IoT), throughout the remainder of the conference.

The presentation then shifts to another core plank of Microsoft’s 2014 communications brand — the slogan first articulated by its CEO, Satya Nadella, and now re-articulated by each and every other spokesperson (including Zander): “Mobile First, Cloud First”. Zander echoes Nadella’s recent comments on the slogan, and pulls in the scalability plank of the market message. Mobile First, he stresses for his audience, requires ISVs like Microsoft to envision consumers in motion, using different devices, at different times, with the objective of accomplishing the same tasks or activities. The only way to satisfy this need for a uniform computing experience is to deliver the same quality across any and all device form factors. Nothing less will do.

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2014 All Rights Reserved

31 Oct

Big Data, Business Intelligence, and Predictive Analytics all make up a segment of the Productivity theme of computing in late 2014

An important segment of the productivity theme for computing in the fall of 2014 is composed of big data, business intelligence, and predictive analytics. Early stage ISVs with solutions addressing one of these three high-demand needs will benefit by crafting market messages around the same productivity theme articulated by their more mature ISV siblings.

Here’s why:

Big Data

The information filtering implicit in the productivity theme, as this writer presented in a prior post to this blog, is a mission-critical component of the complete solution. So big data methods of collecting, storing, categorizing, and, ultimately, processing information are invaluable to a successful effort to enhance productivity for the entire computing “ecosystem”, from individual user to collections of organizations.

Business Intelligence (BI)

The BI toolset provides the user interface for the same range of computing users (meaning from individuals to sets of organizations) to depict the comparative importance of segments of information and, subsequently, to assimilate it. The charts and other dashboard elements typical of BI presentations render the information into a form users can easily understand. This information, in turn, provides users with a basis for action, as required.

Predictive Analytics (PA)

Machine learning is a popular term, widely used by players in the productivity market, and it can be applied to the PA computing task. But PA can also be performed manually by users. The objective of PA is consistently expressed across most productivity messaging: an effort to heighten the value of computing activity and, ultimately, to increase the return on investment it produces.

The above points are merely suggestions for how an early stage ISV with a solution in one, or all three, of the portions of this brand segment might choose to articulate a message. If you would like to hear more about how your business might benefit by building your brand within the context of the productivity theme articulated by each of the major ISVs, please don’t hesitate to contact us. We would be eager to learn more about what you are after. As well, we pursue opportunities to contribute to the success of this kind of marketing communications effort on a consulting basis.

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2014 All Rights Reserved

15 Oct

ISVs debut cloud, SaaS solutions to satisfy consumer appetite for Analytics and Data

On Monday, October 13, 2014, Salesforce.com announced the debut of a new cloud SaaS solution named “Wave” (https://www.salesforce.com/company/news-press/press-releases/2014/10/141013.jsp). Back on September 16, 2014, IBM had announced “Watson Analytics”, also a cloud SaaS solution, but, this time, a freemium offer. So it’s safe to say analytics for the masses has become new competitive ground on which big, mature ISVs contend for more market share.

A couple of points are worth noting about the Salesforce.com press release:

  1. GE Capital is mentioned as already using Wave. Given GE’s own recent PR campaign around its own data and analytics effort, one must wonder why the business finance component of the company opted not to use the home-grown solution ostensibly available to it.
  2. Informatica is mentioned as an “ecosystem” partner for Wave, and released its own press release, titled Informatica Cloud Powers Wave, the Salesforce Analytics Cloud, to Break Down Big Data Challenges and Deliver Insights (http://www.marketwatch.com/story/informatica-cloud-powers-wave-the-salesforce-analytics-cloud-to-break-down-big-data-challenges-and-deliver-insights-2014-10-13).

The Wave announcement follows, by less than a month, IBM’s announcement of a freemium offer for “Watson Analytics”, and Oracle’s announcement of its “Analytics Cloud”. Both of these offers are delivered via a cloud SaaS model. So it’s likely safe to say enterprise technology consumers have demonstrated a significant appetite for analytics. The decision by Salesforce.com, IBM, and Oracle to deliver their solutions via cloud SaaS offers speaks to the new enterprise computing topology (a heterogeneous computing environment) and the need to treat browsers as the ideal thin clients for users working with their data online.

An ample supply of structured and unstructured data is likely motivating these enterprise tech consumers to look for methods of producing the kinds of dashboards and graphs each of these analytics offers is capable of producing. With data collection methods advancing, particularly for big data (unstructured data), this appetite doesn’t look likely to abate anytime soon.

ISVs with solutions already available, principally Microsoft with its suite of Power tools for Excel (Power BI, PowerPivot, etc.), may also be participating in this “feeding frenzy”. It will be interesting to see how each of the ISVs with offers for this market fares over the next few business quarters.

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2014 All Rights Reserved

14 Oct

General Electric Steps Into Big Data and Analytics

October 8 and 9, 2014 were a very busy two days for the Public Relations team at General Electric (http://www.genewsroom.com/). No less than four press releases were published about the first steps this very mature — not to mention very large — business has taken into big data and analytics.

Consider, for example, how the big data and analytics business at General Electric ramped up to over $1 billion in sales. On October 9, 2014, Bloomberg published an article written by Richard Clough, titled GE Sees Fourfold Rise in Sales From Industrial Internet (http://www.bloomberg.com/news/2014-10-09/ge-sees-1-billion-in-sales-from-industrial-internet.html). Clough reports “[r]evenue [attributed to analytics and data collection] is headed to about $1.1 billion this year from the analytics operations as the backlog has swelled to $1.3 billion”.

Early stage ISVs looking with envy at this lightning-fast entry should consider how scale, a decision to acquire IP via partnerships and acquisitions (rather than opting to build it in-house), and picking the right market made this emerging success story a reality. Let’s consider these three points in reverse order:

  1. Picking the right market: GE opted to apply its new tech to a set of markets loosely collected into something they call the “Industrial Internet”. These markets include Energy (exploration, production, distribution), Transportation, Healthcare, Manufacturing and Machinery. Choosing these markets makes complete sense. GE is a leader in each of these already. Why not apply new tech to old familiar stomping grounds?
  2. Leverage partnerships and acquisitions to come to market, in lieu of rolling your own: Leading players in each of the markets GE opted to enter expressed burning needs for better security and better insight. Other players in each of these markets (Cisco, Symantec, Stanford University, and UC Berkeley) all stand to benefit from the core tech GE brings to the table, so persuading them to partner was likely a comparatively easy task. The most prominent segment of the tech (very promising security tech for industrial, high-speed data communications over TCP/IP Ethernet networks), understandably, came into the package from Wurldtech, a business GE opted to acquire.
  3. Scale: With GE’s production run rate for turbines, locomotive engines, jet engines, and other complex, massive industrial machinery, the task of finding a home for the millions of industrial sensors required to feed the analytics piece of the tech with the big data it desperately needs does not look to have been a difficult one. Product management, appropriately, looked into its own backyard to find the consumers required to ramp up to scale very quickly.

In sum, GE’s entry into this market, if the rubber hits the road and metrics bear out claims, looks to be a case study early stage ISVs should memorize as they plan their tech marketing strategy.

Ira Michael Blonder

© IMB Enterprises, Inc. & Ira Michael Blonder, 2014 All Rights Reserved