The current cycle of innovation, and especially data innovation, has been romanticized and lampooned in television series (Silicon Valley), in business books, and in movies.  The terms used are now familiar to most of us:  “MVP” is not only the best player on the ball team.  “Early Exit” is not only when you leave a boring party early.  Rounds A, B, and C are not only stages in a darts tournament.  There are good reasons this terminology is slipping into our vocabulary and into our popular culture:  life-changing innovation is taking place; very visible wealth is being created; new companies are changing business history and, in some cases, world history.  The message is clear:  experiment, take a risk, take a chance, and you could be rewarded.  Just look at the Teslas outside Tamarine.  A key component of the culture is prototyping and experimentation.  Corporations on the “buy” side see the value in these approaches and have tried to emulate these behaviors, with some success.  The Cloud can help us all get to a better place around proofs of concept and prototyping, and there are some clear steps to get there.

On the “buy” side, the terminology seems almost as if it is rooted in military operations:  solutions are “mission critical” and need to be “battle tested” before they are “deployed globally.”  We conduct “tollgates” to make absolutely sure no “defects” make it past the gatekeepers; our procurement departments make sure that our suppliers are “fully vetted.”  We are in this for the long term, and the solutions we implement take root foundationally and add value for years, which is a good thing, because the platforms cost tens of millions of dollars to build, implement, and support given current approaches.  But they do work, and they do provide controlled and judicious advancement of the goals of the company, with limited risk and solid integration.  Let those crazy kids in the start-ups have their parties; we have (possibly) less worry about whether our next paycheck will arrive on time, and we have preserved our information assets, thank goodness!

Both cultures exist for a reason:  free thinking, Open Source, controlled chaos, and freewheeling venture capital move innovation forward; disciplined IT professionals keep the data dial tone of the corporate world from going dead at the worst possible moment.  Reality is grayer:  even large companies have always sought innovative solutions and taken risks to move forward.  And there is an intrinsic discipline to innovation which belies the freewheeling culture of startups.  Large financial services firms have “innovation labs” everywhere from Singapore to Silicon Valley to Bangalore to try to pick up new ideas.  Open source adoption is widespread even at the most conservative corporations.

Despite the progress which has been made to increase the velocity of innovation in enterprises on the buy side, most corporate types I speak with still seem to be living lives of quiet desperation:  wrangling giant offshore teams, unable to move fast enough to meet the velocity of their business, burdened by billions of lines of legacy code, and hearing the footsteps of the regulators, procurement, and their competitors getting closer and closer.  Let’s just agree that there is still a problem to be solved here.  We are still not moving fast enough, cheaply enough, or effectively enough.  Let’s explore ways to solve this, at least for data management solutions.

Are there better ways to engage innovation, and are there any new developments which make innovation easier to adopt?  The key is combining the need for corporate innovation with the elasticity and flexibility offered by the Cloud and lower-cost resources.  Why would this work, how is it different from what we were doing, and what are the barriers to adoption?

Wait, why do we need a new way to do Pilots?  Haven’t we been doing them?  Let’s look at a typical corporate Pilot from a slightly cynical though realistic perspective.  Usually, after a lengthy period seeking approval, this work is done by internal resources who are in an expensive location and who are not expert on the new platform, using internal infrastructure which is slow to come online and for which the inter-departmental charges are high.  The innovative vendor’s resources who are expert on the platform are spread too thin and are too expensive to be around for long.  The procurement-approved offshore systems integrator is not expert on the new platform either, but will do their best to ramp up, since procurement has mandated that they are the only choice for additional resources, not the boutique vendor who actually knows the platform.  Because the selected platform vendor is well-established, they have no reason to give their services away for free.

Even with all of these drawbacks, when executed properly, the Pilot can help the business move forward by inspiring confidence in the business, allowing tech to learn how to manage the platform, and allowing the gaps in functionality to be seen.  But if you are running one of these projects, or your organization is responsible for one of them, the reputational investment, dollar investment, and time involved in the traditional POC/Pilot approach mean you get one chance to get it right.  No pressure or anything!  Just get the Pilot for the new capability with the new platform exactly and perfectly correct, or suffer the consequences.

And the pressure is increasing to at least “kick the tires” on new data approaches, because the evidence is mounting that new approaches using open source, big data, data science, and operational analytics can yield significant advantages over the legacy platforms used today.  To do the proper experimentation on premises, many hurdles need to be addressed.  Much of the unstructured and semi-structured data used in these platforms comes from public and cloud-based sources, has high variety and volume, and cannot be managed with the platforms and tools on your current approved vendors list.  Convincing your infrastructure team to install a leading-edge open source platform which has not yet been fully vetted and approved can be a very long conversation.

Here is one pathway to more innovation: 

  1. Do it in the cloud: The elasticity of the cloud works perfectly for Pilots and POCs, where you are not sure how much storage you will need, the server configuration for the new platform is not your corporate standard, and the surrounding systems and data will need to be adjusted to make the integration work.  Use data obfuscation to alleviate (often unfounded) worries about security in the cloud, or just sit down with your CISO and work through the concerns.
  2. Do it with a boutique vendor: For every Big Data, Data Science, or IoT platform, there are smaller suppliers who would love to help you.  They have the resources you need, and are willing to work with you at a low cost, typically leveraging offshore resources.
  3. Do Risk Assessment Differently and Measure Success Differently: If the above steps drive the cost of the Pilot down from $1MM to $100K, should the risk assessment of project failure be performed the same way?  You could fail five times at $100K each, spend $500K in total, learn a lot more about the platform, and still save half a million dollars against the $1MM Pilot.
  4. Minimize on-premises technical resources: I know, your team has to learn this.  But wait until you prove that this solution works and inspire business confidence before you start that step.  Involve your resources in the management of the solution, not in the development, at least not during the Pilot/POC phase.
  5. Push the Envelope and Innovate: Now that you have created an idea factory, at a low cost, use it to push the envelope and truly innovate.  Talk to the business about what they want beyond the basics, and try to bake some of that into the solution.
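
The data obfuscation mentioned in step 1 can be sketched in a few lines of Python.  This is a minimal illustration under stated assumptions, not a vetted security design:  the field names, the sample record, and the key are hypothetical, and the keyed-hash (HMAC) approach is just one common way to pseudonymize values before they leave the building.  Your CISO may well prefer a different technique.

```python
import hashlib
import hmac

# Hypothetical secret that stays on premises and is never uploaded with the data.
SECRET_KEY = b"replace-with-a-key-held-on-premises"

# Hypothetical names of the columns treated as sensitive in this example.
SENSITIVE_FIELDS = {"customer_name", "account_number"}


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed, deterministic token that stands in for a sensitive value."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Keep the first 16 hex characters to make tokens shorter to store and read.
    return digest[:16]


def obfuscate_record(record: dict) -> dict:
    """Copy a record, replacing only the sensitive fields with tokens."""
    return {
        field: pseudonymize(str(value)) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }


if __name__ == "__main__":
    # Made-up sample row, for illustration only.
    row = {"customer_name": "Jane Doe", "account_number": "12345678", "balance": 1050.25}
    print(obfuscate_record(row))
```

Because the same input always yields the same token, joins across obfuscated tables still work in the cloud copy, while the key needed to recompute the tokens never leaves your own environment.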

So, if that is how you “do it”, what is it that you should “do?”  Here are some Big Data POCs/Pilots which you can do, to foster innovation.  Note that these are selected examples but there are actually many more possibilities, only limited by your business needs and imagination: 

  • Data Warehouse Staging:  The most typical style of staging is relational tables on the RDBMS platform you are already using, such as Oracle, Teradata, or DB2.  If you place these staging areas on some variant of HDFS, many good things happen:  your cost of RDBMS licenses goes down; you get to keep the image of the inbound data forever, which can have exquisite payback from a compliance/audit perspective; and your staging area is now massively parallel at low cost and will perform better.  POC:  Take a small segment of your inbound data and port it to HDFS.
  • Data Appliance Takeout: Oracle and SAP have done a superb job of engineering the Exadata and HANA platforms and placing pre-tuned applications on top of the stack (examples:  Hyperion Planning; SAP Business Warehouse).  Even so, your implementation may be costly and “inelastic” (unable to expand easily without significant additional cost), which can be a concern as you try to expand utilization.  Amazon Web Services has a version of a data appliance available today for about $1,000 per TB per year.  Of course, it does not come pre-loaded with Hyperion or SAP BW, but the price point might open up other possibilities.  POC:  Port a portion of the current data appliance functionality to Amazon Redshift.
  • ODS Analysis/Event/Transaction Data Science: How many times has someone in your organization said “our information is our biggest asset?”  They are right, and this is never truer than when looking at the actual operational data for your enterprise.    POC:  Dump transactional events into HDFS, and use contractor data scientists who know your business but who are not embedded in your culture (read: mess) to uncover new insights.
  • Structured/Unstructured Mash Up: Call Center Platforms are currently largely blind to consumer sentiment and only serve up dry customer facts.  What your rep really wants is to offer a memorable customer experience, which is virtually impossible without knowing what mood and mode the customer is in.  By applying sentiment analysis to prior calls, you can predict the mood and disposition of the customer, both as an archetype and on an individual basis.  This can send customer satisfaction scores through the roof and inform all aspects of customer contact.  POC:  Take call center conversations, dump them in HDFS, do speech-to-text conversion, do natural language processing and sentiment analysis, and offer this as a trial application synchronized with the call center platform for a small segment of the customer base.
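
To make the last POC a little more concrete, here is a rough sketch of its final stage:  scoring transcripts that have already been converted to text and rolling them up into a mood per customer.  Everything in it is an assumption made for illustration.  The word lists are tiny stand-ins for a real natural language processing model, the function names are hypothetical, and the speech-to-text and HDFS steps are left out entirely.

```python
from collections import defaultdict

# Tiny illustrative word lists; a real pilot would use a trained NLP model instead.
POSITIVE = {"thanks", "great", "helpful", "happy", "resolved"}
NEGATIVE = {"angry", "cancel", "frustrated", "terrible", "waiting"}


def score_transcript(text: str) -> int:
    """Count positive words minus negative words in one call transcript."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def customer_mood(calls: list[tuple[str, str]]) -> dict[str, str]:
    """Average each customer's prior call scores and label the overall mood."""
    scores = defaultdict(list)
    for customer_id, transcript in calls:
        scores[customer_id].append(score_transcript(transcript))

    moods = {}
    for customer_id, values in scores.items():
        average = sum(values) / len(values)
        if average > 0:
            moods[customer_id] = "positive"
        elif average < 0:
            moods[customer_id] = "negative"
        else:
            moods[customer_id] = "neutral"
    return moods


if __name__ == "__main__":
    # Made-up (customer id, transcript) pairs, for illustration only.
    prior_calls = [
        ("c1", "Thanks, that was great and helpful!"),
        ("c2", "I am frustrated and angry, still waiting."),
        ("c3", "Please update my address."),
    ]
    print(customer_mood(prior_calls))
```

In a trial application, the label for the caller on the line is what would be surfaced to the rep alongside the usual customer facts.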

There is a better way to conduct a POC or a Pilot, one which is more direct, more cost-effective, and most importantly brings a real innovation cycle into your enterprise.  This process will bring a little bit of start-up culture to your enterprise, and while it might not put a Tesla in your driveway, it could make you feel like the cultural divide between corporate IT and Silicon Valley start-ups just got a lot smaller.  This is not to imply that the reliable, fault-tolerant, optimized corporate on-premises data center might not be the best home for the app once it moves beyond the POC/Pilot stage.  It is just that the on-premises data center might not be the best first place to foster innovation.

Elevondata (www.elevondata.com) is a leading-edge data management advisory and data lake solutions company which is well positioned to help with the POCs and Pilots listed above.  Vin Siegfried is one of the founders and can be reached at vsiegfried@elevondata.com.

Author Vin Siegfried with Rohit Tandon, Steve Rycroft, and Dan Meers