Saturday, October 5, 2013

Tableau Friction — Into the Future

It's several weeks now since TCC 2013 in Washington, D.C. and I haven't posted a thing. Not for a lack of topics—I left immediately after the conference for a sailing trip in the Cyclades, where I have very limited internet access. This post is a short placeholder until I get back in mid-October.

About TCC 2013

I loved TCC 2013, got to reconnect with many people I know and respect, learned a lot, and wished only that I could be in more than one place at a time to take it all in.

Tableau Software and the Tableau community keep moving the wave front forward. The data analytical world is a better place because of Tableau, and Tableau is better because of the contributions of everyone who puts so much time, effort, energy, and brilliance into exploring and understanding it, and in extending a welcome and helping hand to all.

From what we've seen of the new 8.1 and 8.2 features, Tableau isn't sitting still. The universe of possibility continues to expand.

The future of Tableau Friction.

This blog started out as a way for me to keep track of Tableau's idiosyncrasies so that I could have a handy reference for them, and for workarounds where they exist.

My goals have broadened.

In the seven years I've been using Tableau I've watched it grow from an innovative disrupter into a powerful shaper of a new way of thinking about data analysis. Tableau has been the biggest agent of change in the data analytical landscape—the relationship between people and their data, and between people and those to whom they need to communicate data-based insights and information, is much different than it was ten years ago. More importantly, Tableau has led the way in changing the mindscape of people who have data to understand, particularly in corporate environments where the simple, straightforward activity of data analysis had become hostage to too-large, too-cumbersome, too-industrial processes that actively impeded obtaining information and insights from data in its native context.

One of the consequences of Tableau's success is that the door to data analytical innovation is wide open. Tableau introduced a new way of analyzing data, one that made it simple and easy for people without database and programming skills to access and analyze their information. Now that the idea that data analysis should be simple and easy has taken root and flourished, more people are exploring ways to broaden the reach of data analysis, to new types of data, new types of analytical functionality, and new types of analytical modeling and visualization. The data analysis horse is out of the barn. I intend to explore and cover as much of the ongoing innovation as I can, with an eye towards how it intersects with Tableau's realm.

Continuing coverage of Tableau's friction points.

I will continue to point out those aspects of Tableau that don't work as well as they should, from atomic point problems to systemic design flaws—including where a lack of coherent design impedes functionality and usability. These things are within the original Tableau Friction horizon. There's quite a queue of these topics; the challenge will be to identify the individual friction points, how they relate to one another, and how they reflect higher level organizational characteristics.

As Tableau has grown it really does appear that it's been extended by cobbling together new features in relative isolation from one another, sometimes by employing existing machinery for new purposes, other times by developing entirely new mechanisms. On the whole, there's a lack of coherence across the full Tableau functional spectrum, and this incoherence is itself a major source of friction in the product, with passive and active consequences that impede people in their data analysis.

The evolving universe of Machine Assisted Data Analysis (MADA).

When Tableau was invented the machine-stored data universe was dominated by a particular paradigm, one whose dominance was so comprehensive that a whole generation of people knew only it and assumed that it was the one, the only true way to store data. I refer of course to the relational model. And not to Ted Codd's relational model, but to the more general, degenerate concept of data stored in tables of rows and columns. The hows and whys of this model's rise to dominance are interesting and informative, but tangential to this post (and I'm sorely tempted to go off on them); the upshot is that Tableau was designed (I believe) to access tabular information, with the limited exception of analytically less-flexible access to "non-relational" data sources. This initial design space has, like all paradigm-bound endeavors, limitations and boundaries that may be hard or soft horizons. OK, I hear you thinking: "What? Nice sentence, but does it mean anything to me?" Fair enough. Put another way, Tableau's original design around accessing tabular data, with its particular visual analytical semantics, may make it unable to adapt to the analysis requirements of other, non-tabular, types of machine-stored data.

Tableau is already limited by its inability to represent and analyze non-tabular, non-flat data, and has been for quite some time. This is a bold claim, and controversial in some quarters. But I hold it to be true. There has long been a need within Tableau for data analysts and dashboard authors to be able to represent non-relational data, e.g. the different components of their workbooks, in ways that make it easy to see, understand, and manage them. To cite just one example: the difficulty of understanding the relationship between data sources and worksheets has always existed and was exacerbated by the introduction of global filters, which benefited and vexed us, and which have a long train of follow-on effects. The main reason I created TWIS, and (I think) Andy Cotgreave created his TWB Auditor, was to answer these and related questions. The main problem with these solutions is that they exist outside of Tableau, and even though it's possible to create relational data sets covering all of the relationships between workbook elements so that Tableau can be used to analyze them, there is a prohibitively large number of individual tables required—one for each distinct path in the relationship graph.
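
Since Tableau workbooks are XML files (more on that below), these relationships can be read straight out of the workbook. Here's a minimal sketch, in the spirit of what TWIS and TWB Auditor do, that maps each worksheet to the data sources it appears to depend on. The element names are assumptions based on my inspection of workbook XML and may differ across Tableau versions, and 'MyWorkbook.twb' is just a placeholder.

```python
# A minimal sketch of the kind of workbook introspection TWIS does:
# map each worksheet in a .twb file to the data sources it depends on.
# The element names below (datasource, worksheet, datasource-dependencies)
# are assumptions from reading workbook XML and may vary across versions.
import xml.etree.ElementTree as ET

def worksheet_datasources(twb_path):
    root = ET.parse(twb_path).getroot()

    # Friendly captions for each data source, keyed by internal name.
    captions = {
        ds.get('name'): ds.get('caption') or ds.get('name')
        for ds in root.iter('datasource')
        if ds.get('name')
    }

    usage = {}
    for ws in root.iter('worksheet'):
        deps = [
            captions.get(dep.get('datasource'), dep.get('datasource'))
            for dep in ws.iter('datasource-dependencies')
        ]
        usage[ws.get('name')] = sorted(set(d for d in deps if d))
    return usage

if __name__ == '__main__':
    for sheet, sources in worksheet_datasources('MyWorkbook.twb').items():
        print(sheet, '->', ', '.join(sources) or '(none found)')
```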

The post-relational MADA universe.

The future is here, and always has been.

The relational model for storing data was preceded by other models that combined the structure and content of data. The relational model became the dominant approach because of a convergence of factors, none of which was that it is a superior way to store data so that humans could make sense of it. In fact, from the perspective of real human people wanting to understand their data the relational model was, and is, a curse and a pox and an abject failure.

Fortunately, non-relational data is making a comeback. The real question for the purposes of this blog is: "Will Tableau adapt and become the best tool for helping make sense of all this new data?" Time will tell.

Non-relational data.

Data is increasingly found in a wide variety of non-relational forms, from XML files to (mostly) web-related structures such as JSON and YAML, to noSQL and big data technologies. Tableau workbooks are XML files. Tableau Server uses YAML internally. JSON is rapidly becoming a significant, even leading data format on the web. noSQL data is becoming more and more prominent.

The common basic characteristic of the non-relational forms is that they accommodate structural relationships in the data, from the simplest two level master-detail hierarchies to arbitrarily complex networks. The great benefit from the analytical point of view is that it's possible to model data naturally, with the structure and content reflecting real-world relationships.

If Tableau is to evolve into a tool that has a place in the post-relational MADA universe it will need to provide a way to model, represent, and analyze post-relational data as clearly and easily as it has relational data. The good news is that Tableau's inventors cracked a tough nut once in bringing a visual interface to what had previously been a programmatic or mechanical endeavor, that Tableau is recognized as the leader in post-Big BI, and that Tableau has demonstrated the ability to gather the resources necessary to be in the game for the long term.

I'm preparing a post on the characteristics of the first step up from the relational model—a master-detail hierarchical relationship, specifically of an organization's departments and the employees working in each. This simple example introduces many of the concepts any tool will need to express in order to successfully help people access and understand their data.
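
To make the master-detail idea concrete ahead of that post, here's a minimal sketch with made-up departments and employees, showing the same information modeled naturally as a hierarchy and then flattened into the rows-and-columns form a tables-only tool expects:

```python
# Made-up departments and their employees, modeled as a natural
# master-detail hierarchy: each department owns a list of employees.
departments = [
    {"department": "Sales",
     "employees": [{"name": "Alice", "salary": 6200},
                   {"name": "Bob",   "salary": 4800}]},
    {"department": "Engineering",
     "employees": [{"name": "Carol", "salary": 7500}]},
    {"department": "Facilities",   # a master record with no detail rows
     "employees": []},
]

# Flattening the hierarchy into the single rows-and-columns table a
# relational-only tool expects. Note what happens to Facilities: it
# either vanishes from the table or needs a padded, employee-less row.
rows = [
    {"department": d["department"], "employee": e["name"], "salary": e["salary"]}
    for d in departments
    for e in d["employees"]
]

for row in rows:
    print(row)
```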

I also plan on extending these concepts and principles to higher level data structures. There are some tough problems to solve – one way to look at the situation is that the world isn't exactly awash in highly effective visual XML editors, and XML has been with us for quite some time now.

Evolving technologies.

There is a highly vibrant, even aggressive, evolution going on in data analysis technologies. The most visible of these leverage web-associated technologies, with JavaScript-based implementations manipulating SVG and the HTML5 canvas out in front of the parade. Examples include D3.js, Raphaël, and Crossfilter.

The best of these technologies can, in the hands of someone appropriately skilled in their programmatic use and in analytical information design, produce data visualizations equal in quality to Tableau's. This might sound like heresy, but it's prudent to remember that this has always been possible, and that Tableau's great gift was removing the need for programming expertise from the analytical process.

Technologies vs tools.

As the new technologies evolve they're gaining higher and higher levels of abstraction, bringing the data analytical operations closer and closer to the surface. It's really only a matter of time until the final step is taken and tools appear that meld the technologies' capabilities with human-oriented interfaces, resulting in the next generation of data analytical tools. In this scenario these new tools will be the next wave of disrupters, particularly given that they're starting out with an intrinsic understanding of post-relational data. That will make the development of the tooling less a conceptual hurdle that needs to overcome an entrenched paradigm and more a pure design problem centered on finding an effective visual language for representing the essential data analytical operations.

New tools.

Tableau and its cousins have spawned a host of imitators. Companies are springing up striving to be the next great thing in rapid, effective data visualization. Many of them have been founded by hungry young people who believe that they're onto the next great idea, and some of them have very good ideas indeed. Tableau has, by virtue of its own success in leading the creation of an entire new product space, demonstrated that it's possible to bring a good idea to life and, with hard work and luck, upend the conventional wisdom and make a real mark.

I'll cover some of these new tools' features that I think are interesting and valuable, and will relate them to Tableau's way of doing things whenever possible. A couple of features that I've seen recently bear mentioning because they highlight valuable data-analytical functionality that isn't in Tableau's wheelhouse.

The first is something everybody wants: search. Simple, plain, ordinary search for all of the data that contains a word or phrase. One new product provides a simple Google-like text input area into which the user can type whatever they want, and the data will be filtered to contain only those records containing the text entered. The wrinkle is that the user doesn't need to specify which fields will be examined; by default all of the fields in the current context will be evaluated to determine if they contain the content. This is a really impressive feature. Yes, there are a whole host of background things that come into play, things that matter and need to be addressed, but right up front the user doesn't need to know about or deal with them. And isn't that the whole point?
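
My guess at what's happening behind that input box (hedged, since I don't know the product's actual implementation) is a filter that tests every field of every record for the typed text, something like this sketch:

```python
# A minimal sketch of field-agnostic search: keep only the records in
# which ANY field contains the typed text (case-insensitive). This is a
# guess at the behavior, not the product's actual implementation.
def search(records, text):
    needle = text.lower()
    return [
        rec for rec in records
        if any(needle in str(value).lower() for value in rec.values())
    ]

# Made-up sample data for illustration.
records = [
    {"employee": "Alice", "department": "Sales", "city": "Austin"},
    {"employee": "Bob", "department": "Engineering", "city": "Boston"},
]

print(search(records, "bos"))   # matches Bob via the city field
```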

The second interesting feature is natural language recognition. One product provides, again, a simple text input in the UI into which the user can type simple English that the tool can interpret as meaningful data analysis instructions, like "show employees with salary over 5,000". Watching someone see this for the first time, and seeing it work, is interesting – the machine is responding to them in a way that they find easy and natural. It's worth noting that this isn't really a new feature: there have been commercially successful English-based data analytical languages for decades, since at least 1970. The difference now is that the use of the query language is, or can be, integrated into a tool that presents the opportunity to use the language as an element of and adjunct to the analytical process, not as the whole shebang.
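
As a toy illustration of the idea (nothing like the real products' parsers), here's a sketch that recognizes just that one sentence shape and turns it into a filter:

```python
# A toy sketch of natural language querying: recognize one sentence
# shape, "show <records> with <field> over <number>", and turn it into
# a filter over a list of dictionaries. Purely illustrative.
import re

PATTERN = re.compile(r"show \w+ with (?P<field>\w+) over (?P<value>[\d,]+)",
                     re.IGNORECASE)

def answer(question, records):
    match = PATTERN.search(question)
    if not match:
        raise ValueError("I only understand one sentence shape so far.")
    field = match.group("field")
    threshold = float(match.group("value").replace(",", ""))
    return [rec for rec in records if rec.get(field, 0) > threshold]

employees = [
    {"name": "Alice", "salary": 6200},   # made-up rows for illustration
    {"name": "Bob", "salary": 4800},
]

print(answer("show employees with salary over 5,000", employees))
# -> [{'name': 'Alice', 'salary': 6200}]
```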

Wrap up.

The universe of machine assisted data analytical possibilities is expanding, and it's expanding at an accelerating rate. There hasn't been this much fertile activity in, well, forever. As the universe expands, the opportunities to provide the best possible tool for helping people access and understand their data grow in parallel.

Tableau is in the catbird seat. The future is wide open.

 
