Another Challenge of Big Data Analytics: Data Leak and Spill (Part 1)

by Rajesh Rengarethinam, Senior Software Engineer Manager at NextLabs

In their recent article on Big Data Management and Trends, Gartner identifies Enterprise Data as one of the key challenges facing organizations. The challenge is consolidating data from disparate sources across the extended enterprise and transforming it into critical business intelligence.

“You have many disparate data sources – from your enterprise’s ‘dark data’ and partner, employee, customer and supplier data to public, commercial and social media data – that you need to link and exploit to its fullest value.”
source: http://www.gartner.com/technology/topics/big-data.jsp

The extended enterprise comprises disparate data sets across heterogeneous applications and devices. How can organizations harness all these data sources in one centralized analytics engine? Commercial off-the-shelf (COTS) applications claim to do precisely this. One example is SAP Business Warehouse, which aggregates data from SAP and other applications and allows users to run comprehensive reports.

However, a different and far less discussed challenge looms as organizations rush toward big data analytics: how do organizations control access to reporting data once it is mined from disparate applications and devices? After all, the same compliance regulations and corporate governance policies should apply to data no matter where it is consumed: in enterprise applications, on partner networks, or after it is consolidated in analytics tools and displayed in reporting interfaces.

The technical challenge is harder than it seems. The rules that govern how data should be accessed, shared, and used are always embedded in the business context of the applications where that data originates. When data is mined and aggregated, this critical business context is left behind. How do organizations know how data should be controlled, especially when data is mined from across the extended enterprise?

For instance, assume that a business object is classified as restricted (due to an export compliance or other regulation) in the application where it was created and stored (say, SAP). Controls can be instrumented in that application to ensure proper access and usage. However, when that data is aggregated into reporting and analytics tools, how do you identify restricted data? Do data-level classifications persist from the originating application? Or is data stripped of crucial business context when it is mined and aggregated?
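As a rough illustration of how that context can get lost, here is a minimal Python sketch of a naive aggregation step. The record fields (`plant`, `revenue`, `classification`) and the label values are hypothetical and are not taken from SAP or any particular analytics product.

```python
from collections import defaultdict

# Hypothetical source records; field names and labels are illustrative only.
source_records = [
    {"plant": "A", "revenue": 100, "classification": "restricted"},
    {"plant": "A", "revenue": 50, "classification": "public"},
    {"plant": "B", "revenue": 70, "classification": "public"},
]


def naive_aggregate(records):
    """Sum revenue per plant, keeping only the fields the report needs."""
    totals = defaultdict(int)
    for record in records:
        totals[record["plant"]] += record["revenue"]
    return [
        {"plant": plant, "revenue": total}
        for plant, total in sorted(totals.items())
    ]


report = naive_aggregate(source_records)

# No row carries a "classification" field any more, so the reporting
# layer cannot tell that plant A's total includes restricted data.
has_labels = any("classification" in row for row in report)
```

The aggregation itself is correct, yet the report rows no longer say anything about how the underlying data should be controlled.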


To make the problem worse, most analytics tools allow users to export ad hoc reports to PDF or Excel format, so sensitive and restricted data can be widely distributed. In other words, the problem goes beyond the set of users who have access to your reporting and analytics applications. Sensitive data can go into these analytics tools undetected, then be exported for broad distribution.

While many analytics tools have basic access controls, it is unclear whether they are robust enough to address this challenge. An effective solution would need to be able to:

  • Retain original data-level classification information when data is mined and aggregated from disparate applications and sources.
  • Block access, filter report views, or block export of report information, based on data-level classifications.
  • Apply rights protection to exported report files based on the data they contain, so files will be distributed and accessed in accordance with rules and regulations.
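The first two requirements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the label names, their ordering, the rule that an aggregate inherits the most restrictive label of its inputs, and the helper names (`aggregate_with_labels`, `filter_view`, `can_export`) are all hypothetical. Rights protection on exported files depends on a rights management product and is not shown.

```python
# Hypothetical label ordering, from least to most restrictive.
LEVELS = {"public": 0, "internal": 1, "restricted": 2}


def aggregate_with_labels(records):
    """Sum revenue per plant; each total inherits the most restrictive input label.

    The "most restrictive label wins" rule is an assumption for this sketch.
    """
    report = {}
    for record in records:
        row = report.setdefault(
            record["plant"],
            {"plant": record["plant"], "revenue": 0, "classification": "public"},
        )
        row["revenue"] += record["revenue"]
        if LEVELS[record["classification"]] > LEVELS[row["classification"]]:
            row["classification"] = record["classification"]
    return [report[plant] for plant in sorted(report)]


def filter_view(report, clearance):
    """Return only the rows that the user's clearance permits."""
    return [
        row for row in report
        if LEVELS[row["classification"]] <= LEVELS[clearance]
    ]


def can_export(report, clearance):
    """Allow export only if every row is within the user's clearance."""
    return all(
        LEVELS[row["classification"]] <= LEVELS[clearance] for row in report
    )


# Hypothetical source records; field names are illustrative only.
records = [
    {"plant": "A", "revenue": 100, "classification": "restricted"},
    {"plant": "A", "revenue": 50, "classification": "public"},
    {"plant": "B", "revenue": 70, "classification": "public"},
]

report = aggregate_with_labels(records)
internal_view = filter_view(report, "internal")
```

Here a user with `internal` clearance sees only plant B's row, and `can_export(report, "internal")` is false because the full report still contains a restricted total.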

In part 2 of this series, we take a closer look at each of these requirements.
