Story and Big Data

Three Ways to Avoid a Data Doomsday

Big data is a double-edged sword. Companies have never been more empowered to track anything and everything. The problem?

Organizations frequently find themselves buried under a big data avalanche. And the problem is only going to get worse.

Gartner predicts that one-third of Fortune 100 companies will experience an information management crisis by 2017 because many U.S. companies can’t “effectively value, govern and trust their enterprise information.”

The challenge is that companies lack a clear data strategy. In the interim, they’re monitoring everything, and the result is information overload.

For years, this approach worked because data wasn’t a critical business function. But today, IT has become a cornerstone of the most critical C-suite discussions, and data is at the center of the most critical business operations. Push has come to shove, and organizations need a data strategy now.

Organizations can avoid their “data doomsday” by concentrating on the following core areas:


In its most recent State of Cybercrime Report, EMC observed that cyber criminals will leverage big data principles to increase the effectiveness of attacks.

Malware developers have created solutions that extract pertinent data from organizations. You heard right: they’re using big data analytics for evil.

Without the right protection mechanisms, a business’s big data strategy could easily become a liability. It is absolutely crucial to make sure that all channels remain protected. Big data should not mean vulnerable data.


The most effective big data strategies are actionable. Organizations need more than disparate metrics, however, to drive margin.

That’s where storytelling comes in. Look for dimensions, hooks, and angles that demonstrate trends. What have been your changes over time, for instance? What variables are driving those changes? What are your organization’s most pressing opportunity costs?

Storytelling begins with a specific research question or business challenge. The more cohesive your research framework, the more actionable your data strategy will be.


It’s common for creative and analytical teams to feel like they’re speaking different languages. Engineers and art directors may feel like they have very little in common — but there is one thread that connects these separate business arms. It’s data.

In addition to harnessing big data, organizations need to focus on building a data-driven culture. Every team needs to speak the same language and rely on the same metrics and systems.

This process requires an investment in education, change management, and culture. You need to build your data-driven culture from the inside out.


The success of an organization’s data strategy depends on the ability to drive action. Focus on aligning bottom-line metrics to top-line goals. Predictive analytics, algorithms, and data centers will be the ropes that connect these core objectives together. Your company’s data strategy should be developed cross-functionally — from the perspectives of multiple teams.

Five Data Mining Techniques That Benefit Your Business

Data mining is a buzzword that is often used to describe the entire range of big data analytics, including collection, extraction, analysis and statistics.

Many different types of analysis can be run to retrieve information from big data. The type of analysis you run depends on the business issue you are researching, and different analyses deliver different outcomes and therefore different insights. One common way to recover valuable insights is the process of data mining. Data mining specifically refers to the discovery of previously unknown or developing patterns, unusual incidents or interdependencies. It is therefore important to have a clear understanding of what data mining is before you develop your big data strategy.

The most important objective of any data mining process is to find useful, easily understood information in large data sets. Five techniques apply to data mining:

I. Anomaly or Outlier Detection

Anomaly detection refers to the search for data items in a dataset that do not match an expected pattern or behaviour. Anomalies are also called outliers, exceptions, surprises or contaminants, and they often provide critical and actionable information. An outlier is an object that deviates significantly from the average within a dataset or a combination of data. It stands out numerically from the rest of the data, which indicates that something is out of the ordinary and requires additional analysis.

Anomaly detection is used to detect fraud or risks within critical systems, where anomalies have all the characteristics to be of interest to an analyst. It can help find occurrences that could indicate:

  1. Fraudulent actions, for example various types of credit card fraud.
  2. Flawed procedures, for example at health insurance companies.
  3. Erroneous theory in certain areas.
  4. Bad data: in large datasets, a small number of outliers is common.
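To make the idea of a value that "numerically stands out" concrete, here is a minimal sketch that flags values lying far from the mean, measured in standard deviations (a z-score). The function name `find_outliers`, the threshold of 2.0 and the transaction amounts are all assumptions made up for illustration. Real fraud detection systems use far richer models.

```python
from statistics import mean, stdev


def find_outliers(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold.

    The threshold of 2.0 is an illustrative assumption, not a standard.
    """
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # All values are identical, so nothing can stand out.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical card transaction amounts; one is clearly unusual.
amounts = [23.5, 19.9, 25.0, 21.3, 22.8, 24.1, 20.6, 950.0]
suspicious = find_outliers(amounts)
print(suspicious)
```

Note that a single extreme value also inflates the mean and standard deviation, so on small samples a simple z-score can miss outliers. Median-based measures are a common, more robust alternative.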

II. Association Rule Learning

Association rule learning enables the discovery of interesting relations (interdependencies) between different variables in large databases. It uncovers hidden patterns in the data, in particular which variables occur together most frequently.

Association rule learning is often used in the retail industry to find patterns in point-of-sale data. These patterns can be used to recommend new products to customers based on what others have bought before or on which products are bought together. Done correctly, this can help your organisation increase its conversion rate.

Example of Association Rule Learning: In 2004, Walmart discovered that sales of Strawberry Pop-Tarts increase sevenfold ahead of a hurricane. Since this discovery, Walmart has placed Strawberry Pop-Tarts at the checkouts before a hurricane. When you see Pop-Tarts, you know a hurricane is coming your way!
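The "bought together" idea can be sketched in a few lines. The example below counts how often pairs of items share a basket and reports three commonly used measures for each rule: support (how often the pair occurs), confidence (how often the right-hand item appears when the left-hand item does) and lift (how much more often than chance). The function name `pair_rules`, the 0.4 support cut-off and the baskets are hypothetical. Production systems typically rely on algorithms such as Apriori or FP-Growth rather than this brute-force pair count.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.4):
    """Return (lhs, rhs, support, confidence, lift) for frequent item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # sort so each pair has one canonical order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two directed rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            lift = confidence / (item_counts[rhs] / n)
            rules.append((lhs, rhs, support, confidence, lift))
    return rules


# Hypothetical point-of-sale baskets, invented for illustration.
baskets = [
    {"pop-tarts", "water", "batteries"},
    {"pop-tarts", "water"},
    {"bread", "milk"},
    {"pop-tarts", "water", "bread"},
    {"milk", "batteries"},
]

for lhs, rhs, support, confidence, lift in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift above 1 suggests the two items appear together more often than they would if purchases were independent.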

III. Clustering Analysis

Clustering analysis is the process of identifying groups of data that are similar to each other in order to understand both the differences and the similarities within the data. Clusters have certain traits in common that can be used to improve targeting algorithms. For example, clusters of customers with similar buying behaviour can be targeted with similar products and services in order to increase the conversion rate.

Example of Clustering Analysis: Social media. Facebook, LinkedIn, Spotify, Twitter and other social media businesses all use cluster analysis, although not exclusively.
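As a rough illustration of grouping customers with similar buying behaviour, here is a bare-bones version of k-means, one widely used clustering algorithm. It is a sketch under stated assumptions: the customer data (visits per month, average spend) is invented, the function name `k_means` is hypothetical, and the starting centroids are simply the first k points, which is a naive choice that real libraries improve on. It requires Python 3.8 or later for `math.dist`.

```python
from math import dist


def k_means(points, k, iterations=20):
    """Group points into k clusters; returns (centroids, clusters)."""
    centroids = list(points[:k])  # naive, deterministic initialisation
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)

        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid

        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return centroids, clusters


# Hypothetical customers: (visits per month, average spend).
customers = [(1, 20), (2, 25), (1, 22), (9, 200), (10, 210), (8, 190)]
centroids, clusters = k_means(customers, k=2)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

In practice the features would be scaled first, since a variable with a large range (spend) otherwise dominates one with a small range (visits).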

IV. Classification Analysis

Classification analysis is a systematic process for obtaining important and relevant information about data and metadata (data about data). It helps identify which of a set of categories a given item of data belongs to. Classification analysis is closely linked to cluster analysis, as the classification can be used to cluster data.

Example of Classification Analysis: Google Mail (Gmail). Google uses algorithms that classify your email as legitimate or mark it as spam. This is done based on data linked to the email or on the information in the email itself, for example certain words or attachments that indicate spam.
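A toy version of word-based spam classification can be sketched with a naive Bayes classifier, a classic technique for this kind of task. This is not a description of how Gmail works. The training messages, the labels and the function names `train` and `classify` are all hypothetical, and the training set is far too small for real use.

```python
from collections import Counter
from math import log


def train(messages):
    """Count words per label from a list of (text, label) pairs."""
    word_counts = {}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-probability."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_messages = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, counts in word_counts.items():
        score = log(label_counts[label] / total_messages)  # prior
        denominator = sum(counts.values()) + len(vocabulary)
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            score += log((counts[word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data, invented for illustration.
training = [
    ("win a free prize now", "spam"),
    ("free money claim now", "spam"),
    ("meeting agenda for monday", "legitimate"),
    ("lunch on monday with the team", "legitimate"),
]
word_counts, label_counts = train(training)

print(classify("claim your free prize", word_counts, label_counts))
print(classify("agenda for the team meeting", word_counts, label_counts))
```

The classifier treats each word as independent evidence, which is a simplification, but it shows how "certain words that indicate spam" can be turned into a decision.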

V. Regression Analysis

Regression analysis tries to define the dependency between variables. It assumes a one-way causal effect from one variable to the response of another variable. Independent variables can affect each other, but this does not mean that the dependency runs both ways, as is the case with correlation analysis. A regression analysis can show that one variable is dependent on another, but not vice versa. Regression analysis is used, for example, to determine how different levels of customer satisfaction affect customer loyalty, or how service levels can be affected by the weather.

Example of Regression Analysis: The website eHarmony uses a regression model that matches two singles based on 29 variables to find the best partner. Data mining can potentially help you find the love of your life or predict your divorce.
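The simplest form of regression fits a straight line through the data using ordinary least squares. The sketch below estimates how a loyalty score responds to a satisfaction score. Both series are invented for illustration, `fit_line` is a hypothetical helper name, and a model with many variables (such as the 29 mentioned above) would need multiple regression rather than this single-variable version.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the shared 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical survey data: satisfaction score vs. loyalty score.
satisfaction = [1, 2, 3, 4, 5]
loyalty = [1.1, 1.9, 3.2, 3.8, 5.0]

slope, intercept = fit_line(satisfaction, loyalty)
predicted = slope * 6 + intercept  # extrapolate to a satisfaction score of 6
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

The slope is the quantity of interest here: it estimates how much the dependent variable changes for each one-unit change in the independent variable.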

Data mining enables businesses, organizations, governments and scientists to find and select the most important and relevant information. This information can be used to create models that help predict how people or systems will behave, so that you can anticipate it.

The more data you have, the better the models you can create with these data mining techniques will become, resulting in more business value for your organisation.

