Best practices for collecting consumer data

The recent scandal involving Cambridge Analytica and Facebook highlights a major ethical dilemma created by the rapid adoption of technology in our lives. Companies all around us continually collect information about us and use it to drive specific behaviour in us without our knowledge. While most of these activities are limited to pushing offers and promotions on digital channels, Cambridge Analytica appears to have crossed a line by using such data to manipulate voters’ decisions.


It is true that most major organisations have an interest in collecting information about customers and have harnessed it by deploying sophisticated algorithms to generate profits. There are many examples where these activities have made life better for today’s digital consumers. For example:

  1. Amazon uses buyers’ data to recommend the next product that they are likely to buy. Clearly, if people buy more products, Amazon benefits. But it is equally true that this approach helps customers find products easily, without having to browse tediously for hours.
  2. Google uses email data to classify messages as spam or not-spam. Some people would consider this an invasion of privacy. However, this approach makes our lives much better because a fast, highly accurate machine pre-screens our emails so that we get only the relevant content.
  3. It is important to point out that even Obama used sophisticated data analytics to predict which voters were on the fence, and used his resources to talk directly to these people and convince them to support his presidency.


So if Obama himself used a similar approach, why does it seem scandalous when used by the Trump campaign? There are several reasons:

a. Companies related to the campaign collected the data without the consent of the people concerned. They used the data for a purpose that was never agreed upon.

b. They abused the default privacy settings of Facebook to collect more information from the network and communities without their explicit permission.

c. Companies related to the campaign distributed the data to third parties, who ran digital ads without attributing them to the origin of the data.

This kind of mistreatment of data and blatant carelessness in running marketing campaigns may have worked well for the campaign, but it is exactly why this approach is so wrong. Data should be collected by fair means and used only for the specific application to which the consumer has agreed. Consumers put their trust in brands, and brands have an ethical responsibility to protect the interests of consumers, most of whom do not understand the technicalities involved.


In light of this brewing storm, I have compiled a list of best practices that should be followed when collecting, analysing and using consumer data:

  1. All consumer data should be kept in a private, encrypted database on the brand’s own servers. It should not be shared with third parties or the public.
  2. All public data collected should comply with the policies of the individual websites from which it is obtained. For example, Facebook makes specific profile fields public so that they can appear in search engines. Any attempt to gather more information than that may not be ethical.
  3. The data collected should only be used for the application for which it was received. For example, if users leave their data on an e-commerce website, it should only be used to drive purchases from that e-store.
  4. The terms and conditions for collecting information should be made clear to consumers, and their consent should be obtained before their data is used in future.
  5. Any marketing activity based on the data should have a valid and verifiable source.
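
As an illustration of practices 3 and 4, purpose limitation and consent can be checked in code before any data is touched. The sketch below is a minimal, hypothetical example in Python: the names `ConsentRegistry`, `may_use` and `use_data`, as well as the purpose labels, are assumptions made purely for illustration and do not describe any particular product. A real system would keep these records in the private, encrypted store described in practice 1 rather than in memory.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ConsentRecord:
    """Purposes a single consumer has explicitly agreed to (hypothetical model)."""
    user_id: str
    purposes: Set[str] = field(default_factory=set)


class ConsentRegistry:
    """In-memory stand-in for a private consent store (illustration only)."""

    def __init__(self) -> None:
        self._records: Dict[str, ConsentRecord] = {}

    def grant(self, user_id: str, purpose: str) -> None:
        # Record that the consumer agreed to this specific purpose.
        record = self._records.setdefault(user_id, ConsentRecord(user_id))
        record.purposes.add(purpose)

    def revoke(self, user_id: str, purpose: str) -> None:
        # Withdraw a previously granted purpose; unknown users are ignored.
        record = self._records.get(user_id)
        if record is not None:
            record.purposes.discard(purpose)

    def may_use(self, user_id: str, purpose: str) -> bool:
        # Consent must exist for this exact purpose; nothing is implied.
        record = self._records.get(user_id)
        return record is not None and purpose in record.purposes


def use_data(registry: ConsentRegistry, user_id: str, purpose: str) -> str:
    """Refuse to process data for a purpose the consumer never agreed to."""
    if not registry.may_use(user_id, purpose):
        raise PermissionError(f"No consent from {user_id} for '{purpose}'")
    return f"processing data of {user_id} for '{purpose}'"
```

With this in place, a shopper who agreed only to `"store_recommendations"` can be served recommendations, while a call such as `use_data(registry, user_id, "political_ads")` raises `PermissionError` instead of silently reusing the data.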


– Anmol Mohan

CEO, Tuple Technologies

P.S.: Tuple provides a Big Data Analytics and Artificial Intelligence platform as a service. Our objective is to accelerate and spread the use of data analytics to enable accurate and optimal decision making. Due to the nature of our business, we do help our clients collect, analyse and use data. However, we are committed to protecting consumers from any kind of harassment and manipulation. We never engage in any data collection or usage for which consumers have not provided explicit consent.

P.P.S.: All the opinions provided in this article are my own and are not meant to be proof of, or guidelines for, any specific company or activity. All facts presented in this article are based on publicly available media coverage, and I have tried my best to be factually accurate.


The costs of not doing data analytics – Part 1


Implementing an insight-generating system or a data-driven strategy for your business is often viewed as a large, complex project on par with rolling out a database management system or an ERP system. Since analytics systems do not run mission-critical processes the way the latter do, they are often delayed in favour of other priorities.

However, if we go by recent developments, it is hard to ignore the progress that Big Data and Artificial Intelligence have made. Not only are these technologies making life easier, but case studies of the enormous value they add to organisations are all around us. Research shows that companies with a data-driven strategy achieved 6% better profits and 8% better productivity than their not-so-nerdy competitors. So despite the obvious benefits, why do most companies think of data analytics as an optional process? Here are some key challenges:

Get your priorities straight

If you are a product manager, you are focussed on improving the product features that will appeal to your customers. If you are a marketing director, you want to communicate the right messages to your customers. Much data is generated through these processes, but that data in itself is worthless. Chances are you are already stretched and have so many other priorities that looking at data sits right at the bottom.

According to MIT Sloan’s report on analytics, 50% of companies are still analytically challenged despite high optimism about the value driven by analytics.

Barriers to Adoption

It takes a lot of hard work to get the required information out of data. Many companies, especially small to midsize firms, focus on the price of analytics: the cost of infrastructure, software pipelines and specialist hires. Too often, they stay away from what they perceive as ‘pricey analytics projects’.

According to a survey by IDG Enterprise, the top reasons for companies’ inability to implement data analytics projects are (a.) lack of skilled resources, (b.) limited budget and (c.) legacy issues which make implementation difficult on current setups.

Can nature be blamed?

Data Analytics is an aggregated science. It finds recurring patterns with a bird’s eye view, and humans are not good at visualising complex patterns. We would rather rely on our selective judgement than indulge in an overly cryptic, incrementally better and effort-consuming system.


Check out this answer about why data-driven decision making is involved, by Ricardo Vladimiro, Game Analytics and Data Science Lead @ Miniclip

However, the value is apparent

As a result, data takes a back seat at most companies, and decision making becomes pure art rather than science. It is true that analytics is a tedious and costly affair to begin with. However, surveys have shown that data-driven decisions can generate substantial value, comparable to that of an organisation’s primary functions. Even better, the ROI scales very well, and you do not need to keep investing in data the way you constantly have to with other functions.


According to this report by McKinsey, data and analytics have driven a 60%+ increase in net profit margins and 0.5-1% growth in annual productivity for US retail. Similar observations were made in the EU as well.

Debunking the myth of gut-based decision making

Decision making is nothing but the forecasting of events. As the person in charge of delivering success to your company, you think and decide: What would appeal to your customers? What would drive them crazy? What will give you the best return on costs while still hitting the set targets? I decided to write this article because I think it will drive certain behaviour in its readers. I forecasted.

So now that we have established what decision making is, I want to quote a super book on a super subject, called ‘Superforecasting’. The initial parts of the book are mostly about establishing that being good at forecasting is not something we are born with; it is a skill learned through a painstaking process of gathering information, analysing it and finding something useful (sounds familiar?). In one example, the authors write:

“A researcher gathered a big group of experts – academics, pundits, and the like – to make thousands of predictions about the economy, stocks, elections, wars and the other issue of the day. Time passed, and when the researcher checked the accuracy of the predictions, he found that the average expert did about as well as random guessing.”

To achieve the optimum, a balance is required in which experience-based decisions are backed by insights from the data. According to this slightly old PwC survey:

Highly data-driven companies are three times more likely to report significant improvement in making big decisions.

So how can analytics add intelligence to your decision making? (a.) It can tell you where you are losing money, and (b.) it can uncover alternative (often hidden) revenue streams for your business. We will cover both of them in Part 2.


Talk BIG with Tuple

As the world of business starts to see the potential in big data, Tuple Technologies is launching Talk BIG with Tuple, a series of interviews with people in the business of Big Data and with the businesses that use Big Data to gain leverage over their competitors. TBT’s inaugural episode features Mr Asankhaya Sharma, R&D Director of SourceClear, a software security solutions company. Mr Sharma holds a PhD from NUS, Singapore, and completed his undergraduate degree at NIT Warangal in India. He has about 10 years of experience in various domains of computer science. Mr Sharma has also consistently served as an advisor to many startups. We got the chance to discuss Big Data and Analytics with him, and how they are changing today’s industries and markets. Below is the transcribed version of the conversation:

Me: Thank you for taking the time to do this Asankhaya. Could you please tell us a bit about your professional journey?

A: Sure. I started my career with Microsoft as a Software Engineer in Hyderabad, India, after graduating in computer science & engineering. During my Microsoft days I spent about a year at the Microsoft Research Center in Bangalore, where my inclination towards research grew. I then enrolled in a PhD program in Computer Science at NUS, Singapore. I have been a mentor to many startups, both during my PhD and after receiving my doctorate. I am quite passionate about teaching and have given sessions at the Singapore Institute of Technology (SIT). In my present role I am a Director in charge of the R&D department at SourceClear.

Me: What is SourceClear?

A: Well, the way we build software has changed dramatically in the past few years. Any web application developed today has less than 10% custom code or business logic. Most of the functionality of the application resides in third-party libraries, which are mostly open source. Now, the risk with such open source libraries is that when a popular library gets attacked, a huge number of applications using that library could be attacked. This is where SourceClear comes into the picture. SourceClear builds tools for developers to use open source safely. The tools integrate into a developer’s workflow, helping them come up with better protection for their applications.

Me: How do you see the rise of data analytics in Singapore? Does the Singapore market value data and invest in data analytics companies?

A: I would say data analytics is already present in some form in a majority of companies in Singapore. People are predicting an increase in the number of jobs in data analytics and data-driven decision making over the next few years in Singapore. In fact, IBM has come up with a master’s program at NUS which has been running for the last couple of years, and they are planning to start a bachelor’s course from next year. As awareness of data collection for further analysis keeps increasing, technology joins hands with it to make use of the analysis for a better and smarter nation. At present, data analytics may not be very prevalent, but Singapore as a country is surely among the top in both valuing data and investing in it.

Me: There is a psychological barrier among businesses with respect to sharing data, even if it is bringing an incremental value to them. What do you think is the best way to address this issue?

A: In our line of business, we face this issue quite regularly. Most of the critical analysis of a business’s source code needs to be done locally, as the client is skeptical about any kind of cloud infrastructure. One of the main reasons our platform is based on Spark is that we have service obligations to our clients, and none of their data can reside in any third-party service. So, we had to build everything on our own and not rely on any other service. The only way to break this barrier is by showing clients that you can actually create value while maintaining the required level of security. It is a slow process, but it will build the necessary credibility for your business.

Me: Being in the world of technology for more than a decade now, what do you think should be the approach of businesses in the big data spectrum?

A: I believe there are two ways to approach the whole idea of big data. One is to provide solutions horizontally, which means providing analytics across domains by processing a certain kind of data. The other, the vertical way, is to provide solutions to a specific domain on a variety of data. So, in the first case, you provide one solution which fits irrespective of the domain, as the product you have developed specializes in analysing that particular kind of data. Alternatively, you assume the presence of a certain kind of data and analyze how it will affect a particular industry. For example, cars in the future will be a lot more automated, but how does that impact the insurance industry? Figuring out questions like these will give your product better visibility in the market.

Me: Your thoughts on the challenges that Big Data companies face. Also, your suggestions for companies entering the field of big data.

A: The real challenge lies in asking the right questions at the right time. Companies might have a huge amount of data and data scientists to analyze it, but if the company does not invest in people with domain knowledge, it will never ask the right set of questions that would make it unique. Take the example of marketing: if people from different domains do not come together to analyze the collected data, they will not know what questions to ask or how to make sense of the data.

My advice to companies entering the big data domain is to find that niche which will make you stand apart. A solution that’s faster or more economical is not enough. The solution must also see those unique patterns in the client’s data and help in making decisions. Ultimately, that is what all of this is about. You are trying to get something done, and whoever provides a more efficient way stands out.

Me: Well, that brings us to the end of the inaugural episode of Talk BIG with Tuple. Once again, we thank Mr Asankhaya Sharma for taking the time to share with us his views on Big Data and Analytics. Next week, we will discuss the trends of the analytics industry in Malaysia with a senior data scientist from a business processing and technology service providing company.


Asankhaya Sharma, Director – R&D, SourceClear


Welcome to Tuple


It’s a beautiful Monday morning in Singapore, with people bustling through buses and MRT stations on their way to work. Somewhere in Bugis, the owner of a Sago shop (an F&B outlet specializing in a delicious Chinese dessert) looks at last week’s business reports and wonders, “I wish I had the power to know my customers’ preferences even before they ordered…”. At the same moment, on the other side of town, a mango seller is loading up his truck, hoping to find the right customers who will give him more value for his specially imported mangoes.

As the owner of the Sago shop opens her shop’s Facebook page, the mango seller is checking the orders received for the day. Although the number of likes on the FB page has gone up, she is not aware that a sentiment analysis could be done on the people who have liked the page. Also, if only she knew that her customers were already talking on social media about the delicious mango sago they had last night, she could come up with an offer on that particular item to attract more footfall.

Similarly, the mango seller takes note of the fruit shops and juice shops that have ordered. He is worried about the excess stock which he might have to sell at a discount. There is no one to suggest to him that he should treat the Sago shop as a premium customer and deliver an exclusive batch of mangoes to increase his value proposition. The solution to these problems might not be more data, but rather analyzing the information already available and finding the value in it.

Ladies & gentlemen, the above story is based on true incidents that took place over three years. Today, both characters in the story share a strong business relationship and to this day serve the best Mango Sago in town. Such common business issues have made us wonder whether holding information and not seeing a pattern in it to improve business is innocence or sheer negligence.

As global IP traffic grows to exabytes and zettabytes, so does the value of that data. Companies gain an edge over their competitors largely through data sourcing and cleaning. Companies employing analysts to make sense of their data are moving ahead aggressively. Soon, there will be a world where data is the universal currency and businesses transact with their knowledge stored as data.

At this juncture, Tuple Technologies would love to share the opinions of people from a variety of industries and markets on what they believe is the value of making sense of so-called Big Data. So, moving forward, we will interact with statisticians, data scientists and analytics teams of various multinational organisations. This will be purely to understand the supply side, i.e. the analysts: how they see this industry’s growth and how each market is responding to this game-changing business segment.

On the demand side, we shall talk to C-level executives from a range of industries to identify what it is that they are looking for in the information collected on their customers, and how they are leveraging that information to gain an upper hand over their competition. We hope that the blog will bring out adrenaline-pumping, mind-boggling revelations.