Small Data is a term that describes data sets with fewer than 1,000 rows or columns. The Covid-19 pandemic has spurred the growth of small data.
It is next to impossible to have an element of surprise in today's conventional warfare. Social media chatter from anxious Russian girlfriends, spouses, and relatives has revealed everything from large-scale troop deployments to mobilizations of military and naval hardware.
While the posts give no specific indication of whether President Vladimir Putin has decided to launch a new military offensive targeting Ukraine, they serve as evidence that troops and equipment are being moved en masse from Russia’s Far East and offer a rare glimpse into the fears voiced by relatives of the soldiers.
Social chatter analysis
Back in December 2019, analysis of social media chatter among doctors in Wuhan led the Canadian outbreak-analysis company BlueDot to identify a mysterious illness that was spreading rapidly – almost two months before the World Health Organization sounded the alarm on the coronavirus pandemic. These are just two instances of small data analytics.
The term "small data" was coined in 2011 by researchers at IBM to describe datasets too small for traditional statistical methods. In contrast to big data, small datasets – such as customer transactions, social media posts, and individual genome sequences – can be analyzed with simpler estimation techniques.
Customer data – such as booking information, meals bought, turnover per seat, and seasonal variations in customer flow – is easily accessible. A Copenhagen restaurant grew its turnover from $1.1 million to $6.1 million within two years by relying on small data insights.
Pandemic spurs small data growth
It was the once-in-a-century event of the pandemic that drove the growth of small data analytics. Past data simply wasn't of any use in a situation without precedent: retail algorithms failed to forecast that chain stores would face a critical shortage of toilet paper in the initial days of the lockdowns. According to a Gartner press release, 70% of organizations will shift their focus from big to small and wide data by 2025.
The analyst firm reported that the extreme business changes of the Covid-19 pandemic made Machine Learning (ML) and Artificial Intelligence (AI) models trained on large amounts of historical data less relevant. In parallel, decision-making by both humans and AI has grown broader, requiring data from a wider range of sources to answer queries accurately.
As a result, organizations adopted technologies that can use whatever data is available. These include "wide data" – analysis that draws synergy from a variety of small and large, structured and unstructured data sources – and small data, the application of analytical techniques that require less data but still yield useful insights.
Big innovation & small data
Small data will play a prominent role in analytics as the future continues to be fluid and bears little correlation with the past. Organizations will have to forecast more near-term scenarios and create agile strategies to stay in step with highly unpredictable times ahead. Most scientists of the 19th and 20th centuries used small data for their discoveries.
They made all their calculations by hand, working with small data, and discovered the fundamental laws of nature by compressing observations into simple rules. It has been found that 65% of big innovations are based on small data. Small data can also support important conclusions, especially when it comes to training an AI: huge datasets can introduce confusion into machine learning methods. AI is ultimately about mastering knowledge, not merely processing data – it involves giving machines the knowledge they need to perform a task.
Small data approaches like transfer learning are now widely used. Transfer learning is the reuse of a pre-trained model on a new problem. It is currently very popular in deep learning because it can train deep neural networks with comparatively little data. Scientists use transfer learning to train machines to work across various fields. For example, researchers in India used transfer learning to train a model to locate kidneys in ultrasound images using only 45 training examples. Transfer learning is expected to grow further in the coming years.
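The idea can be sketched in a few lines. The snippet below is an illustrative toy, not the kidney-ultrasound study: a fixed random projection stands in for the frozen pre-trained layers, and only a small logistic-regression "head" is trained on 45 synthetic examples – the hallmark of transfer learning is that the bulk of the model is reused, never updated.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pre-trained" feature extractor: these weights are frozen and
# never updated during training (stand-in for reused network layers).
W_frozen = rng.normal(size=(64, 16))

def extract_features(x):
    """Frozen layers: map raw 64-dim inputs to 16-dim features."""
    return np.tanh(x @ W_frozen)

# Tiny labelled dataset: only 45 examples, echoing the article's theme.
X = rng.normal(size=(45, 64))
true_w = rng.normal(size=16)
y = (extract_features(X) @ true_w > 0).astype(float)

# Trainable head: a single logistic-regression layer on top of the
# frozen features. Only these 16 weights are learned.
w_head = np.zeros(16)
lr = 0.5
for _ in range(500):  # plain gradient descent on the head only
    feats = extract_features(X)
    p = 1.0 / (1.0 + np.exp(-feats @ w_head))
    w_head -= lr * feats.T @ (p - y) / len(y)

preds = 1.0 / (1.0 + np.exp(-extract_features(X) @ w_head)) > 0.5
accuracy = (preds == y.astype(bool)).mean()
print(f"training accuracy with 45 examples: {accuracy:.2f}")
```

Because only the small head is trained, a few dozen examples suffice – the heavy lifting was already done by the (here simulated) pre-training.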
One of the major challenges in AI is generalization – getting machines to give proper answers beyond the exact examples they were trained on. Because transfer learning transfers knowledge from one task to another, it makes generalization possible even with limited data. Transfer learning is already used for cancer diagnosis, playing video games, spam filtering, and much more. Advanced AI tools and techniques are opening new possibilities for training AI with small data and transforming processes. To train their AI systems, large organizations are drawing on thousands of small datasets.
Small data calls for causal AI
Small data calls for more tailor-made AI systems, too. Causal AI represents the next frontier of artificial intelligence: technology developed to reason about the world, and to make choices, the way humans do. Just as we can learn from extremely small datasets, causal AI has been built to do the same. It uses causality to go beyond narrow machine learning predictions and can be integrated directly into human decision-making – which is why it is pitched as the AI system organizations can trust with their biggest challenges, a revolution in enterprise AI.
Technically speaking, causal AI models can learn from minuscule datasets thanks to data discovery algorithms – a novel class of algorithms designed to identify important information from very limited observations, just as humans do. Causal AI can also let humans share their own insights and pre-existing knowledge with the algorithms, an innovative way of generating circumstantial data when it doesn't formally exist.
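The article doesn't describe how any particular causal AI product encodes human knowledge, but a simple, hypothetical analogue of blending expert insight with scant observations is a Bayesian update, where the expert's belief enters as a prior and a handful of data points refine it:

```python
# Sketch (not a specific causal AI system): an expert believes a
# conversion rate is around 30%, encoded as a Beta(3, 7) prior –
# roughly "3 successes and 7 failures" worth of prior belief.
prior_a, prior_b = 3.0, 7.0

# Tiny observed dataset: 4 conversions in 10 trials.
successes, trials = 4, 10

# Conjugate update: posterior is Beta(a + successes, b + failures).
# The estimate blends human knowledge with the limited data.
post_a = prior_a + successes
post_b = prior_b + (trials - successes)
posterior_mean = post_a / (post_a + post_b)
print(f"posterior conversion-rate estimate: {posterior_mean:.3f}")
```

With only ten observations, the raw data alone would suggest 40%; the prior pulls the estimate to 35%, illustrating how pre-existing knowledge stabilizes conclusions drawn from very small samples.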
In business terms, this means that causal AI algorithms can be fed small data from a range of different sources to identify recurring themes that typical machine learning would be unable to address. As the technology matures, we are likely to see causal AI surface more consumer insights for marketers from the wealth of information businesses generate across their many touchpoints. This can breathe new life into small data models and equip businesses with a more manageable approach to organizing their data, in a future that may offer fewer ready-made insights into consumer behaviour.
(Abhijit Roy is a technology explainer and business journalist. He has worked with The Straits Times of Singapore, Business Today, Economic Times and The Telegraph, and has also worked with PwC, IBM, Wipro, and Ericsson.)
(Disclaimer: The views expressed in the article above are those of the author and do not necessarily represent or reflect the views of Autofintechs.com. Unless otherwise noted, the author is writing in his/her personal capacity. They are not intended and should not be thought to represent official ideas, attitudes, or policies of any agency or institution.)