Big Data: What You Need to Know

Learn about big data characteristics and how you can use big data to make better decisions and improve productivity.

Big data definition

Big data refers to the 21st-century phenomenon of exponential growth of business data, and the challenges that come with it, including holistic collection, storage, management, and analysis of all the data that a business owns or uses. Big data can come from an indeterminate number and type of sources, including data generated by employees, customers, partners, machines, logs, databases, security cameras, mobile devices, social media, and more.

What is big data technology?

Big data technology encompasses the solutions, systems, and tools used to manage and realize value from big data. It is defined by its ability to perform data management actions at very high scale: to ingest, integrate, transform, and prepare extremely large volumes of data so that the data is available for use in analytics and in other enterprise systems.

What are the characteristics of big data?

Big data has at least one, and usually all, of the following characteristics: massive volume, high velocity (rate of change), wide variety of types, and uncertain veracity. Together, these are known as the Four Vs of Big Data.

  • Volume: As the name implies, big data comes in massive amounts (terabytes, petabytes, or even zettabytes) and grows at such a rate that calculating its precise size is impractical.

  • Velocity: The vast majority of modern data changes constantly, and the rise of streaming data from IoT and other sources is only increasing the rate of change and growth. For example, stock market prices change many times per second. Big data technology addresses the challenges of capturing and analyzing data that is in constant flux.

  • Variety: Big data encompasses any and all types of data, regardless of how or where it was created. This includes structured data, such as credit card transactions and customer data, as well as unstructured data, which includes anything that doesn't fit neatly into a relational database: email, video, audio, images, social media, and so on.

  • Veracity: Data must be trusted to be useful. Big data technology addresses the need to verify the quality and reliability of enormous amounts of data streaming into systems at high speed from multiple sources, in multiple formats.

Many organizations now add a fifth V to this list: Value. Big data has immense amounts of potential value if it can be correctly managed and shared so that workers can interpret it, analyze it, and use the resulting insights to make accurate, confident decisions.

What are common big data challenges?

  1. Constant change: Big data is a constantly moving target. As the amount of data being generated keeps increasing and real-time streaming data sources become more common, managing big data becomes more challenging. Your systems must be able to ingest, integrate, and analyze data at high scale and high volumes to reap the most value from big data.

  2. Delivering competitive advantage: Big data is increasingly a competitive issue. As more companies invest in their big data and successfully generate business value from it, companies that do not keep up will be at a great disadvantage.

  3. Currency: Big data technologies and services are evolving rapidly. IT organizations must stay alert to innovations, trends, and opportunities as more data moves to the cloud and hybrid environments become common.

  4. Maintaining quality: Democratizing big data for self-service analytics requires delivering more data to more users. Ensuring trust in the data and compliant use of the data requires creating and deploying policies and rules for quality, access, and protection. By embedding governance into the activities that data engineers and data analysts use to create a data pipeline, you can ensure quality results, maintain privacy compliance, and increase trust in decisions made from big data.

Big data demands big data quality to ensure that information is relevant, timely, and trustworthy.
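As a rough illustration of embedding quality rules into a data pipeline step, the sketch below checks each record against a small set of named rules and separates trusted records from rejected ones. The rule names, the record fields (customer_id, amount), and the apply_rules helper are hypothetical and chosen only for this example; a real deployment would rely on the governance tooling of its own data platform.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """A named quality rule; `check` returns True when a record passes."""
    name: str
    check: Callable[[dict], bool]


# Hypothetical rules for an illustrative "transactions" feed.
RULES = [
    Rule("has_customer_id", lambda r: bool(r.get("customer_id"))),
    Rule(
        "amount_is_non_negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]


def apply_rules(records, rules):
    """Split records into those that pass every rule and those that fail any.

    Rejected records are returned with the names of the rules they failed,
    so the reason for each rejection stays visible downstream.
    """
    passed, rejected = [], []
    for record in records:
        failures = [rule.name for rule in rules if not rule.check(record)]
        if failures:
            rejected.append((record, failures))
        else:
            passed.append(record)
    return passed, rejected


sample = [
    {"customer_id": "c-1", "amount": 12.5},
    {"customer_id": "", "amount": 3.0},     # missing customer id
    {"customer_id": "c-3", "amount": -1},   # negative amount
]
passed, rejected = apply_rules(sample, RULES)
for record, failures in rejected:
    print(record, "failed:", failures)
```

Keeping the failed rule names alongside each rejected record is one simple way to make quality decisions auditable, which supports the trust and compliance goals described above.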

What are the main benefits of big data?

Big data allows companies to analyze a significantly larger data set and develop more comprehensive insights into preferences, patterns, and trends about anything from customer relationships to supply chain operations. Companies that successfully leverage their big data can:

  • Boost business efficiency and agility by reducing analytics time and supporting faster decision-making

  • Increase productivity through big data tools that give analysts and business users greater data access, let them analyze more data more quickly, and help them share their insights across the organization

  • Enable KPIs to help business and IT align their efforts and strategies

  • Improve customer experience by providing insights that enable more effective customer retention, more positive customer interactions, and more accurate marketing campaigns

  • Provide real-time information for machine-learning and AI-driven projects such as threat analysis, allowing companies to identify risks and security threats faster so they can mitigate them sooner

How do I use big data?

In today’s data-driven economy, your business success depends on deriving better analytics insights from big data, faster. Business users want detailed insights into customers and products so they can optimize pricing, increase revenue, and reduce costs. Data scientists need more data to develop more accurate prediction models that help business users with forecasting and trend analysis.

To tap into the full potential of big data, you need an enterprise architecture that’s capable of serving two distinct purposes:

  1. Make big data ready for analysis in a lab environment where analysts can efficiently run meaningful experiments and pilots.

  2. Make big data production-ready in a factory environment so it can be used for specific projects and products, as they’re being operationalized.

The good news is that the requirements for these two purposes can be met by using a common set of data management standards and technologies, available through a unified and intelligent data platform powered by AI. The infrastructure should support key capabilities such as fast and scalable big data ingestion and integration; self-service and automation; data preparation; collaborative data governance; and big data privacy and protection. It must support multi-cloud, on-premises, and hybrid environments. It should also support continuous integration, delivery, and deployment, which optimizes DevOps and DataOps to meet users’ demands for deeper insights.

You must also be prepared to manage streaming data, a significant component of big data. IDC predicts that by 2025 the Global Datasphere will grow to 175 zettabytes (ZB), and that nearly 30% of that data will be real-time, created in part by connected users who will have a digital interaction about once every 18 seconds.¹ The billions of connected IoT devices are expected to create more than 90 ZB of data in 2025.
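To put the IDC figures in perspective, the short calculation below derives a few rough magnitudes from them. It is only a back-of-the-envelope sketch: "nearly 30%" is treated as an upper bound of 30%, "more than 90 ZB" as a lower bound of 90 ZB, and the IoT share assumes both volume figures count data on the same basis. The variable names are illustrative.

```python
# Figures quoted from the IDC forecast for 2025.
DATASPHERE_ZB = 175            # projected size of the Global Datasphere
REALTIME_PERCENT = 30          # "nearly 30%", used here as an upper bound
IOT_ZB = 90                    # "more than 90 ZB", used here as a lower bound
SECONDS_PER_INTERACTION = 18   # one digital interaction about every 18 seconds

# Real-time data volume implied by the 30% upper bound.
realtime_zb = DATASPHERE_ZB * REALTIME_PERCENT / 100

# Digital interactions per connected user per day.
interactions_per_day = (24 * 60 * 60) // SECONDS_PER_INTERACTION

# Approximate share of the datasphere created by IoT devices.
iot_share = IOT_ZB / DATASPHERE_ZB

print(f"Real-time data: up to about {realtime_zb} ZB")            # 52.5 ZB
print(f"Interactions per user per day: {interactions_per_day}")   # 4800
print(f"IoT share of the datasphere: at least {iot_share:.0%}")   # 51%
```

In other words, the forecast implies roughly 4,800 digital interactions per connected user per day and up to about 52.5 ZB of real-time data, which is why streaming ingestion is treated as a core requirement rather than an add-on.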

To learn more about how to build a big data management architecture, read “From Lab to Factory: The Big Data Management Workbook”.

¹ IDC White Paper Sponsored by Seagate, “Data Age 2025: The Digitization of the World From Edge to Core,” November 2018, https://www.seagate.com/files/www-content/our-story/trends/files/idc-seagate-dataage-whitepaper.pdf