What Is Data Integrity and How Can It Be Protected?

It's easy to define what data integrity is, but it takes rigorous application of the right tools to make sure data stays accurate and safe.

After reading this, you'll understand:

  • Physical data integrity relates to the protection of stored information from things like natural disasters, power outages, and data corruption.

  • Logical integrity relates to keeping data accurate and unchanged throughout its lifecycle.

  • Inaccurate data can hurt businesses, and data integrity is also essential to preserving accurate historical photos and accounts.

Data integrity refers to the reliability of a data set, especially as it's maintained over time. Poor data integrity leads to inaccurate or missing data, which can cause lost revenue, missed opportunities, and lasting reputational damage. Data integrity can be damaged by poor data entry processes, incomplete or duplicate data, inadequate data-recording infrastructure, and a lack of standardization.

Web3 offers a new way to enhance data integrity. For example, the Starling Lab, a joint project between the USC Shoah Foundation and Stanford's Department of Electrical Engineering, uses decentralized technologies like Hedera Hashgraph to build tools that help ensure the integrity of digital media information.

In this article, we'll dive into the types of data integrity, the threats they face, and how distributed ledger technology (DLT) can enhance data integrity.

Types of data integrity

To preserve integrity, data must be collected free of human errors and then protected in storage. In this section, we'll look at the two types of data integrity: physical and logical.

Physical integrity

Physical data integrity relates to the protection of stored information. Natural disasters, power outages, and data corruption can lead to lost data items. In fact, an organization's data could be lost forever if it isn't properly backed up.

Logical integrity

Logical integrity relates to keeping data accurate and unchanged throughout its lifecycle. As different parties use data, it may get changed or misrepresented. Logical integrity can be compromised by human error, by transfer errors, or by someone acting with malicious intent.
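One common way to detect that data has changed is to record a cryptographic fingerprint when the data is created and compare it later. The sketch below uses SHA-256 from Python's standard library. The record contents and the function names are hypothetical and chosen only for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, expected_digest: str) -> bool:
    """Check the data against a digest recorded earlier."""
    return fingerprint(data) == expected_digest


# Hypothetical record, fingerprinted when it is first stored.
original = b"patient_id=1042,dose_mg=2.5"
recorded_digest = fingerprint(original)

# The same record after a transfer that dropped the decimal point.
received = b"patient_id=1042,dose_mg=25"

print(is_unchanged(original, recorded_digest))   # the untouched copy matches
print(is_unchanged(received, recorded_digest))   # the altered copy does not
```

A fingerprint only reveals that something changed. It does not say what changed or who changed it, and it is only as trustworthy as the place where the digest itself is stored.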

Examples of how integrity is broken

  • Human error. Humans make mistakes. While this is a natural part of life, it can drastically affect data quality. This integrity issue can arise when people commit a transfer error, delete rows in a spreadsheet, misunderstand a report while entering data, or put a decimal point in the wrong spot.

  • Formatting errors. As information is moved from one system to another, differences in formatting may lead to changes in the data items.

  • Data breaches. If unauthorized parties gain access to your data, they can change it without anyone knowing. Also, disgruntled employees with proper access can damage data.

  • Hardware issues. If an organization relies entirely on physical hardware to store data, it runs the risk of the hardware failing, which may compromise integrity.

  • Data collection errors. Collecting complete data is one of the easiest ways an organization can ensure data integrity. Incomplete data samples can skew your analysis and may lead to bias.
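Several of these problems can be caught with simple automated checks before data is accepted. The following is a minimal sketch, assuming records arrive as dictionaries with two hypothetical fields, order_id and amount. It flags incomplete records, duplicate entries, and values whose formatting prevents them from being read as numbers.

```python
from decimal import Decimal, InvalidOperation

REQUIRED_FIELDS = ("order_id", "amount")  # hypothetical schema for illustration


def is_blank(value) -> bool:
    """Treat None and whitespace-only values as missing."""
    return value is None or str(value).strip() == ""


def find_problems(rows):
    """Return a list of (row_index, problem) tuples for a list of dict rows."""
    problems = []
    seen_ids = set()
    for index, row in enumerate(rows):
        missing = [name for name in REQUIRED_FIELDS if is_blank(row.get(name))]
        if missing:
            problems.append((index, "missing: " + ", ".join(missing)))
            continue
        if row["order_id"] in seen_ids:
            problems.append((index, "duplicate order_id"))
        seen_ids.add(row["order_id"])
        try:
            Decimal(str(row["amount"]))
        except InvalidOperation:
            problems.append((index, "amount is not a number"))
    return problems


rows = [
    {"order_id": "A1", "amount": "19.99"},
    {"order_id": "A1", "amount": "19.99"},  # duplicate entry
    {"order_id": "A2", "amount": "1,999"},  # formatting difference
    {"order_id": "", "amount": "5.00"},     # incomplete record
]
problems = find_problems(rows)
for index, problem in problems:
    print(index, problem)
```

Checks like these catch structural mistakes only. A value that is well formed but wrong, such as a misplaced decimal point, still needs range checks or human review.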


Protection of data integrity

Data security and data integrity are closely related, because security measures can prevent data from being compromised. Data security methods such as access controls and encryption can help prevent unauthorized access to systems and data, thereby protecting data integrity. DLTs are well suited to protecting data because their records are cryptographically linked and effectively immutable.
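The tamper resistance of a ledger rests partly on hash chaining: each entry stores a hash that depends on the entry before it, so changing an old record breaks every link after it. The sketch below shows that idea in a single in-memory list. It is a simplified illustration, not how Hedera or any production DLT is implemented, and it leaves out consensus, replication, and signatures. All names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def append(log: list, payload: dict) -> None:
    """Add an entry that is linked to the current end of the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "payload": payload,
        "prev": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    })


def verify(log: list) -> bool:
    """Recompute every link and report whether the chain is intact."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"event": "record created", "by": "alice"})
append(log, {"event": "record updated", "by": "bob"})
ok_before = verify(log)

log[0]["payload"]["by"] = "mallory"  # quietly rewrite history
ok_after = verify(log)

print(ok_before, ok_after)
```

In a single-owner log like this one, an attacker who controls the whole list could recompute every hash. Distributed ledgers address that by having many independent nodes agree on the chain.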

In this section, we'll dive into some novel ways DLTs are used for data security.

Theom

Theom provides individuals and businesses with a scalable, low-cost way to verify data integrity. The Theom platform protects enterprise data, even if it's stored across different cloud providers and SaaS applications.

Theom uses the Hedera Consensus Service (HCS) to ensure that any action taken on a business's data is legitimate.

SAFE

SAFE uses the Hedera network to keep an immutable log of healthcare data. It uses a data-stamping "Hashlog," developed by Acoer, to log an individual's health status in real time.

During the COVID pandemic, this technology was used for logging contact tracing, exposure notifications, vaccination status, and other essential, sensitive data.

Hala

Hala lets civilians share and store video and photo evidence of potential emerging conflicts. According to a Hala whitepaper: "when an image is captured in the field using the Hala Systems App, [it] is added to a centralized file storage in Amazon Web Services ... and is simultaneously written to a private blockchain using the IBM Blockchain Platform, and Hedera Consensus Service. ... With these two sets of records, any 3rd party can verify the information on the public ledger."

Hala's security platform illustrates the importance of data integrity. If this type of information is mishandled, misrepresented, or hacked, it could cost people their lives and allow possible humanitarian crises to go unexposed.

AVC Global

AVC Global created a blockchain-based platform called the Track and Trace Program to address threats to the pharmaceutical supply chain. Complex supply chains can contribute to substandard drug quality and drug counterfeiting. AVC Global uses DLT to validate and record transactions across pharmaceutical supply chains.

Each drug is given a GS1-compliant serial number, which is recorded and verified by Hyperledger and HCS. The serial number is then scanned at each stage of the journey, with each transaction being verified through a non-corruptible, dual notary system.
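GS1 identifiers also carry a small integrity check of their own. Numeric GS1 keys such as the GTIN end in a check digit, which lets a scanner reject many mistyped or misread numbers before they reach a ledger. The sketch below implements the standard GS1 check digit calculation. How AVC Global structures its serial numbers is not described here, so the sample GTIN-13 and the function names are illustrative assumptions.

```python
def gs1_check_digit(payload: str) -> int:
    """Compute the GS1 check digit for a numeric key without its last digit."""
    if not (payload.isascii() and payload.isdigit()):
        raise ValueError("payload must contain only the digits 0-9")
    total = 0
    # Working from the right, digits are weighted 3, 1, 3, 1, ...
    for position, char in enumerate(reversed(payload)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_gs1_key(key: str) -> bool:
    """Return True if the final digit matches the computed check digit."""
    if not (key.isascii() and key.isdigit()) or len(key) < 2:
        return False
    return gs1_check_digit(key[:-1]) == int(key[-1])


sample = "4006381333931"   # a sample GTIN-13
mistyped = "4006381333932"  # same number with the last digit changed

print(is_valid_gs1_key(sample))
print(is_valid_gs1_key(mistyped))
```

A check digit guards against accidental errors only. It does nothing against a counterfeiter who copies a valid number, which is why each scan is also recorded and verified on the ledger.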

Acoer

Acoer addresses drug overproduction and underproduction by tracking and logging pharmaceutical supply chain events from end to end. Acoer provides public data visualization tools, letting drug companies make informed production decisions.

Everyware

Everyware uses IoT devices to track and record essential data relating to pharmaceuticals. For example, Everyware partnered with the United Kingdom’s National Health Service to track and log COVID-19 vaccine storage information. Since COVID-19 vaccines require colder temperatures than other vaccines, Everyware used IoT devices to ensure adequate storage temperatures.

This project uses the hashgraph consensus algorithm to log immutable records for essential healthcare data.

Neuron

Neuron is creating a shared data platform for drone devices. Using ground-based sensors, Neuron records real-time flight events, ensuring pilots know when other drones are in the area. This visibility allows pilots to fly their drones without fearing a collision.

Neuron uses the Hedera network to keep its real-time data secure. Because Neuron records a large volume of real-time data, Hedera's low fees make it a practical fit.

What is data integrity with Hedera?

Businesses need to maintain data integrity, but it is essential in many other areas, too. Hedera's work with the Starling Lab focuses on preserving the accurate presentation of history. Inaccurate data can lead to severe consequences for businesses and customers, but the value of preserving data integrity for historical photos and news accounts can't be measured.

DLTs are a viable solution for many of the potential causes of poor data integrity. These immutable ledgers offer companies a secure way to store data with far less exposure to physical integrity issues. Additionally, data can be automatically recorded and logged in a decentralized manner, which reduces human error, invalid data, and storage erosion.

Many data security projects turn to the Hedera network for its secure ledger technology and low, predictable fees.