Data Breach and Leak Glossary: Every Technical Term Explained

Tom Read, Managing Editor. Updated on July 27, 2023

Table of Contents

Key definitions
Data storage and cloud infrastructure
Threats, risks, and impacts
Encryption
Government agencies and legislation
The Bottom Line

We’ve created a glossary of cybersecurity terms to help you better understand the technical subjects covered in our data breach reports and blog content.

We cover an array of subtopics, including (but not limited to) hacking, the biggest threats, and the government agencies responsible for protecting your information. But first, let’s take a look at some key definitions.

Key definitions

Access

The ability to communicate or interact with a system, use its resources, handle its data, and/or control its components and functions. A computer accesses the internet through its web browser, modem, and internet service provider.

Access control

A set of security policies and processes that include authentication and authorization, and grant or deny requests to access company data and resources, or enter physical facilities. Access controls may ask an employee to use a password to authenticate their identity when signing into a computer, and also limit the files that employee is authorized to use.

Authentication
The process or action of verifying the identity (or other attributes) of a user, process, or device.

1. Two-factor authentication (2FA)
Authentication that uses two separate processes to identify an individual. For example, a service might ask users to enter their password along with a code sent to their mobile device to gain access.

2. Multi-factor authentication (MFA)
Authentication that uses two or more separate processes to identify an individual.
Authorization
The security process by which a server determines the level of access or privileges a user is granted to system resources (e.g. files, services, computer programs, data, and application features).
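
As a rough illustration of how authentication and authorization differ, the sketch below checks a password first and a permission second. The user table, the role names, and the functions `authenticate` and `is_authorized` are hypothetical and exist only for this example; a production system would normally rely on a dedicated identity service.

```python
import hashlib
import hmac
import os


def _hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2 from the standard library; the iteration count is an illustrative choice.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical in-memory user store: salt, salted password hash, and role.
_salt = os.urandom(16)
USERS = {
    "alice": {
        "salt": _salt,
        "hash": _hash_password("correct horse", _salt),
        "role": "editor",
    },
}

# Hypothetical mapping of roles to the actions they may perform.
PERMISSIONS = {
    "editor": {"read", "write"},
    "viewer": {"read"},
}


def authenticate(username: str, password: str) -> bool:
    """Authentication: is this user who they claim to be?"""
    user = USERS.get(username)
    if user is None:
        return False
    candidate = _hash_password(password, user["salt"])
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(candidate, user["hash"])


def is_authorized(username: str, action: str) -> bool:
    """Authorization: is this (already authenticated) user allowed to do this?"""
    user = USERS.get(username)
    if user is None:
        return False
    return action in PERMISSIONS.get(user["role"], set())
```

In this sketch, `authenticate("alice", "correct horse")` succeeds, while `is_authorized("alice", "delete")` fails because the hypothetical editor role only covers reading and writing.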

Advanced persistent threat

A malicious actor with significant resources and expertise that can leverage multiple attack vectors (e.g. phishing, malware, insider threats) to achieve its goals.

Antivirus
A security program that detects and removes malicious software on a device or computer network.

Attack
Any attempt to gain unauthorized access to a system, its resources, its data, its controls, or to disrupt equipment operation.

Attack surface
The sum of all possible points (attack vectors) in a system environment that an unauthorized user can attack to gain access to data.
Attack vectors
A method of attack or pathway that a hacker uses to gain unauthorized access to a computer, system, or network and exploit security vulnerabilities. Malware, social engineering, text messages, and web pages are all examples of attack vectors.
Cyber attack
Malicious efforts to damage, disturb or gain unauthorized access to a computer network or system through the use of one or more computers.
See also:
Brute force attack
Denial of service (DoS)
Distributed denial of service (DDoS)
Doxing
Identity theft
Malware
Phishing
Social engineering
Spam
Supply-chain attack

Attacker
A malicious individual, group, organization, or government that exploits a computer system or network to alter, destroy, steal, or damage its data, or disrupt its operation, and benefit from the outcome.

Authentication
See Access control.

Authorization
See Access control.

Certificate
A unique and digitally signed file that acts as an electronic password, identifying an individual or organization and allowing the secure exchange of data online. Websites and applications are given security certificates by trusted third-party certification authorities, including IdenTrust, DigiCert, and Sectigo.

Credentials
A user’s authentication details that are needed to verify their identity. Credentials could include a password, username, token, or certificate.

Cyber attack
See Attack.

Cybercriminal
A person who commits crimes by using or targeting a computer, computer network, or networked device.

Cyber incident
The breach of a system’s security policy in an attempt to either use, control, or gain unauthorized access to that system or its data, or affect its integrity or availability in some way.

Cybersecurity
The protection of electronic devices, services, or networks, and their data, from unauthorized access, theft, or damage.

Database
An organized repository of structured data or information, often held in a computer system and accessible in different ways.

Relational database
Databases storing information in tables that can be linked (i.e., tables that can be related) to one another based on common data. This means a single query could produce a new table of data, allowing for greater insights and better informed decision making.
SQL database
Also known as an SQL server database. An SQL database is a type of relational database that is queried and managed using SQL. SQL is the language most used for databases, largely due to its ability to retrieve multiple records easily.
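
To make the idea of related tables more concrete, here is a minimal sketch using Python’s built-in `sqlite3` module. The `customers` and `orders` tables and their contents are invented for illustration; the point is only that one query can link two tables through common data and produce a new table of results.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# A single query relates the two tables through the shared customer id
# and produces a new result table: total spend per customer.
rows = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print(rows)
```

Each row of the result pairs a customer name with the sum of that customer’s orders, information that neither table holds on its own.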

Data breach
A security incident in which data, systems, or networks are accessed or affected in an unauthorized manner.

Note: Some people use the terms data breach and data leak synonymously to mean any incident in which data is exposed.

Data leak
A security incident in which sensitive or personal data is accidentally disclosed physically, online, or in another public environment due to poor security practices or storage misconfiguration. A data leak could occur when someone loses a hard drive or leaves a database unsecured.

Note: Some people use the terms data breach and data leak synonymously to mean any incident in which data is exposed.

Data loss
The unintentional or accidental deletion, corruption, or exposure of data, or the act of forgetting where data is stored.

Data privacy
See Data protection.

Data protection
The process and practices involved in protecting critical information from corruption, compromise, or loss, and restoring data to a functional state should it become inaccessible or unusable. Data privacy and data security are both components of data protection.

Data privacy
The responsible governance and proper handling of an entity’s personal, sensitive, and/or confidential data, and the right of individuals to control how their personal data is collected and used.
Data security
The defense of digital information from malicious or accidental threats that could misuse, expose, delete, corrupt, or provide unauthorized access to data. Includes data security technologies and methodologies, such as access control.

Data security
See Data protection.

Data server
A computer, software program, or platform used for database services, such as information storage, processing, and security.

Data theft
The act of intentionally stealing information.

Digital footprint
A trail of data about a user that is left behind as a result of their online activity.

Exposure
The state of being unprotected and allowing access to data or capabilities that an attacker could use to gain unauthorized access to a system, network, and/or data.

Firewall
Hardware or software that limits or controls traffic to or from a network based on predetermined rules. Firewalls are designed to prevent unauthorized access to or from a network.
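
The “predetermined rules” idea can be sketched in a few lines. The rule list, the `Rule` class, and the `decide` function below are hypothetical and far simpler than a real firewall; they only show first-match rule evaluation with a default of denying anything that no rule allows.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str           # "allow" or "deny"
    network: str          # source network in CIDR notation
    port: Optional[int]   # destination port, or None for any port


# Hypothetical rule set, checked from top to bottom.
RULES = [
    Rule("deny", "203.0.113.0/24", None),   # block one source range entirely
    Rule("allow", "0.0.0.0/0", 443),        # allow HTTPS from anywhere else
]


def decide(src_ip: str, dst_port: int, rules=RULES) -> str:
    """Return the action of the first matching rule, or "deny" if none match."""
    ip = ipaddress.ip_address(src_ip)
    for rule in rules:
        in_network = ip in ipaddress.ip_network(rule.network)
        port_matches = rule.port is None or rule.port == dst_port
        if in_network and port_matches:
            return rule.action
    return "deny"  # default-deny: traffic not explicitly allowed is blocked
```

With these example rules, traffic from the blocked range is denied even on port 443, and traffic to any port other than 443 is denied by default.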

Hacker
Someone who uses their knowledge of computers, information technology, and cybersecurity to find and exploit security vulnerabilities in computer systems and/or networks.

Black hat hacker
Hackers who work exclusively to cause damage for personal gain. They are unconcerned with laws and unbound by ethical codes, and use their technical skills to bypass security protocols, conduct cyber espionage, or employ malware, usually for profit. They engage in activities like selling breached data on the dark web, encrypting data and holding it for ransom, or stealing information to extort money from their victims.
Gray hat hacker
Hackers who have neither the ethical code of white hat hackers nor the malicious intent of black hat hackers. Gray hat hackers use their technical knowledge to find and explore a system or network’s vulnerabilities, acting without the owner’s permission in ways that many would say are unethical and that are sometimes illegal. Unlike black hat hackers, they don’t exploit the vulnerabilities they find for profit. Instead, gray hat hackers are likely to report vulnerabilities to system/network owners at first and may offer to repair vulnerabilities for a fee. They might escalate to full public disclosure if the vulnerability remains unfixed.
Hacktivist
Hackers who use their technical skills in displays of social, ideological, religious, or political messaging. They are usually unconcerned with the ethics of white hat hackers, and are often willing to break laws for the sake of what they see as a greater good. Hacktivists may deface or immobilize websites or steal and release data to achieve their aim.
Nation-state hacker
A hacker or hacking group that works on behalf of a government to disrupt and compromise, or access the critical data of, target governments, organizations, and/or individuals.
White hat hacker
Also known as ethical hackers, white hat hackers may work in research labs or for reputable companies where strict ethical guidelines are followed. Their work may include performing penetration tests or engaging in bug-bounty programs where their intention isn’t to cause harm, but to uncover weaknesses or vulnerabilities within a system or network so that they can be fixed.

Misconfigured
Poorly configured or insecure security controls that place systems and data at risk. For example, a database that’s configured without password protection is misconfigured.

Misconfiguration
An incorrect or substandard configuration of security controls that places data at risk; the state of security controls being misconfigured.

Multi-factor authentication (MFA)
See Access control.

Penetration testing
A method of evaluating the cybersecurity of a network and/or system by intentionally attacking the network/system to uncover vulnerabilities, which the assessors then report to the network/system administrator.

Personal data
(1) In the EU’s General Data Protection Regulation (GDPR), personal data is any information that relates to an identified or identifiable person, either directly (the data itself) or indirectly (when used with other data). This includes personally identifiable information (PII), along with various other details such as photos and preferences. (2) In other jurisdictions, personal data simply means PII.

Sensitive personal data
In the EU’s GDPR, sensitive personal data encompasses the most critical forms of personal data, including health-related information, ethnic origin, political views, religious views, sexual preferences, etc. Sensitive personal data could cause the greatest harm if exposed, and GDPR protects it with stricter controls regarding its collection, processing, use, and storage.

Note: The definitions of Personal data (1) and Sensitive personal data are taken from the EU’s General Data Protection Regulation (GDPR). These terms may be interpreted differently in various jurisdictions around the world.

Personally identifiable information (PII)
Information that can identify an individual when used alone or alongside other relevant information. Includes data such as name, address, date of birth, social security number (SSN), and banking information.

Responsible disclosure
A process that allows security researchers to safely and efficiently report vulnerabilities in applications and IT infrastructure to the relevant organizations or individuals.

Security policy
A set of rules governing how an organization and employees properly access data, resources, and assets to minimize the risk of security threats and protect those assets.

Sensitive company data
Information that poses a risk to the company if exposed to another company or the public. Includes intellectual properties, trade secrets, business plans, and more.

Sensitive personal data
See Personal data.

Server
A computer or computer program that provides data, resources, programs, or services to another computer and its user (i.e. the client).

Structured query language (SQL) code
A programming language characterized by simple, directive statements that help users operate relational databases and manage stored information. SQL supports fast, easy, and accessible insertion, deletion, and retrieval of data.

Structured query language (SQL) database
Also known as an SQL server database. See Database.

Token
A token is an object that represents the right to perform a specific action. Tokens can be either software or hardware. For example, authentication tokens can be hard tokens, which are physical security devices that unlock access to a resource or facility, or soft tokens, which are software on electronic devices (e.g. single-use authentication codes).
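
The source doesn’t say how single-use authentication codes are generated. One widely used scheme is the HMAC-based one-time password (HOTP) algorithm from RFC 4226, sketched below with the standard library. The function name `hotp` is our own, and the secret in the usage note is the public test secret from the RFC, not something to use in practice.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Compute an HMAC-based one-time password as described in RFC 4226."""
    # The moving counter is encoded as an 8-byte big-endian integer.
    msg = struct.pack(">Q", counter)
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick an offset.
    offset = digest[-1] & 0x0F
    # Take 4 bytes at that offset and clear the top bit to get a 31-bit number.
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep only the requested number of decimal digits, left-padded with zeros.
    return str(code % 10 ** digits).zfill(digits)
```

Using the RFC’s test secret, `hotp(b"12345678901234567890", 0)` returns `"755224"`. The server and the token each track the counter, so a code that has been used once is not accepted again.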

Two-factor authentication (2FA)
See Access control.

Unauthorized access
Any access to a network, system, application, data, or other resource that is not permitted and violates the security policy.

Unsecured
Not secure, safe, protected, or free from the risk of loss. A database is unsecured if it doesn’t adopt adequate security controls, such as password protection.

Data storage and cloud infrastructure

Amazon Web Services (AWS)
An Amazon subsidiary that provides remote cloud computing services and APIs (application programming interfaces). AWS services include cloud storage, computing power, and networking services.

Amazon Web Services’ Simple Storage Service (AWS S3)
Amazon Web Services’ on-demand public cloud storage service for individuals, companies, and organizations.

Application programming interface (API)
A software intermediary that allows two different pieces of code to communicate with each other. APIs process any data transferred from one program to the other based on defined rules to deliver the request.

AWS S3 bucket
A container for objects or public cloud storage resources in Amazon Web Services’ Simple Storage Service (AWS S3). AWS S3 buckets and their data have to be secured by their owner with security controls like authentication and encryption – a method of making data unreadable to unauthorized personnel.

Azure Blob storage
A cloud storage solution, operated by Microsoft, that’s designed for keeping large stores of unstructured data, such as text or binary data.

Cloud computing
A method of delivering computing services (i.e. networks, servers, storage, applications, and services) on-demand and over the internet.

Cloud storage
A cloud computing model in which a third-party service operates data servers and provides remote data storage and access capabilities to individuals, companies, and/or organizations on-demand and over the internet. Service providers typically secure the physical environment of the servers, but the owners of the data may be required to manage their own digital security.

Data point
A single unit of information. In business, it is a piece of information that has been measured (perhaps through user polling or data analytics) and can be represented on a graph.

Dataset
A collection of related data that can be managed as individual data points or a combination of data points.

Elasticsearch
A search engine that allows users to quickly store, search, and analyze large amounts of unstructured data – unprocessed information that’s stored in its native format. Elasticsearch is often used for storing real-time HTTP logs and software logs.

Google Cloud Storage
An online file storage service on the Google Cloud Platform that allows users to remotely store and access data. Use cases include data analytics, data backups, and media content storage and delivery.

Index
An index refers to a list of data that helps the user query a database. Much like an index in the back of a book, a database’s index is a lookup table that shows users where to navigate for the information they require. Indexes are typically written in plaintext and may show groups of files, or a list of database entries.

Microsoft Azure
A cloud computing platform, operated by Microsoft, that offers remote services for data analytics, data storage, networking, and more.

Structured data
Information that’s been arranged into a formatted repository (usually a database) and adheres to a predefined data model. Structured datasets have a persistent order to facilitate efficient data processing and analysis. Structured data might include names, addresses, phone numbers, and credit card information, for example.

Unstructured data
Data that isn't stored in a structured format. Unstructured datasets may have an internal structure that’s human or machine-generated, but this structure isn’t predefined by a data model. Unstructured data usually features formats that are not easy to store in a structured way, such as PDFs, images, and video files.

Virtual private network (VPN)
An encrypted network that creates a secure connection between the remote user and the internet. A VPN routes the user’s traffic through an encrypted server, making the connection unreadable to other internet users and masking the user’s IP address.

Threats, risks, and impacts

Adware
Software that automatically displays or downloads advertisements onto a user’s system.

Backdoor
A hidden entry point into a device or software that bypasses normal security measures, such as authentication. Developers might leave backdoors to help them quickly troubleshoot, fix issues, and regain control over applications or operating systems. Attackers can also exploit or create backdoors for themselves.

Bot
A computer program designed to automate tasks and perform them repeatedly over an extended duration. Bots typically mimic or replace human actions but operate at a much higher speed. They can serve legitimate purposes, like customer service or web indexing, or they can be malicious, for instance, when bots are used to hijack computers for cryptocurrency mining and other cybercriminal activities.

Botnet
A network of electronic devices infected by malware that are used to carry out cyber attacks without their owner’s knowledge.

Brute force attack
An attack that uses computational power to input a large number of different value combinations. Attackers often use this method to find out passwords and access systems or accounts. Attackers may also brute force URLs on a website to gain unauthorized access to hidden pages.
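
A quick calculation shows why the number of possible value combinations matters. The helper names and the guess rate below are assumptions chosen for illustration, not measurements of any real attack.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of possible values of a given length over a given alphabet."""
    return alphabet_size ** length


def worst_case_seconds(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Time needed to try every combination at an assumed, constant guess rate."""
    return keyspace(alphabet_size, length) / guesses_per_second


# A 4-digit PIN versus an 8-character lowercase password,
# at a hypothetical 1,000 guesses per second.
pin_space = keyspace(10, 4)
password_space = keyspace(26, 8)
print(pin_space, worst_case_seconds(10, 4, 1_000))
print(password_space, worst_case_seconds(26, 8, 1_000))
```

A 4-digit PIN has only 10,000 combinations, so it falls in seconds at this rate. Each extra character or larger alphabet multiplies the search space, which is why long passwords and limits on login attempts make brute forcing far less practical.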

Denial-of-service (DoS)
An attack that overloads a server with traffic, denying legitimate users access to the service or resource.

Distributed-denial-of-service (DDoS)
A denial-of-service attack that uses multiple devices to overload a server with traffic.

Doxing
The act of revealing personally identifiable information (PII) about a person online without their permission. Doxing could include the disclosure of the victim’s name, home address, phone number, banking information, or other personal and private data or content.

Exploit
Software or code designed to take advantage of a software vulnerability or security flaw in a system. Also refers to the act of attempting to breach a system’s security without authorization.

Identity theft
Using another person’s name, personal data, and other identifying characteristics to commit fraud. Identity thieves may apply for credit, file taxes, or purchase medical services in another person’s name.

Insider threat
A security risk originating with a person or group with authorized access to an organization's assets. The behavior may be accidental or malicious, but their misuse of access causes damage to the organization, its resources, data, personnel, facilities, equipment, networks, or systems.

Keystroke logging
Also known as keylogging. The process of using malware to record every pressed key on a user’s keyboard. Keystroke logging is commonly used to obtain users’ plaintext login credentials and credit card information.

Malicious code
Program code designed to execute unauthorized functions that will negatively impact the confidentiality, integrity, or availability of a system or data stored on a system.

Malware
Malicious software or code designed to damage or exploit electronic devices, computer systems, or computer networks.

Ransomware
Malware that blocks access to data or a computer system until the victim pays a fee.
Spyware
Malware that gathers information about a person or organization without their knowledge and relays that information to another entity.
Trojan
Malware that disguises itself as, or hides within, legitimate software to infect the victim’s device.
Virus
A form of malware that can self-replicate once deployed inside a system to spread to other legitimate files, programs, and systems. Viruses can cause damage to host systems and can lead to data loss, disruption, and operational issues.
Worm
Self-replicating malware that, unlike a virus, doesn’t require a host file or program to replicate and spread. A worm can self-replicate in a device’s active memory and spread itself to other computers in the network. Worms may scan for weaknesses or security flaws in different services to spread.

Patching
Applying software or firmware updates to fix bugs and/or vulnerabilities and improve the functionality and/or security of a system.

Payload
The component of malware that executes the malicious activity, such as exfiltrating data or hijacking the system.

Phishing
Untargeted mass-send email campaigns designed to trick users into disclosing private and/or sensitive information (such as bank account details) or clicking malicious links. Malicious links can download malware onto the victim’s device to supplement other forms of data collection or cybercrime.

Spear-phishing
A form of targeted phishing where the email sender poses as a person the recipient knows and/or trusts, or includes information in the email known to interest the target.

Ransomware
See Malware.

RAT
A remote access tool or remote access Trojan is software that allows the user to remotely control a computer from a different location. RATs are used for both legitimate and malicious reasons, such as when hackers use RATs to unlawfully execute commands on another user’s computer.

Risk
An action or circumstance that could lead to the loss of, or damage to, data, hardware, or software.

Social engineering
Manipulating or psychologically tricking someone into disclosing personal or sensitive information or carrying out actions on behalf of the attacker.

Spam
Any unsolicited email that’s sent to large lists of recipients.

Spear-phishing
See Phishing.

Spoofing
IP spoofing involves sending modified data packets to a computer system with a forged IP address to pose as another, trusted computer system. IP spoofing can help cybercriminals hide the true origin of an attack and avoid detection.

Email spoofing involves modifying an email header to appear like it’s from a trusted source, such as the recipient’s bank.

Spyware
See Malware.

SQL injection
A code injection technique in which malicious SQL statements (i.e. pieces of text that serve as valid commands in a database) are inserted into entry fields in a data-driven application. For example, an attacker could trick a vulnerable application into sending them the entire contents of its database.
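
The sketch below reproduces the problem on a throwaway in-memory SQLite database and shows one common mitigation, a parameterized query. The `users` table and both function names are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (name TEXT, email TEXT);
    INSERT INTO users VALUES ('alice', 'alice@example.com'), ('bob', 'bob@example.com');
""")


def find_user_unsafe(name: str):
    # VULNERABLE: user input is pasted straight into the SQL text,
    # so quotes in the input can change the meaning of the statement.
    query = f"SELECT name, email FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str):
    # Parameterized query: the driver passes the input as data, never as SQL.
    return conn.execute(
        "SELECT name, email FROM users WHERE name = ?", (name,)
    ).fetchall()


# Input typed into an entry field by an attacker.
malicious = "' OR '1'='1"

print(find_user_unsafe(malicious))  # condition is always true: every row comes back
print(find_user_safe(malicious))    # no user has that literal name: nothing comes back
```

The unsafe version turns the attacker’s text into part of the `WHERE` clause, so the condition matches every row. The parameterized version compares the whole input against the `name` column as a plain string.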

Supply-chain attack
A cyber attack that finds and targets weak points in an organization’s supply chain to infiltrate that organization’s digital infrastructure.

Threat
Any danger that can exploit a bug, vulnerability, or security flaw in a computer system or network.

Trojan
See Malware.

Virus
See Malware.

Vulnerability
A weakness or flaw in a system that an attacker could exploit to gain unauthorized access to that system.

Zero-day vulnerability
A software vulnerability or bug that vendors or antivirus companies don’t know about, and that hackers may already be exploiting, or can exploit immediately. Named because the company has “zero days” to resolve the issue.

Worm
See Malware.

Encryption

Advanced Encryption Standard (AES)
A fast and highly secure symmetric block cipher that can encrypt and decrypt data. AES derives a series of round keys from its initial key and uses a different one in each round of encryption. The U.S. National Institute of Standards and Technology (NIST) adopted AES as a federal standard in 2001, and it is now widely used around the world for secure encryption.

Asymmetric/Public key encryption
See Encryption.

Cipher
The specific algorithm that can be used to encrypt or decrypt data.

Ciphertext
Encrypted, unreadable text.

Cryptography
A mathematical method to secure communication that converts a plaintext input into an unreadable output, and restores encrypted data to plaintext. Cryptography is used for confidentiality, data integrity, and data origin and entity authentication.

Decryption
The process of converting encrypted text into intelligible plaintext using the correct key.

Encipher/Encrypt
To algorithmically convert plaintext to ciphertext.

Encryption
A mathematical function and security method that makes information unreadable to anyone except those with the correct key to decrypt (or decode) the information.

Asymmetric/Public key encryption
A cryptographic system that uses pairs of uniquely linked keys (a public key and a private key) to encrypt and decrypt information.
Symmetric encryption
A cryptographic system that uses a single, secret key to encrypt and decrypt data. Both the sender and receiver have a copy of the secret key in symmetric encryption, hence the name.
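
The difference between the two systems can be shown with deliberately tiny toy examples. Neither is secure and neither reflects how real products implement encryption: the symmetric half is a simple XOR with a random key, and the asymmetric half is textbook RSA with very small primes. All names and values are chosen for illustration.

```python
import secrets

# --- Symmetric: one shared secret key both encrypts and decrypts ---

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR each byte of data with the matching byte of the key."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"meet at noon"
secret_key = secrets.token_bytes(len(message))  # sender and receiver both hold this
ciphertext = xor_bytes(message, secret_key)     # encrypt with the secret key
recovered = xor_bytes(ciphertext, secret_key)   # decrypt with the same key

# --- Asymmetric: a linked key pair, one public and one private ---

p, q = 61, 53                 # toy primes; real keys use primes hundreds of digits long
n = p * q                     # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                        # public exponent
d = pow(e, -1, phi)           # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)           # can be published
private_key = (d, n)          # must stay confidential


def rsa_apply(value: int, key: tuple) -> int:
    """Raise value to the key's exponent modulo n (textbook RSA, no padding)."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


number = 65
encrypted = rsa_apply(number, public_key)      # anyone can encrypt with the public key
decrypted = rsa_apply(encrypted, private_key)  # only the private key reverses it

print(recovered == message, decrypted == number)
```

In the symmetric half, whoever holds `secret_key` can both lock and unlock the message. In the asymmetric half, the public key can be shared freely because only the matching private key undoes the encryption.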

Hashing
A one-way cryptographic process in which a mathematical algorithm is applied to an input (e.g. a password, a file, or a string of characters) to produce a fixed-length output (“hash value”) that represents the data. Hashing is designed to be irreversible: the original input can’t practically be recovered from the hash. The hash can be used to compare two files or pieces of text, such as two hashed passwords, without storing the original in a readable format.
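
A short sketch with Python’s `hashlib` shows the properties described above. SHA-256 is used here only as a familiar example of a hash function, and the helper name `sha256_hex` and the sample strings are our own.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = sha256_hex("hunter2")
second = sha256_hex("hunter2")
different = sha256_hex("hunter3")
long_input = sha256_hex("a" * 10_000)

print(first == second)                   # same input, same hash
print(first == different)                # a one-character change gives a different hash
print(len(first), len(long_input))       # output length is fixed regardless of input size
```

Comparing `first` and `second` confirms a match without either original string being stored. For stored passwords specifically, a salted and deliberately slow scheme such as PBKDF2 (`hashlib.pbkdf2_hmac`) is generally preferred over a single fast hash.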

HTTPS/TLS/SSL
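HyperText Transfer Protocol Secure (HTTPS) is a secure way to send data between a web browser and a website. HTTPS encrypts communications using another protocol, Transport Layer Security (TLS), the successor to Secure Sockets Layer (SSL). At the start of a connection, the browser and web server use public key cryptography (typically the server’s certificate and private key) to verify the server’s identity and agree on shared session keys. The rest of the communication is then encrypted in both directions with faster symmetric encryption using those session keys.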
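
For readers who want to see TLS from the client side, the sketch below uses Python’s standard `ssl` module. The helper `open_tls` is our own name, the host you pass in is up to you, and calling it requires network access; treat it as a minimal illustration rather than a complete client.

```python
import socket
import ssl

# The default context verifies the server's certificate against trusted
# certification authorities and checks that it matches the hostname.
context = ssl.create_default_context()


def open_tls(host: str, port: int = 443, timeout: float = 5.0):
    """Connect to host over TLS and return the negotiated version and cipher."""
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # wrap_socket performs the TLS handshake before any data is sent.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.version(), tls_sock.cipher()


print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

If the certificate can’t be verified or doesn’t match the hostname, the handshake fails with an error instead of silently continuing, which is what protects users from impersonated sites.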

Plaintext
Information that is unencrypted and readable without requiring a decryption key or device.

Private key
A confidential cryptographic key that enables an asymmetric cryptographic algorithm to decrypt data that was encrypted by its paired public key.

Public key
A publicly known cryptographic key that can enable an asymmetric cryptographic algorithm to encrypt data, which can only be decrypted by the corresponding private key.

Secret key
A cryptographic key that can enable symmetric key cryptography to both encrypt and decrypt data.

Symmetric encryption
See Encryption.

Triple Data Encryption Standard (3-DES)
A symmetric-key block cipher that encrypts each block of data three times. 3-DES is no longer considered secure and has been replaced by AES.

Government agencies and legislation

California Consumer Privacy Act (CCPA)
The CCPA is a data protection law that upholds and protects the privacy rights of California residents and requires adequate data processing standards from organizations. The CCPA is often compared to GDPR, and is similar in its heavy focus on compliance and prevention.

Computer Emergency Response Team (CERT)
A group of information security professionals who detect, analyze, report, respond to, and prevent cybersecurity incidents. Many nations and regions have CERTs that respond to local incidents.

Federal Trade Commission (FTC)
The United States’ trading standards and consumer protection agency. The FTC prevents and punishes anticompetitive, deceptive, and unfair trade practices through law enforcement efforts and various other means. The FTC is often responsible for data protection issues in the US.

Federal Trade Commission Act (FTC Act)
The FTC Act is the primary statute enforced by the FTC. It prohibits unfair and deceptive commercial acts or practices, and empowers the agency to investigate violations and enforce the Act through monetary sanctions and other punishments.

General Data Protection Regulation (GDPR)
The European Union’s data protection regulation that outlines strict data security and privacy requirements for organizations based within the EU, as well as organizations elsewhere that offer goods or services to, or monitor the behavior of, people in the EU.

Health Insurance Portability and Accountability Act (HIPAA)
A US federal law that protects the privacy and security of US citizens’ sensitive patient health information. HIPAA outlines nationwide standards for the use, disclosure, and safeguarding of health data, and gives patients the right to control their health information, while ensuring the proper flow of health data needed to protect public health and well-being.

Information Commissioner’s Office (ICO)
The ICO is a non-departmental public body that’s responsible for data protection in the United Kingdom. The ICO protects the information rights of British citizens and enforces compliance with UK data security and privacy laws, namely the Data Protection Act (DPA), which is the UK’s implementation of GDPR.

Office of the Privacy Commissioner of Canada (OPC)
The OPC is Canada’s dedicated data protection regulator. The OPC upholds, promotes, and protects the privacy rights of Canadian citizens and enforces compliance with Canadian data security and privacy laws.

The Bottom Line

With data breaches reaching their highest ever levels in 2021, you should be aware of data security, privacy, and the threats to your personal information.

We hope you have a clearer understanding of commonly used terms, and a better idea of those more nuanced cybersecurity definitions. These data leak and data breach terms should come in especially handy if you read our data breach reports.

Use this as a call-back resource and share it with any interested friends. It’s important to have at least a basic understanding of cybersecurity to keep our data, and ourselves, safe and secure online.


About the Author

Tom Read Managing Editor

Tom Read is a Content Manager for vpnMentor who loves to research, edit, and write informative articles about privacy and cybersecurity. His work includes a guide to supporting BLM safely online and a statistical breakdown of crime in US schools.
