Data Breach and Leak Glossary: Every Technical Term Explained
We’ve created a glossary of cybersecurity terms to help you better understand the technical subjects covered in our data breach reports and blog content.
We cover an array of subtopics, including (but not limited to) hacking, the biggest threats, and the government agencies responsible for protecting your information. But first, let’s take a look at some key definitions.
The ability to communicate or interact with a system, use its resources, handle its data, and/or control its components and functions. A computer accesses the internet through its web browser, modem, and internet service provider.
A set of security policies and processes that include authentication and authorization, and grant or deny requests to access company data and resources, or enter physical facilities. Access controls may ask an employee to use a password to authenticate their identity when signing into a computer, and also limit the files that employee is authorized to use.
The process or action of verifying the identity (or other attributes) of a user, process, or device.
1. Two-factor authentication (2FA)
Authentication that uses two separate processes to identify an individual. For example, a service might ask users to enter their password along with a code sent to their mobile device to gain access.
2. Multi-factor authentication (MFA)
Authentication that uses two or more separate processes to identify an individual.
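To make the "code sent to their mobile device" idea concrete, here is a minimal sketch of how one-time codes are commonly generated, assuming the HOTP/TOTP schemes from RFC 4226 and RFC 6238. The function names (`hotp`, `totp`, `second_factor_ok`) are our own illustrative choices, and a real service would also handle clock drift, rate limiting, and secure storage of the shared secret.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password, following RFC 4226."""
    # Sign the 8-byte big-endian counter with the shared secret.
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, step: int = 30, now: Optional[float] = None) -> str:
    """Time-based variant: the counter is the number of elapsed time steps."""
    current = time.time() if now is None else now
    return hotp(secret, int(current // step))


def second_factor_ok(secret: bytes, submitted: str, now: Optional[float] = None) -> bool:
    """Compare the submitted code with the expected one in constant time."""
    return hmac.compare_digest(totp(secret, now=now), submitted)
```

The password is the first factor (something you know). The code is the second factor because only a device holding the shared secret can produce it.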
The security process by which a server determines the level of access or privileges a user is granted to system resources (i.e. files, services, computer programs, data, and application features).
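As a rough illustration of authorization, the sketch below checks a user's role against a table of permissions. The role names, permission strings, and the `is_authorized` helper are hypothetical and not taken from any particular product.

```python
# Hypothetical mapping of roles to the permissions they grant.
ROLE_PERMISSIONS = {
    "analyst": {"reports:read"},
    "admin": {"reports:read", "reports:write", "users:manage"},
}


def is_authorized(role: str, permission: str) -> bool:
    """Return True only if the role explicitly grants the permission."""
    # Unknown roles get an empty permission set, so access is denied by default.
    return permission in ROLE_PERMISSIONS.get(role, set())
```

Authentication answers "who are you?", while a check like this answers "what are you allowed to do?". Denying by default means a typo or a missing role never grants access by accident.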
Advanced persistent threat
A malicious actor with significant resources and expertise that can leverage multiple attack vectors (e.g. phishing, malware, insider threats) to achieve its goals.
A security program that detects and removes malicious software on a device or computer network.
- Attack surface
The sum of all possible points (attack vectors) in a system environment that an unauthorized user can attack to gain access to data.
- Attack vectors
A method of attack or pathway that a hacker uses to gain unauthorized access to a computer, system, or network and exploit security vulnerabilities. Malware, social engineering, text messages, and web pages are all examples of attack vectors.
- Cyber attack
Malicious efforts to damage, disturb or gain unauthorized access to a computer network or system through the use of one or more computers.
- See also:
Brute force attack
Denial of service (DoS)
Distributed denial of service (DDoS)
A malicious individual, group, organization, or government that exploits a computer system or network to alter, destroy, steal, or damage its data, or disrupt its operation, and benefit from the outcome.
See Access control.
See Access control.
A unique and digitally signed file that acts as an electronic password, identifying an individual or organization and allowing the secure exchange of data online. Websites and applications are given security certificates by trusted third-party certification authorities, including IdenTrust, DigiCert, and Sectigo.
A user’s authentication details that are needed to verify their identity. Could include their password, username, token, or certificate.
A person who commits crimes by using or targeting a computer, computer network, or networked device.
The breach of a system’s security policy in an attempt to either use, control, or gain unauthorized access to that system or its data, or affect its integrity or availability in some way.
The protection of electronic devices, services, or networks, and their data, from unauthorized access, theft, or damage.
An organized repository of structured data or information, often held in a computer system and accessible in different ways.
- Relational database
Databases storing information in tables that can be linked (i.e., tables that can be related) to one another based on common data. This means a single query could produce a new table of data, allowing for greater insights and better informed decision making.
- SQL database
Also known as an SQL server database. An SQL database is a type of relational database that is queried and managed using Structured Query Language (SQL). SQL is the language most widely used for relational databases, largely due to its ability to retrieve multiple records easily.
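The sketch below shows how a single query can combine two related tables into a new result table. It uses Python's built-in `sqlite3` module with an in-memory database. The `customers` and `orders` tables and their contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total REAL NOT NULL
);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0);
""")

# The two tables are related through orders.customer_id = customers.id.
rows = conn.execute("""
SELECT c.name, COUNT(o.id), SUM(o.total)
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.id
ORDER BY c.name
""").fetchall()

for name, order_count, total_spent in rows:
    print(name, order_count, total_spent)
```

Each customer is stored once, and the join links them to any number of orders through the shared `customer_id` value.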
A security incident in which data, systems, or networks are accessed or affected in an unauthorized manner.
Note: Some people use the terms data breach and data leak synonymously to mean any incidence of exposed data.
A security incident in which sensitive or personal data is accidentally disclosed physically, online, or in another public environment due to poor security practices or storage misconfiguration. A data leak could occur when someone loses a hard drive or leaves a database unsecured.
Note: Some people use the terms data breach and data leak synonymously to mean any incidence of exposed data.
The unintentional or accidental deletion, corruption, or exposure of data, or the act of forgetting where data is stored.
See Data protection.
The process and practices involved in protecting critical information from corruption, compromise, or loss, and restoring data to a functional state should it become inaccessible or unusable. Data privacy and data security are both components of data protection.
- Data privacy
The responsible governance and proper handling of an entity’s personal, sensitive, and/or confidential data, and the right of individuals to control how their personal data is collected and used.
- Data security
The defense of digital information from malicious or accidental threats that could misuse, expose, delete, corrupt, or provide unauthorized access to data. Includes data security technologies and methodologies, such as access control.
See Data protection.
A computer, software program, or platform used for database services, such as information storage, processing, and security.
The act of intentionally stealing information.
A trail of data about a user that is left behind as a result of their online activity.
The state of being unprotected and allowing access to data or capabilities that an attacker could use to gain unauthorized access to a system, network, and/or data.
Hardware or software that limits or controls traffic to or from a network based on predetermined rules. Firewalls are designed to prevent unauthorized access to or from a network.
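The "predetermined rules" a firewall applies can be pictured as an ordered list that is checked for every connection. The following is a simplified sketch, assuming a first-match-wins policy and a default of deny. The `Rule` class, the `decide` function, and the example addresses are hypothetical and do not reflect any specific firewall product.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    action: str           # "allow" or "deny"
    network: str          # source range in CIDR notation
    port: Optional[int]   # destination port, or None for any port


RULES: List[Rule] = [
    Rule("allow", "10.0.0.0/8", 22),   # SSH only from the internal range
    Rule("allow", "0.0.0.0/0", 443),   # HTTPS from anywhere
]


def decide(src_ip: str, dst_port: int, rules: List[Rule] = RULES) -> str:
    """Return the action of the first matching rule, or "deny" if none match."""
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        in_range = addr in ipaddress.ip_network(rule.network)
        port_matches = rule.port in (None, dst_port)
        if in_range and port_matches:
            return rule.action
    return "deny"
```

With these rules, `decide("10.1.2.3", 22)` is allowed, while the same port from an outside address such as `203.0.113.5` falls through to the default and is denied.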
- Black hat hacker
Hackers who exclusively work to cause damage for personal gain. They are unconcerned with laws and unbounded by ethical codes, and use their technical skills to bypass security protocols, conduct cyber espionage, or employ malware, usually for profit. They engage in activities like selling data breaches on the dark web, encrypting data and holding it for ransom, or stealing information to extort money from their victims.
- Gray hat hacker
Hackers who have neither the ethical code of white hat hackers nor the malicious intent of black hat hackers. Gray hat hackers use their technical knowledge to find and explore a system or network’s vulnerabilities, acting without the owner’s permission in ways that many would say are unethical and that are sometimes illegal. Unlike black hat hackers, they don’t exploit the vulnerabilities they find for profit. Instead, gray hat hackers are likely to report vulnerabilities to system/network owners at first and may offer to repair vulnerabilities for a fee. They might escalate to full public disclosure if the vulnerability remains unfixed.
Hackers who use their technical skills in displays of social, ideological, religious, or political messaging. They are usually unconcerned with the ethics of white hat hackers, and are often willing to break laws for the sake of what they see as a greater good. Hacktivists may deface or immobilize websites or steal and release data to achieve their aim.
- Nation-state hacker
A hacker or hacking group that works on behalf of a government to disrupt and compromise, or access the critical data of, target governments, organizations, and/or individuals.
- White hat hacker
Also known as ethical hackers, white hat hackers may work in research labs or for reputable companies where strict ethical guidelines are followed. Their work may include performing penetration tests or engaging in bug-bounty programs where their intention isn’t to cause harm, but to uncover weaknesses or vulnerabilities within a system or network so that they can be fixed.
Poorly configured or insecure security controls that place systems and data at risk. For example, a database that’s configured without password protection is misconfigured.
An incorrect or substandard configuration of security controls that places data at risk; the state of security controls being misconfigured.
Multi-factor authentication (MFA)
See Access control.
A method of evaluating the cybersecurity of a network and/or system by intentionally attacking the network/system to uncover vulnerabilities, before the assessors relay this information to the network/system administrator.
(1) In the EU’s General Data Protection Regulation (GDPR), personal data is any information that relates to an identified or identifiable person, either directly (the data itself) or indirectly (when used with other data). This includes Personally identifiable information (PII), along with various other details such as photos and preferences. (2) In other jurisdictions, personal data simply means PII.
Sensitive personal data
In the EU’s GDPR, sensitive personal data encompasses the most critical forms of personal data, including health-related information, ethnic origin, political views, religious views, sexual preferences, etc. Sensitive personal data could cause the greatest harm if exposed, and GDPR protects it with stricter controls regarding its collection, processing, use, and storage.
Note: The definitions of Personal data (1) and Sensitive personal data are taken from the EU’s General Data Protection Regulation (GDPR). These terms may be interpreted differently in various jurisdictions around the world.
Personally identifiable information (PII)
Information that can identify an individual when used alone or alongside other relevant information. Includes data such as name, address, date of birth, social security number (SSN), and banking information.
A process that allows security researchers to safely and efficiently report vulnerabilities in applications and IT infrastructure to the relevant organizations or individuals.
A set of rules governing how an organization and employees properly access data, resources, and assets to minimize the risk of security threats and protect those assets.
Sensitive company data
Information that poses a risk to the company if exposed to another company or the public. Includes intellectual properties, trade secrets, business plans, and more.
Sensitive personal data
See Personal data.
A computer or computer program that provides data, resources, programs, or services to another computer and its user (i.e. the client).
Structured query language (SQL) code
A programming language, characterized by simple, directive statements, that helps users operate relational databases and manage stored information. SQL ensures fast, easy, and accessible insertion, deletion, or retrieval of data.
Structured query language (SQL) database
Also known as an SQL server database. See Database.
A token is an object that represents the right to perform a specific action. Tokens can be either software or hardware. For example, authentication tokens can be hard tokens, which are physical security devices that unlock access to a resource or facility, or soft tokens, which are software on electronic devices (e.g. single-use authentication codes).
Two-factor authentication (2FA)
See Access control.
Any access to a network, system, application, data, or other resource that is not permitted and violates the security policy.
Not secure, safe, protected, or free from the risk of loss. A database is unsecured if it doesn’t adopt adequate security controls, such as password protection.
Data storage and cloud infrastructure
Amazon Web Services (AWS)
An Amazon subsidiary that provides remote cloud computing services and APIs (application programming interfaces). AWS services include cloud storage, computing power, and networking services.
Application programming interface (API)
A software intermediary that allows two different pieces of code to communicate with each other. APIs process any data transferred from one program to the other based on defined rules to deliver the request.
AWS S3 bucket
A container for objects or public cloud storage resources in Amazon Web Services’ Simple Storage Service (AWS S3). AWS S3 buckets and their data have to be secured by their owner with security controls like authentication and encryption – a method of making data unreadable to unauthorized personnel.
Azure Blob storage
A cloud storage solution, operated by Microsoft, that’s designed for keeping large stores of unstructured data, such as text or binary data.
A method of delivering computing services (i.e. networks, servers, storage, applications, and services) on-demand and over the internet.
A cloud computing model in which a third-party service operates data servers and provides remote data storage and access capabilities to individuals, companies, and/or organizations on-demand and over the internet. Service providers typically secure the physical environment of the servers, but the owners of the data may be required to manage their own digital security.
A collection of related data that can be managed as individual data points or a combination of data points.
A search engine that allows users to quickly store, search, and analyze large amounts of unstructured data – unprocessed information that’s stored in its native format. Elasticsearch is often used for storing real-time HTTP logs and software logs.
Google Cloud Storage
An online file storage service on the Google Cloud Platform that allows users to remotely store and access data. Use cases include data analytics, data backups, and media content storage and delivery.
An index refers to a list of data that helps the user query a database. Much like an index in the back of a book, a database’s index is a lookup table that shows users where to navigate for the information they require. Indexes are typically written in plaintext and may show groups of files, or a list of database entries.
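The lookup-table idea can be sketched in a few lines. Here a plain Python dictionary plays the role of the index, mapping a value to the position of its record so a search doesn't have to scan every entry. The records and the `find_by_email` helper are invented for illustration, and real database indexes use more elaborate structures.

```python
records = [
    {"id": 1, "email": "ada@example.com"},
    {"id": 2, "email": "grace@example.com"},
    {"id": 3, "email": "alan@example.com"},
]

# Build the index once: each email points to the position of its record.
email_index = {record["email"]: pos for pos, record in enumerate(records)}


def find_by_email(email: str):
    """Jump straight to the matching record, or return None if it is absent."""
    pos = email_index.get(email)
    return None if pos is None else records[pos]
```

Note that the index itself contains real data (here, email addresses), which is why an exposed index can reveal information even when the underlying records are not directly visible.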
A cloud computing platform, operated by Microsoft, that offers remote services for data analytics, data storage, networking, and more.
Information that’s been arranged into a formatted repository (usually a database) and adheres to a predefined data model. Structured datasets have a persistent order to facilitate efficient data processing and analysis. Structured data might include names, addresses, phone numbers, and credit card information, for example.
Data that isn't stored in a structured format. Unstructured datasets may have an internal structure that’s human or machine-generated, but this structure isn’t predefined by a data model. Unstructured data usually features formats that are not easy to store in a structured way, such as PDFs, images, and video files.
Virtual private network (VPN)
An encrypted network that creates a secure connection between the remote user and the internet. A VPN routes the user’s traffic through an encrypted server, making the connection unreadable to other internet users and masking the user’s IP address.
Threats, risks, and impacts
Software that automatically displays or downloads advertisements onto a user’s system.
A hidden entry point into a device or software that bypasses normal security measures, such as authentication. Developers might leave backdoors to help them quickly troubleshoot, fix issues, and regain control over applications or operating systems. Attackers can also exploit or create backdoors for themselves.
A computer program that automates tasks and performs them repeatedly over an extended period. Bots typically mimic or replace human actions but operate at a much higher speed. They can serve legitimate purposes, like customer service or web indexing, or malicious ones, such as hijacking computers for cryptocurrency mining and other cybercriminal activities.
A network of electronic devices infected by malware that are used to carry out cyber attacks without their owner’s knowledge.
Brute force attack
An attack that uses computational power to input a large number of different value combinations. Attackers often use this method to find out passwords and access systems or accounts. Attackers may also brute force URLs on a website to gain unauthorized access to hidden pages.
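To show why short or simple passwords fall quickly to this kind of attack, the sketch below counts the possible combinations for a given alphabet and length, then searches a deliberately tiny space (4-digit PINs) against a known hash. The helper names are our own, and the unsalted SHA-256 hash is used only to keep the example short.

```python
import hashlib
import itertools
import string
from typing import Optional


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of candidate values an attacker may have to try."""
    return alphabet_size ** length


def brute_force_pin(target_hash: str, length: int = 4) -> Optional[str]:
    """Try every digit combination until one hashes to the target."""
    for combo in itertools.product(string.digits, repeat=length):
        guess = "".join(combo)
        if hashlib.sha256(guess.encode()).hexdigest() == target_hash:
            return guess
    return None


print(keyspace(10, 4))   # 4-digit PIN
print(keyspace(62, 12))  # 12 characters drawn from letters and digits
```

A 4-digit PIN has only 10,000 possibilities, which a computer can exhaust almost instantly. Each extra character multiplies the search space, which is why longer passwords, account lockouts, and rate limiting are common defenses.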
- Distributed-denial-of-service (DDoS)
A denial-of-service attack that uses multiple devices to overload a server with traffic.
The act of revealing personally identifiable information (PII) about a person online without their permission. Doxing could include the disclosure of the victim’s name, home address, phone number, banking information, or other personal and private data or content.
Software or code designed to take advantage of a software vulnerability or security flaw in a system. Also refers to the act of attempting to breach a system’s security without authorization.
Using another person’s name, personal data, and other identifying characteristics to commit fraud. Identity thieves may apply for credit, file taxes, or purchase medical services in another person’s name.
A security risk originating with a person or group with authorized access to an organization's assets. The behavior may be accidental or malicious, but their misuse of access causes damage to the organization, its resources, data, personnel, facilities, equipment, networks, or systems.
Or keylogging. The process of using malware to record every pressed key on a user’s keyboard. Keystroke logging is commonly used to obtain users’ plaintext login credentials and credit card information.
Program code designed to execute unauthorized functions that will negatively impact the confidentiality, integrity, or availability of a system or data stored on a system.
Malware that blocks access to data or a computer system until the victim pays a fee.
Malware that gathers information about a person or organization without their knowledge and relays that information to another entity.
Malware that disguises itself as, or hides within, legitimate software to infect the victim’s device.
A form of malware that can self-replicate once deployed inside a system to spread to other legitimate files, programs, and systems. Viruses can cause damage to host systems and can lead to data loss, disruption, and operational issues.
A computer virus that doesn’t require a host file or program to replicate and spread, but can self-replicate in a device’s active memory and spread itself to other computers in the network. Worms may scan for weaknesses or security flaws in different services to spread.
Applying software or firmware updates to fix bugs and/or vulnerabilities and improve the functionality and/or security of a system.
The component of malware that executes the malicious activity, such as exfiltrating data or hijacking the system.
Untargeted mass-send email campaigns designed to trick users into disclosing private and/or sensitive information (such as bank account details) or clicking malicious links. Malicious links can download malware onto the victim’s device to supplement other forms of data collection or cybercrime.
A form of targeted phishing where the email sender poses as a person the recipient knows and/or trusts, or includes information in the email known to interest the target.
A remote access tool or remote access Trojan is software that allows the user to remotely control a computer from a different location. RATs are used for both legitimate and malicious reasons, such as when hackers use RATs to unlawfully execute commands on another user’s computer.
An action or circumstance that could lead to the loss of, or damage to, data, hardware, or software.
IP spoofing involves sending modified data packets to a computer system with a forged IP address to pose as another, trusted computer system. IP spoofing can help cybercriminals hide the true origin of an attack.
Email spoofing involves modifying an email header to appear like it’s from a trusted source, such as the recipient’s bank.
An SQL code injection technique where malicious SQL statements (i.e. pieces of text that serve as valid commands in a database) are inserted into entry fields in a data-driven application. For example, an attacker could instruct a vulnerable database to send them its entire contents.
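The difference between a vulnerable and a safer query can be sketched with Python's built-in `sqlite3` module. The `users` table, its contents, and the two lookup functions are hypothetical. The first function builds the SQL statement by pasting user input into it, while the second passes the input as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (name TEXT, secret TEXT);
INSERT INTO users VALUES ('alice', 'a-secret'), ('bob', 'b-secret');
""")


def lookup_unsafe(name: str):
    # VULNERABLE: the input becomes part of the SQL statement itself.
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def lookup_safe(name: str):
    # Safer: the "?" placeholder keeps the input as data, never as SQL.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


payload = "x' OR '1'='1"
print(len(lookup_unsafe(payload)))  # every row matches the injected condition
print(len(lookup_safe(payload)))    # no user has this literal name
```

In the unsafe version the payload turns the condition into `name = 'x' OR '1'='1'`, which is true for every row, so the whole table is returned. Parameterized queries are a widely recommended defense, though they are one layer among several.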
Any danger that can exploit a bug, vulnerability, or security flaw in a computer system or network.
A weakness or flaw in a system that an attacker could exploit to gain unauthorized access to that system.
- Zero-day vulnerability
A software vulnerability or bug that vendors or antivirus companies don’t know about, and that hackers may already be exploiting, or can exploit immediately. Named because the company has “zero days” to resolve the issue.
See Malware: Virus.
Advanced Encryption Standard (AES)
A fast and highly secure symmetric block cipher that can encrypt and decrypt data. AES expands its initial key into a series of round keys, one for each round of processing applied to a block of data. The U.S. government adopted AES as a federal encryption standard, and it is now used widely around the world.
Asymmetric/Public key encryption
Encrypted, unreadable text.
A mathematical method to secure communication that converts a plaintext input into an unreadable output, and restores encrypted data to plaintext. Cryptography is used for confidentiality, data integrity, and data origin and entity authentication.
The process of converting encrypted text into intelligible plaintext using the correct key.
To algorithmically convert plaintext to cipher text.
- Asymmetric/Public key encryption
A cryptographic system that uses pairs of uniquely linked keys (a public key and a private key) to encrypt and decrypt information.
- Symmetric encryption
A cryptographic system that uses a single, secret key to encrypt and decrypt data. Both the sender and receiver have a copy of the secret key in symmetric encryption, hence the name.
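The two approaches can be contrasted with toy examples. Both are illustrations only and must not be used to protect real data: the symmetric half is a simple XOR with a random one-time key, and the asymmetric half is "textbook" RSA with tiny primes and no padding. All names and values below are our own choices.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the matching byte of the key."""
    return bytes(d ^ k for d, k in zip(data, key))


# Symmetric: the SAME secret key encrypts and decrypts.
message = b"meet at noon"
shared_key = secrets.token_bytes(len(message))
ciphertext = xor_bytes(message, shared_key)
recovered = xor_bytes(ciphertext, shared_key)

# Asymmetric: a linked key pair. Anyone may use the public key (e, n),
# but only the holder of the private exponent d can decrypt.
p, q = 61, 53              # toy primes, far too small for real use
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (requires Python 3.8+)

m = 65                     # a message encoded as a number smaller than n
c = pow(m, e, n)           # encrypt with the public key
m_back = pow(c, d, n)      # decrypt with the private key
```

In the symmetric case both parties must somehow share `shared_key` in advance. In the asymmetric case the public key can be published openly, which is why the two approaches are often combined in practice.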
A one-way cryptographic process in which a mathematical algorithm is applied to an input (i.e. a given key or string of characters) to produce a numeric output (“hash value”) that represents the data. Hashing is irreversible and creates a fixed-length value, or hash. The hash can be used to compare two files or pieces of text, such as two hashed passwords, without storing the original in a readable format.
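Here is a short sketch of both properties using Python's standard library: a hash always has the same length regardless of input size, and a stored password hash can be checked without keeping the password itself. The function names, the PBKDF2 choice, and the iteration count are our own assumptions for illustration rather than a recommendation for any particular system.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Fixed-length output: both digests are 64 hex characters long.
short_hash = hashlib.sha256(b"a").hexdigest()
long_hash = hashlib.sha256(b"a" * 10_000).hexdigest()


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a salted hash, so equal passwords do not share a stored value."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Only the salt and the digest need to be stored. When a user logs in, the service hashes the submitted password the same way and compares the results.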
HyperText Transfer Protocol Secure (HTTPS) is a secure way to send data between a web browser and a website. HTTPS encrypts communications using another protocol, Transport Layer Security (TLS), the successor to Secure Sockets Layer (SSL). When a connection is set up, the web server proves its identity with a certificate, and public key cryptography is used to agree on temporary symmetric session keys. Those session keys then encrypt the data sent in both directions.
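On the client side, much of this protection comes from verifying the server's certificate before any data is exchanged. The sketch below uses Python's built-in `ssl` module to create a client context with its default settings. The `open_tls_connection` helper is a hypothetical wrapper that is defined here but not called, so the example makes no network connection.

```python
import socket
import ssl

# Default client settings: verify the certificate chain and the hostname.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)


def open_tls_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection and wrap it in TLS (illustrative helper)."""
    raw = socket.create_connection((host, port), timeout=10)
    # server_hostname lets the handshake check that the certificate
    # was issued for the site we meant to reach.
    return context.wrap_socket(raw, server_hostname=host)
```

If certificate or hostname checks are disabled, the traffic is still encrypted, but the client can no longer be sure who is on the other end.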
A publicly known cryptographic key that can enable an asymmetric cryptographic algorithm to encrypt data, which can only be decrypted by the corresponding private key.
A cryptographic key that can enable symmetric key cryptography to both encrypt and decrypt data.
Triple Data Encryption Standard (3-DES)
A symmetric-key block cipher that applies the older DES cipher to each block of data three times. 3-DES is no longer considered secure and has largely been replaced by AES.
Government agencies and legislation
California Consumer Privacy Act (CCPA)
The CCPA is a data protection law that upholds and protects the privacy rights of California residents and requires adequate data processing standards from organizations. The CCPA is often compared to GDPR and is similar with regard to its heavy focus on compliance and prevention.
Computer Emergency Response Team (CERT)
A group of information security professionals who detect, analyze, report, respond to, and prevent cybersecurity incidents. Many nations and regions have CERTs that respond to local incidents.
Federal Trade Commission (FTC)
The United States’ trading standards and consumer protection agency. The FTC prevents and punishes anticompetitive, deceptive, and unfair trade practices through law enforcement efforts and various other means. The FTC is often responsible for data protection issues in the US.
- Federal Trade Commission Act (FTC Act)
The FTC Act is the primary statute enforced by the FTC that defines unfair and deceptive commercial acts or practices, and empowers the agency to investigate and enforce the FTC Act through monetary sanctions and other punishments.
General Data Protection Regulation (GDPR)
The European Union’s data protection regulation that outlines strict data security and privacy requirements for organizations based within the EU, and for those outside it that offer goods or services to, or monitor the behavior of, people in the EU.
Health Insurance Portability and Accountability Act (HIPAA)
A US federal law that protects the privacy and security of sensitive patient health information. HIPAA outlines nationwide standards for the use, disclosure, and safeguarding of health data, and gives patients the right to control their health information, while ensuring the proper flow of health data needed to protect public health and well-being.
Information Commissioner’s Office (ICO)
The ICO is a non-departmental public body that’s responsible for data protection in the United Kingdom. The ICO protects the information rights of British citizens and enforces compliance with UK data security and privacy laws, namely the Data Protection Act 2018 (DPA), which is the UK’s implementation of GDPR.
Office of the Privacy Commissioner of Canada (OPC)
The OPC is Canada’s dedicated data protection regulator. The OPC upholds, promotes, and protects the privacy rights of Canadian citizens and enforces compliance with Canadian data security and privacy laws.
The Bottom Line
With data breaches reaching their highest ever levels in 2021, you should be aware of data security, privacy, and the threats to your personal information.
We hope you have a clearer understanding of commonly used terms, and a better idea of those more nuanced cybersecurity definitions. These data leak and data breach terms should come in especially handy if you read our data breach reports.
Use this as a call-back resource and share it with any interested friends. It’s important to have at least a basic understanding of cybersecurity to keep our data, and ourselves, safe and secure online.