Glossary
Abstraction
The process of hiding complex details and showing only the essential features of a system or concept. It simplifies interaction by allowing users to focus on what something does rather than how it works.
Example:
When you click an icon to open an app, you are using abstraction because you don't need to know the thousands of lines of code running behind the scenes.
Analog Data
Data that is continuous and varies smoothly over time, representing real-world phenomena directly.
Example:
The sound waves produced by a guitar are a form of analog data before they are converted into a digital recording.
Analog vs. Digital
Analog data is continuous and infinitely variable, like a sound wave, while digital data is discrete and represented by distinct, finite values, typically bits.
Example:
A traditional vinyl record stores music as analog grooves, whereas an MP3 file stores it as digital bits.
Bias in Data
Systematic errors or prejudices introduced into a dataset, often due to the way data is collected, sampled, or processed, leading to skewed or unfair conclusions.
Example:
If a survey about internet usage is only given to people with smartphones, it could introduce bias in data by excluding those who primarily use public computers.
Binary Conversion
The process of translating numbers between the decimal (base-10) system, which humans commonly use, and the binary (base-2) system, which computers use.
Example:
To represent the number 10 in a computer, you would perform a binary conversion to get 1010.
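The conversion above can be sketched in a few lines of Python. This is a minimal illustration that assumes non-negative whole numbers; the function names to_binary and from_binary are made up for this example.

```python
def to_binary(n):
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next bit, least significant first
        n //= 2
    return "".join(reversed(digits))


def from_binary(bits):
    """Convert a binary string back to a decimal integer."""
    value = 0
    for bit in bits:
        value = value * 2 + int(bit)  # shift left one place, then add the new bit
    return value


print(to_binary(10))        # 1010
print(from_binary("1010"))  # 10
```

Python's built-in bin() and int(text, 2) do the same job; the loops are written out here only to show the steps.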
Binary System
A base-2 number system that uses only two symbols, 0 and 1, to represent all numerical values.
Example:
Computers operate using the binary system, where all operations and data are represented using combinations of 0s and 1s.
Bits
A bit is the smallest unit of data in a computer, represented as either a 0 or a 1. All digital information is ultimately stored as bits.
Example:
When you type the letter 'A' on your keyboard, the computer stores it as a specific sequence of bits, like 01000001.
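As a small sketch of the example above, the following Python looks up the bit pattern for a character. It assumes single characters whose code fits in 8 bits; char_to_bits and bits_to_char are illustrative names.

```python
def char_to_bits(ch):
    """Return the 8-bit pattern for a single character whose code fits in one byte."""
    code = ord(ch)  # numeric code of the character, e.g. 65 for 'A'
    if code > 255:
        raise ValueError("this sketch only handles codes that fit in 8 bits")
    return format(code, "08b")  # pad with leading zeros to 8 binary digits


def bits_to_char(bits):
    """Turn a string of 0s and 1s back into the character it encodes."""
    return chr(int(bits, 2))


print(char_to_bits("A"))         # 01000001
print(bits_to_char("01000001"))  # A
```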
Byte
A unit of digital information that most commonly consists of eight bits.
Example:
A single character, like the letter 'A', is typically stored as one byte of data.
Context Matters
The principle that the same sequence of bits can represent different types of data (e.g., a number, a character, or a color) depending on how the computer is programmed to interpret them.
Example:
The bit sequence '01000001' could be the number 65, the letter 'A', or the intensity of a pixel's green channel, depending on how the program is written to interpret it.
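One way to see this in practice is to read the same bits three ways. The sketch below is only an illustration; treating the byte as the green part of an RGB color is an assumption made for this example.

```python
bits = "01000001"

as_number = int(bits, 2)        # read the bits as a base-2 integer
as_character = chr(as_number)   # read the same value as a character code
as_color = (0, as_number, 0)    # assumed: use the value as the green part of an (R, G, B) color

print(as_number)     # 65
print(as_character)  # A
print(as_color)      # (0, 65, 0)
```

The bits never change; only the interpretation chosen by the program does.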
Data
Raw, unorganized facts, figures, or symbols that carry little meaning on their own until they are processed.
Example:
The temperature readings from a sensor every minute are raw data until they are analyzed to show a trend.
Data Analysis
The process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
Example:
A scientist performing data analysis on climate records might identify a trend of increasing global temperatures over decades.
Data Cleaning
The process of detecting and correcting (or removing) corrupt or inaccurate records from a dataset, making the data uniform and consistent without changing its meaning.
Example:
Before analyzing customer addresses, a company might perform data cleaning to standardize abbreviations like 'St.' to 'Street' and correct typos.
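A minimal sketch of this kind of cleaning in Python is shown below. The abbreviation table and the clean_address function are hypothetical, and the sketch only standardizes abbreviations and spacing; it does not correct typos.

```python
# Hypothetical table of abbreviations to expand.
ABBREVIATIONS = {"St.": "Street", "Ave.": "Avenue", "Rd.": "Road"}


def clean_address(address):
    """Collapse extra whitespace and expand known abbreviations."""
    words = address.split()  # split() with no arguments also drops repeated spaces
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


print(clean_address("12   Main St."))  # 12 Main Street
print(clean_address("7 Oak Ave. "))    # 7 Oak Avenue
```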
Data Compression
The process of reducing the number of bits required to store or transmit data, making files smaller and faster to send.
Example:
Zipping a folder of documents before emailing it is a form of data compression to reduce the attachment size.
Data Filtering
The process of selecting a subset of data based on specific criteria, allowing users to focus on relevant information and identify patterns.
Example:
When you search for emails from a specific sender in your inbox, you are performing data filtering to narrow down the results.
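The inbox search described above can be sketched as a filter over a list of records. The sample messages and the filter_by_sender function are invented for this illustration.

```python
# Made-up sample inbox; each message is a dictionary.
emails = [
    {"sender": "alice@example.com", "subject": "Lunch?"},
    {"sender": "bob@example.com", "subject": "Report"},
    {"sender": "alice@example.com", "subject": "Notes"},
]


def filter_by_sender(messages, sender):
    """Keep only the messages whose sender matches the given address."""
    return [message for message in messages if message["sender"] == sender]


from_alice = filter_by_sender(emails, "alice@example.com")
print([message["subject"] for message in from_alice])  # ['Lunch?', 'Notes']
```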
Data Manipulation
The process of changing, organizing, or transforming data to make it more useful or to reveal new patterns and insights.
Example:
Combining sales records from different regions into a single report is a form of data manipulation to get a holistic view.
Data Processing Programs
Software applications or scripts designed to manipulate, analyze, and extract insights from datasets.
Example:
Spreadsheet applications like Excel and statistical software packages are examples of data processing programs used to organize and analyze numerical data.
Data Transformation
The process of converting data from one format or structure into another, often to make it compatible with other systems or to prepare it for analysis.
Example:
Converting a list of temperatures from Celsius to Fahrenheit is a data transformation that changes the values but not the underlying meaning.
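As a short sketch, the temperature conversion can be written with the standard formula F = C * 9/5 + 32. The function name and the sample readings are illustrative.

```python
def celsius_to_fahrenheit(celsius):
    """Convert one temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


readings_c = [0, 25, 100]  # made-up sample readings
readings_f = [celsius_to_fahrenheit(c) for c in readings_c]
print(readings_f)  # [32.0, 77.0, 212.0]
```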
Digital Data
Data that is discrete and represented by a finite set of values, typically binary digits (0s and 1s).
Example:
A photograph taken with a smartphone is stored as digital data, composed of individual pixels, each with a specific color value.
Hexadecimal
A base-16 number system that uses 16 distinct symbols (0-9 and A-F) to represent numbers, often used as a more compact way to represent binary data.
Example:
Color codes on websites, like #FF0000 for red, are often expressed in hexadecimal because it's shorter than writing out the full binary sequence.
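To make the link between hexadecimal and binary concrete, the sketch below splits a six-digit color code into its red, green, and blue values. The function name hex_color_to_rgb is an assumption for this example, and only the six-digit form of a color code is handled.

```python
def hex_color_to_rgb(code):
    """Turn a color code like '#FF0000' into a (red, green, blue) tuple of integers."""
    digits = code.lstrip("#")
    if len(digits) != 6:
        raise ValueError("this sketch expects exactly six hexadecimal digits")
    # Each pair of hex digits encodes one byte (0 to 255).
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


red, green, blue = hex_color_to_rgb("#FF0000")
print(red, green, blue)    # 255 0 0
print(format(red, "08b"))  # 11111111 (two hex digits stand for eight bits)
```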
Information
Meaningful insights, patterns, or knowledge derived from processed and organized data.
Example:
After analyzing raw sales figures (data), the conclusion that 'sales increased by 15% last quarter' is information.
Lossless Compression
A type of data compression that allows the original data to be perfectly reconstructed from the compressed data without any loss of information.
Example:
Compressing a text document into a ZIP file uses lossless compression, so every character is exactly the same when the file is unzipped.
Lossless Compression Algorithms
Specific computational methods or formulas used to perform lossless data compression, enabling perfect reconstruction of the original data.
Example:
The run-length encoding algorithm is one of many lossless compression algorithms that can be used to compress images with large areas of uniform color.
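A minimal sketch of run-length encoding on a string of characters follows. Real image formats work on pixel data and use more compact encodings; the function names and the 'W'/'B' sample (standing for white and black pixels) are illustrative.

```python
def rle_encode(data):
    """Collapse each run of repeated characters into a (character, count) pair."""
    runs = []
    for ch in data:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((ch, 1))              # start a new run
    return runs


def rle_decode(runs):
    """Rebuild the original string from (character, count) pairs."""
    return "".join(ch * count for ch, count in runs)


row = "WWWWBBBWW"
encoded = rle_encode(row)
print(encoded)                     # [('W', 4), ('B', 3), ('W', 2)]
print(rle_decode(encoded) == row)  # True
```

Decoding returns exactly the original string, which is what makes the method lossless.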
Lossy Compression
A type of data compression that reduces file size by permanently discarding some data, resulting in a smaller file but with some loss of original quality.
Example:
Saving a high-resolution photo as a small JPEG file often uses lossy compression, which removes some visual detail to achieve a much smaller file size.
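The idea of permanently discarding detail can be sketched with a simple quantization step. This is not how JPEG itself works; it is a simplified stand-in that rounds brightness values down to a coarser scale, using made-up sample values and an assumed step size of 16.

```python
def quantize(values, step=16):
    """Round each value down to a multiple of step, discarding the fine detail."""
    return [(value // step) * step for value in values]


pixels = [12, 13, 14, 200, 203]  # made-up brightness values from 0 to 255
compressed = quantize(pixels)
print(compressed)  # [0, 0, 0, 192, 192]
```

After quantization, 12, 13, and 14 all become 0, so the original values cannot be recovered. In exchange, the long runs of identical values are easier to store compactly.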
Lossy Compression Algorithms
Specific computational methods or formulas used to perform lossy data compression, sacrificing some data for greater file size reduction.
Example:
The Discrete Cosine Transform (DCT) is a key component of the lossy compression used in JPEG images, and a modified variant of it (the MDCT) is used in MP3 audio files.
Machine Code
The low-level programming language consisting of binary instructions that a computer's central processing unit (CPU) can directly execute. It is the fundamental language computers understand.
Example:
A simple command like 'add two numbers' is translated by a compiler into one or more machine code instructions before the computer can perform the operation.
Metadata
Data that provides information about other data, offering context, structure, and organization without changing the primary data itself.
Example:
The date a photo was taken, the camera model used, and the location where it was shot are all examples of metadata associated with an image file.
Metadata Changes
Modifications made to the descriptive information about data, such as changing a file's author or creation date, without altering the content of the data itself.
Example:
If you update the 'Artist' field for a song in your music library, you are making metadata changes without altering the actual audio file.
Number Base
The number of unique digits, including zero, used to represent numbers in a positional numeral system.
Example:
The decimal system uses a number base of 10 (digits 0-9), while the binary system uses a base of 2 (digits 0 and 1).
Overflow Error
An error that occurs when a calculation produces a result that is too large to be represented within the allocated number of bits or memory space.
Example:
If a calculator designed for 8-digit numbers tries to compute 99,999,999 + 1, it might result in an overflow error because the answer requires more digits than it can store.
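The same thing happens with bits instead of decimal digits. Python integers do not overflow on their own, so the sketch below simulates an 8-bit unsigned register by hand; add_8bit is a hypothetical helper, and wrapping around is only one of several ways real systems respond to overflow.

```python
MAX_8BIT = 2 ** 8 - 1  # 255, the largest value that fits in 8 bits


def add_8bit(a, b):
    """Add two values as an 8-bit unsigned register would, reporting overflow."""
    total = a + b
    overflowed = total > MAX_8BIT
    wrapped = total & MAX_8BIT  # keep only the lowest 8 bits
    return wrapped, overflowed


print(add_8bit(100, 27))   # (127, False)
print(add_8bit(255, 1))    # (0, True)
print(add_8bit(200, 100))  # (44, True)
```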
Rounding Error
An error that occurs when a number cannot be represented exactly in a computer's finite number of bits, leading to a slight difference between the true value and its stored representation.
Example:
When you divide 1 by 3, the result is 0.333...; a computer might store it as 0.333333, leading to a tiny rounding error because it cannot store infinite precision.
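Rounding error is easy to observe with everyday decimal fractions, because values such as 0.1 have no exact binary representation. The sketch below uses Python's standard floating-point numbers; comparing with math.isclose is one common way to tolerate the tiny difference.

```python
import math

total = 0.1 + 0.2

print(total)                     # 0.30000000000000004
print(total == 0.3)              # False, because of the tiny rounding error
print(math.isclose(total, 0.3))  # True, comparing within a small tolerance
```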
Sampling Technique
The process of converting analog data into digital data by taking measurements at regular intervals.
Example:
When recording a song, a microphone captures continuous sound waves, and a sampling technique converts these waves into discrete digital points that can be stored on a computer.
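As a rough sketch, sampling can be imitated by measuring a mathematical wave at evenly spaced moments. Here a sine function stands in for the continuous sound wave; the function name, the 1 Hz tone, and the rate of 8 samples per second are assumptions chosen to keep the output small.

```python
import math


def sample_sine(frequency_hz, sample_rate_hz, duration_s):
    """Measure a sine wave at regular intervals and return the list of samples."""
    count = int(sample_rate_hz * duration_s)
    return [
        math.sin(2 * math.pi * frequency_hz * i / sample_rate_hz)
        for i in range(count)
    ]


samples = sample_sine(frequency_hz=1.0, sample_rate_hz=8, duration_s=1.0)
print(len(samples))                       # 8
print([round(s, 2) for s in samples[:3]]) # [0.0, 0.71, 1.0]
```

A higher sample rate takes more measurements per second, which follows the original wave more closely at the cost of storing more data.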