Glossary: Key Terms and Definitions
A
Action (in robotics): A task or behavior performed by a robot, often associated with ROS 2 action servers and clients for long-running tasks with feedback.
Affordance: The possibility of an action that is perceivable by an agent in the environment; e.g., a handle affords grasping.
AI (Artificial Intelligence): The simulation of human intelligence processes by machines, especially computer systems, including learning, reasoning, and self-correction.
APC (Average Precision at Confidence): A metric used to evaluate object detection models by measuring precision at different confidence thresholds.
API (Application Programming Interface): A set of rules and protocols for building and interacting with software applications.
APT (Action Point Tracking): A method for tracking and executing specific action points in robotic manipulation.
Autonomous System: A system that can operate independently without human intervention, making decisions based on its sensors and programming.
B
Balance Control: The ability of a robot to maintain its center of mass within its support polygon to avoid falling.
Bipedal Locomotion: The act of walking on two legs, a key challenge in humanoid robotics.
Blackwell Model: A model of computation that allows for infinite computations in finite time, though not applicable to robotics.
Bounded Rationality: The idea that decision-making is limited by the information available, the cognitive limitations of the mind, and the time available to make decisions.
Bounding Box: A rectangular box used in computer vision to identify the location of an object in an image.
C
Camera Matrix: The matrix of the pinhole camera model: the 3x3 intrinsic matrix maps 3D points expressed in the camera frame to 2D image coordinates, and combined with the camera's extrinsics it forms the 3x4 projection matrix from world coordinates to the image.
Cartesian Space: The 3D space defined by X, Y, and Z coordinates used to describe positions and orientations in the physical world.
Centroidal Dynamics: The study of the motion of a robot's center of mass and its angular momentum about that point.
Cognitive Architecture: The structural organization of an intelligent agent's mind, including its memory, reasoning, and learning components.
Command Governor: A system component that limits or modifies commands to ensure safety and feasibility.
Convolutional Neural Network (CNN): A class of deep neural networks commonly used in computer vision tasks.
CoP (Center of Pressure): The point on a contact surface at which the resultant of the distributed pressure forces acts, important in balance control.
CoM (Center of Mass): The point where the total mass of a body may be assumed to be concentrated for the purpose of calculations.
Cross-Entropy Loss: A loss function commonly used in classification tasks in machine learning.
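For a single example with true class distribution y and predicted distribution ŷ over C classes, the standard formulation (not specific to any one library) is:

```latex
\mathcal{L}_{\text{CE}} = -\sum_{c=1}^{C} y_c \log \hat{y}_c
```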
CUDA: A parallel computing platform and programming model developed by NVIDIA for general computing on GPUs.
D
Deep Learning: A subset of machine learning that uses neural networks with many layers to model complex patterns in data.
DenseSLAM: A SLAM system that creates dense 3D maps of the environment.
Depth Image: An image where each pixel represents the distance from the camera to the object at that pixel.
Dexterity: The skill and grace in performing tasks, especially with the hands, important for robotic manipulation.
Diffusion Model: A generative model that learns to generate data by reversing a gradual noising process.
Distributed Intelligence: Intelligence that emerges from the interaction of multiple agents or components rather than being centralized.
Domain Randomization: A technique for training models in simulation by randomizing environment parameters to improve transfer to the real world.
Dynamics: The study of forces and torques and their effect on motion, as opposed to kinematics which studies motion without reference to forces.
E
Embodied AI: Artificial intelligence that interacts with the physical world through a physical body or robot.
Embodied Cognition: The theory that cognitive processes are deeply rooted in the body's interactions with the world.
Encoder: A device that measures the position or speed of a rotating shaft, commonly used in robotics for joint position feedback.
End-Effector: The tool or device at the end of a robotic arm designed to interact with the environment.
Episodic Memory: The memory of autobiographical events that can be explicitly stored and recalled, relevant in developmental robotics.
Euclidean Distance: The straight-line distance between two points in Euclidean space.
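For two points p = (p_1, ..., p_n) and q = (q_1, ..., q_n), the standard formula is:

```latex
d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}
```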
Exteroception: Sensory information from the external environment, as opposed to interoception which comes from inside the body.
F
Fiducial Marker: A visual marker with a known geometry and identity used for camera pose estimation and localization.
Field of View (FoV): The extent of the observable world that is seen at any given moment by a camera or sensor.
Focal Length: The distance over which a lens or mirror brings parallel rays of light to focus, important in camera calibration.
Forward Kinematics: The process of calculating the position and orientation of a robot's end-effector based on its joint angles.
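As an illustration, here is a minimal sketch of forward kinematics for a hypothetical two-link planar arm; the link lengths are assumed example values, not parameters from any particular robot.

```python
import math

def forward_kinematics_2link(theta1, theta2, l1=1.0, l2=0.8):
    """Planar 2-link arm: return the (x, y) position of the end-effector
    given joint angles in radians and link lengths in meters."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

# Example: both joints at 45 degrees
print(forward_kinematics_2link(math.pi / 4, math.pi / 4))
```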
Forward Dynamics: The calculation of motion resulting from applied forces, important in physics simulation.
Foveated Vision: A type of vision that mimics human vision with high resolution in the center and lower resolution in the periphery.
Friction Cone: A mathematical representation of the friction forces that can be applied at a contact point between two objects.
G
Gait: The pattern of movement of the limbs in locomotion, particularly important in bipedal robots.
Gaussian Process: A non-parametric method used for regression and classification tasks in machine learning.
General-Purpose Robot: A robot designed to perform a wide variety of tasks rather than being specialized for a specific function.
Geometric Reasoning: The ability to reason about shapes, sizes, positions, and spatial relationships.
GPU (Graphics Processing Unit): A specialized processor designed for massively parallel computation, originally built to render images for display and now widely used to accelerate deep learning and other numerical workloads.
Ground Truth: The gold standard or accurate information used as a reference for evaluating the accuracy of other measurements or models.
Gyroscope: A device used for measuring or maintaining orientation and angular velocity.
H
Haar Cascade: A machine learning object detection method that uses Haar-like features to detect objects in images.
Heuristic: A problem-solving technique that trades optimality or completeness for speed, producing an approximate but workable solution when exact methods are too slow or infeasible.
Humanoid Robot: A robot with human-like characteristics, especially in terms of appearance and behavior.
Hyperparameter: A parameter of a learning algorithm that is not learned from data but set before the learning process begins.
I
IMU (Inertial Measurement Unit): A device that measures and reports a body's specific force, angular rate, and sometimes the magnetic field surrounding the body.
Inception Network: A convolutional neural network architecture whose inception modules apply convolutions of several filter sizes in parallel and concatenate the results.
Inference: The process of using a trained model to make predictions on new data.
Intrinsic Motivation: Motivation that comes from internal factors rather than external rewards, relevant in developmental robotics.
Inverse Kinematics: The process of determining the joint parameters that achieve a desired position of the end-effector.
Isaac ROS: NVIDIA's collection of hardware-accelerated, perception-focused packages designed to seamlessly integrate NVIDIA's AI and robotics technologies with ROS 2.
Isaac Sim: NVIDIA's reference simulation application and synthetic data generation tool for robotics.
J
Jacobian Matrix: The matrix of all first-order partial derivatives of a vector-valued function, used in robotics to relate joint velocities to end-effector velocities.
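In standard robotics notation, where f(q) is the forward kinematics map and x the end-effector pose, the defining relation is:

```latex
\dot{x} = J(q)\,\dot{q}, \qquad J_{ij}(q) = \frac{\partial f_i(q)}{\partial q_j}
```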
Joint Space: The space defined by the joint angles of a robot, as opposed to Cartesian space.
Jumping to Conclusions: A cognitive bias in which hasty decisions are made without sufficient evidence, not directly relevant to robotics.
K
Kalman Filter: An algorithm that uses a series of measurements observed over time to estimate unknown variables, widely used in robotics for state estimation.
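A minimal one-dimensional sketch of the predict/update cycle, assuming a random-walk state model and illustrative noise values (not tied to any particular robot):

```python
def kalman_1d(measurements, q=1e-3, r=0.1):
    """Estimate a scalar state from noisy measurements.
    q: process noise variance, r: measurement noise variance."""
    x, p = 0.0, 1.0  # initial state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: random-walk model, so the state stays put and variance grows
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates

print(kalman_1d([1.1, 0.9, 1.05, 0.98]))
```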
Keras: An open-source neural-network library written in Python, often used for deep learning applications.
Kinematics: The study of motion without considering the forces that cause the motion.
Kinesthetic Teaching: Teaching by physical demonstration where the teacher guides the learner's movements.
L
Lagrangian Dynamics: A reformulation of classical mechanics that describes the evolution of a physical system in terms of kinetic and potential energies.
Lander Model: A model of a spacecraft for landing, not relevant to general robotics.
Language Model: A probability distribution over sequences of words, used in natural language processing.
Latent Space: A representation space in machine learning where data points are represented by vectors that capture the essential features of the data.
Leaky Integrate-and-Fire: A simplified model of spiking neuronal activity used in computational neuroscience and neuromorphic computing.
Legitimate Peripheral Participation: A concept from social learning theory, not directly relevant to robotics.
LiDAR (Light Detection and Ranging): A remote sensing method that uses light in the form of a pulsed laser to measure distances.
Linearization: The process of finding a linear approximation to a function at a given point, used in control systems.
Logit: The log-odds function, used in logistic regression and neural networks.
Loss Function: A function that maps a model's predictions and the true targets to a real number representing the "cost" of the error; training seeks to minimize this value.
M
Manipulability Ellipsoid: A geometric representation of a robot manipulator's dexterity at a particular configuration.
Manipulator: A robot arm designed to manipulate objects in the environment.
Map: A representation of the environment used by a robot for navigation and planning.
Mask R-CNN: A deep neural network architecture that extends Faster R-CNN with a mask prediction branch for object detection and instance segmentation.
Matrix Exponential: A matrix function that extends the scalar exponential function to matrices, used in robotics for rotation representations.
Max Pooling: A pooling operation in neural networks that takes the maximum value from a set of values in a defined window.
Mean Average Precision (mAP): A metric used to evaluate object detection models, computed as the mean of the per-class average precision.
Micro-Batching: Processing multiple samples together in small batches to improve computational efficiency.
Mimetic Faculty: A philosophical concept about imitation and learning, not directly relevant to robotics.
Minimax: A decision rule used in decision theory, game theory, statistics, and philosophy for minimizing the possible loss for a worst case scenario.
Mirror Neuron: A neuron that fires both when an animal acts and when it observes the same action performed by another, often cited in work on imitation learning and developmental robotics.
Model Predictive Control (MPC): An advanced control method that uses a model of the system to predict its future behavior and optimize control inputs over a receding horizon, re-solving the optimization at each time step.
Monte Carlo Method: A broad class of computational algorithms that rely on repeated random sampling to obtain numerical results.
Motion Primitives: Basic building blocks of movement that can be combined to create complex motions.
Motor Babbling: Random motor exploration used by infants to learn about their bodies, relevant in developmental robotics.
Multi-Agent System: A system composed of multiple interacting intelligent agents.
N
Naive Bayes: A family of simple probabilistic classifiers based on applying Bayes' theorem with strong independence assumptions between the features.
Navigation Mesh: A data structure and algorithm used in pathfinding to describe the walkable areas of an environment.
Neural Architecture Search (NAS): The process of automating the design of artificial neural networks.
Neuromorphic Computing: Computing architecture that mimics the neural structure of the human brain.
Newton's Method: An iterative method for finding successively better approximations to the roots of a real-valued function.
Non-Holonomic Constraint: A velocity constraint that cannot be integrated into a constraint on the configuration space alone, common in wheeled robots (e.g., a car cannot translate sideways).
Normalized Device Coordinate: A coordinate system used in computer graphics where the visible region is mapped to a standard range.
Nyquist Rate: The minimum sampling rate, equal to twice the highest frequency present in a signal, required to reconstruct the signal without aliasing.
O
Object Detection: A computer vision technique for identifying and locating objects within an image or video.
Odometry: The use of data from motion sensors to estimate change in position over time.
Omnidirectional Vision: A type of vision system that can see in all directions simultaneously.
Ontology: A formal representation of knowledge as a set of concepts within a domain and the relationships between those concepts.
OpenCV: An open-source computer vision and machine learning software library.
Optical Flow: The pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer and a scene.
Optimization: The selection of the best element from some set of available alternatives, often used in robotics for path planning and control.
Ordinal Optimization: A method for optimization that focuses on comparing and ranking alternatives rather than measuring exact values.
Orthogonal: Perpendicular or independent, used in mathematics and robotics to describe relationships between different dimensions or variables.
Oscillator: A device or system that produces periodic oscillations, used in robotics for generating rhythmic movements.
P
Path Planning: The computational problem of finding a path from a start point to a goal point while avoiding obstacles.
PCA (Principal Component Analysis): A statistical procedure that transforms a set of possibly correlated variables into a smaller set of uncorrelated principal components ordered by explained variance, widely used for dimensionality reduction.
PDDL (Planning Domain Definition Language): A formal language for describing planning domains in artificial intelligence.
Perception-Action Coupling: The tight integration between perception and action in embodied systems.
Perceptual Crossing: An experimental paradigm in embodied cognitive science in which two agents interact by mutually perceiving each other's perceptual activity.
Perspective Projection: The way three-dimensional objects appear to the eye on a two-dimensional surface.
Phase Portrait: A geometric representation of the trajectories of a dynamical system in the phase plane.
PID Controller: A control loop feedback mechanism widely used in industrial control systems.
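A minimal discrete-time sketch of the idea; the gains kp, ki, and kd here are placeholders, not tuned values for any real system.

```python
class PID:
    """Discrete-time PID controller: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement, dt):
        error = setpoint - measurement
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

controller = PID(kp=1.0, ki=0.1, kd=0.05)
print(controller.update(setpoint=1.0, measurement=0.8, dt=0.01))
```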
Point Cloud: A set of data points in space, typically representing the external surface of an object.
Polar Decomposition: A factorization of a matrix into a product of a unitary matrix and a positive-semidefinite Hermitian matrix.
Pose: The position and orientation of an object in space.
Prepared Mind: A concept from scientific discovery about being ready to recognize important findings, not directly relevant to robotics.
Primal-Dual: Methods in optimization that consider both primal and dual formulations of an optimization problem.
Proactive Control: Control strategies that anticipate and prepare for future events rather than just reacting to current events.
Probabilistic Robotics: The application of probabilistic methods to problems in robotics, particularly in dealing with uncertainty.
Projection Matrix: A matrix that projects points from a higher-dimensional space to a lower-dimensional space.
Proprioception: The sense of the relative position of one's own parts of the body and strength of effort being employed in movement.
Q
Q-Learning: A model-free reinforcement learning algorithm that learns the value (quality) of taking each action in each state, from which a control policy can be derived.
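The core update, sketched as a single tabular step; alpha and gamma are assumed example hyperparameters.

```python
def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q[state][action] toward the
    bootstrapped target reward + gamma * max_a Q[next_state][a]."""
    best_next = max(Q[next_state].values())
    target = reward + gamma * best_next
    Q[state][action] += alpha * (target - Q[state][action])

# Tiny example with two states and two actions
Q = {"s0": {"a0": 0.0, "a1": 0.0}, "s1": {"a0": 0.0, "a1": 0.0}}
q_update(Q, "s0", "a1", reward=1.0, next_state="s1")
print(Q)
```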
Quaternion: A number system that extends the complex numbers, commonly used in robotics for representing rotations.
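For reference, a unit quaternion encoding a rotation by angle θ about a unit axis n has the standard form:

```latex
q = \left(\cos\tfrac{\theta}{2},\; \mathbf{n}\sin\tfrac{\theta}{2}\right), \qquad \lVert q \rVert = 1
```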
R
RANSAC (Random Sample Consensus): An iterative method to estimate parameters of a mathematical model from a set of observed data that contains outliers.
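A compact sketch of the idea applied to 2D line fitting; the iteration count and inlier threshold are illustrative choices, not recommendations.

```python
import random

def ransac_line(points, iterations=100, threshold=0.1):
    """Fit a line y = m*x + b to points with outliers by repeatedly
    sampling two points and keeping the model with the most inliers."""
    best_model, best_inliers = None, []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = random.sample(points, 2)
        if x2 == x1:
            continue  # skip vertical sample pairs in this simple sketch
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        inliers = [(x, y) for x, y in points if abs(y - (m * x + b)) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (m, b), inliers
    return best_model, best_inliers

# Example: points on y = 2x + 1 plus one outlier
print(ransac_line([(x, 2 * x + 1) for x in range(10)] + [(3.0, 40.0)]))
```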
Reactive Control: Control strategies that respond directly to sensory input without planning or prediction.
Receptive Field: The region of space in which stimuli will alter the firing of a neuron, important in neural networks and vision systems.
ReLU (Rectified Linear Unit): An activation function defined as the positive part of its argument.
Reinforcement Learning: A type of machine learning where an agent learns to make decisions by performing actions and receiving rewards or penalties.
ResNet (Residual Network): A deep convolutional neural network architecture that uses skip connections to address the vanishing gradient problem.
RGB-D: Color (Red, Green, Blue) and Depth information combined in a single data stream.
Rigid Body Dynamics: The computation of the motion of systems of interconnected rigid bodies under the action of external forces.
Robotics Middleware: Software that provides standard services and capabilities for robotic applications, like ROS 2.
ROS (Robot Operating System): A flexible framework for writing robot software, providing operating system-like functionality on a cluster of heterogeneous computers.
Rotation Matrix: A matrix used to perform a rotation in Euclidean space.
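The familiar planar case, rotating a vector counterclockwise by angle θ; rotation matrices are orthogonal with unit determinant:

```latex
R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \qquad R^{\top} R = I, \quad \det R = 1
```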
S
Saliency Map: A topographic map of the visual field that represents the likelihood that a given location will attract attention.
SARSA (State-Action-Reward-State-Action): An on-policy reinforcement learning algorithm that updates its action-value estimates from the tuple of current state, action, reward, next state, and next action.
Saturated Controller: A controller whose output is limited (saturated) to respect actuator bounds, preventing excessive actuator commands.
Scene Graph: A collection of nodes in a graph and the connections between them, used in computer graphics and simulation.
Semantic Segmentation: The task of associating each pixel in an image with a class label.
Sensor Fusion: The process of combining information from multiple sensors to achieve better accuracy and robustness than possible with a single sensor.
SGD (Stochastic Gradient Descent): An iterative method for optimizing an objective function with suitable smoothness properties.
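The basic parameter update on a mini-batch, where η is the learning rate and L the loss:

```latex
\theta \leftarrow \theta - \eta \, \nabla_{\theta}\, \mathcal{L}\bigl(\theta;\, x_{\text{batch}}\bigr)
```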
Sigmoid Function: A mathematical function having an "S" shaped curve, commonly used as an activation function in neural networks.
SLAM (Simultaneous Localization and Mapping): The computational problem of constructing or updating a map of an unknown environment while simultaneously keeping track of an agent's location within it.
Social Cognition: The study of how people process, store, and apply information about other people and social situations.
Spatial Reasoning: The process of thinking about objects in three dimensions and drawing conclusions about those objects.
State Estimation: The process of estimating the state of a system from noisy or incomplete measurements.
Stochastic Process: A collection of random variables representing the evolution of some random value over time.
Support Vector Machine (SVM): A supervised machine learning model used for classification and regression analysis.
Surprise Minimization: A principle from predictive coding and active inference in which an agent acts to reduce prediction error, explored in some cognitive robotics work.
T
Tensor: A mathematical object that generalizes scalars, vectors, and matrices to higher dimensions, fundamental in deep learning.
TensorFlow: An open-source platform for machine learning developed by Google.
TF (Transforms): The ROS package (tf2 in ROS 2) that keeps track of multiple coordinate frames over time in a tree structure.
TF-IDF (Term Frequency-Inverse Document Frequency): A numerical statistic used in information retrieval and text mining.
Torque: The tendency of a force to cause or change rotational motion of a body.
Trajectory: The path that a moving object follows through space as a function of time.
Transformer Architecture: A deep learning model architecture that uses attention mechanisms to weigh the importance of input data.
Tricycle Kinematics: The study of motion for vehicles with three wheels, relevant in mobile robotics.
Trolley Problem: A thought experiment in ethics, not directly relevant to robotics.
Truth Discovery: The process of identifying trustworthy information from conflicting sources, not directly relevant to robotics.
U
UKF (Unscented Kalman Filter): A variant of the Kalman filter designed for nonlinear systems.
Uncertainty Quantification: The science of quantitative characterization and reduction of uncertainties in both computational and real-world applications.
Unity: A cross-platform game engine that can be used for robotics simulation and visualization.
URDF (Unified Robot Description Format): An XML format for representing a robot model in ROS.
V
Vanishing Gradient: A problem that occurs during the training of deep neural networks where gradients become increasingly small as they propagate backward.
VGGNet: A convolutional neural network architecture known for its simplicity and depth.
ViT (Vision Transformer): A transformer-based architecture for image recognition tasks.
VLA (Vision-Language-Action): A system that integrates visual perception, language understanding, and physical action.
VO (Visual Odometry): The process of incrementally estimating the pose of a vehicle using visual data.
VR (Virtual Reality): A simulated experience that can be similar to or completely different from the real world.
VSLAM (Visual Simultaneous Localization and Mapping): SLAM that uses visual sensors as the primary input.
W
Wheel Odometry: The use of sensors to estimate the change in position of a wheeled robot based on wheel rotation.
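A minimal sketch of differential-drive dead reckoning from wheel travel distances; the wheel-base value is an assumed example parameter.

```python
import math

def diff_drive_odometry(x, y, theta, d_left, d_right, wheel_base=0.5):
    """Update a planar pose (x, y, theta) from left/right wheel travel (meters)."""
    d_center = (d_left + d_right) / 2.0          # distance traveled by the midpoint
    d_theta = (d_right - d_left) / wheel_base    # change in heading
    x += d_center * math.cos(theta + d_theta / 2.0)
    y += d_center * math.sin(theta + d_theta / 2.0)
    theta += d_theta
    return x, y, theta

print(diff_drive_odometry(0.0, 0.0, 0.0, d_left=0.10, d_right=0.12))
```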
Whole-Body Control: A control approach that considers the entire robot body when computing control actions.
World Coordinate System: A fixed reference frame used to describe the position and orientation of objects in the environment.
X
X-bar Theory: A theory of syntactic category formation, not relevant to robotics.
Xeniality: The quality of being friendly to strangers, not relevant to robotics.
Y
Yaw: The rotation of an object around its vertical axis.
Z
ZMP (Zero Moment Point): The point on the ground where the net moment of the ground reaction forces has no horizontal component; keeping it inside the support polygon is a standard stability criterion in bipedal robotics.
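Under the common cart-table (linear inverted pendulum) simplification with constant CoM height z_c, the ZMP in one horizontal direction reduces to:

```latex
x_{\text{ZMP}} = x_{\text{CoM}} - \frac{z_c}{g}\,\ddot{x}_{\text{CoM}}
```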
Z-score: A statistical measurement that describes a value's relationship to the mean of a group of values.