Glossary: Key Terms and Definitions
A
Action (in robotics): A task or behavior performed by a robot, often associated with ROS 2 action servers and clients for long-running tasks with feedback.
Affordance: The possibility of an action that is perceivable by an agent in the environment; e.g., a handle affords grasping.
AI (Artificial Intelligence): The simulation of human intelligence processes by machines, especially computer systems, including learning, reasoning, and self-correction.
APC (Average Precision at Confidence): A metric used to evaluate object detection models by measuring precision at different confidence thresholds.
API (Application Programming Interface): A set of rules and protocols for building and interacting with software applications.
APT (Action Point Tracking): A method for tracking and executing specific action points in robotic manipulation.
Autonomous System: A system that can operate independently without human intervention, making decisions based on its sensors and programming.
B
Balance Control: The ability of a robot to maintain its center of mass within its support polygon to avoid falling.
Bipedal Locomotion: The act of walking on two legs, a key challenge in humanoid robotics.
Blackwell Model: A model of computation that allows for infinite computations in finite time, though not applicable to robotics.
Bounded Rationality: The idea that decision-making is limited by the information available, the cognitive limitations of the mind, and the time available to make decisions.
Bounding Box: A rectangular box used in computer vision to identify the location of an object in an image.
C
Camera Matrix: The matrix of the pinhole camera model: the 3x3 intrinsic matrix maps 3D points expressed in the camera frame to 2D image coordinates, and combined with the camera's extrinsics it forms the 3x4 projection matrix from world coordinates to the image.
Cartesian Space: The 3D space defined by X, Y, and Z coordinates used to describe positions and orientations in the physical world.
Centroidal Dynamics: The study of the motion of a robot's center of mass and its angular momentum about that point.
Cognitive Architecture: The structural organization of an intelligent agent's mind, including its memory, reasoning, and learning components.
Command Governor: A system component that limits or modifies commands to ensure safety and feasibility.
Convolutional Neural Network (CNN): A class of deep neural networks commonly used in computer vision tasks.
CoP (Center of Pressure): The point on a contact surface at which the resultant of the distributed pressure forces acts, important in balance control.
CoM (Center of Mass): The point where the total mass of a body may be assumed to be concentrated for the purpose of calculations.
Cross-Entropy Loss: A loss function commonly used in classification tasks in machine learning.
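For a single example with true class distribution y and predicted distribution ŷ over C classes, the standard formulation (not specific to any one library) is:

```latex
\mathcal{L}_{\text{CE}} = -\sum_{c=1}^{C} y_c \log \hat{y}_c
```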
CUDA: A parallel computing platform and programming model developed by NVIDIA for general computing on GPUs.
D
Deep Learning: A subset of machine learning that uses neural networks with many layers to model complex patterns in data.
DenseSLAM: A SLAM system that creates dense 3D maps of the environment.
Depth Image: An image where each pixel represents the distance from the camera to the object at that pixel.
Dexterity: The skill and grace in performing tasks, especially with the hands, important for robotic manipulation.
Diffusion Model: A generative model that learns to generate data by reversing a gradual noising process.
Distributed Intelligence: Intelligence that emerges from the interaction of multiple agents or components rather than being centralized.
Domain Randomization: A technique for training models in simulation by randomizing environment parameters to improve transfer to the real world.
Dynamics: The study of forces and torques and their effect on motion, as opposed to kinematics which studies motion without reference to forces.
E
Embodied AI: Artificial intelligence that interacts with the physical world through a physical body or robot.
Embodied Cognition: The theory that cognitive processes are deeply rooted in the body's interactions with the world.
Encoder: A device that measures the position or speed of a rotating shaft, commonly used in robotics for joint position feedback.
End-Effector: The tool or device at the end of a robotic arm designed to interact with the environment.
Episodic Memory: The memory of autobiographical events that can be explicitly stored and recalled, relevant in developmental robotics.
Euclidean Distance: The straight-line distance between two points in Euclidean space.
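For two points p = (p_1, ..., p_n) and q = (q_1, ..., q_n), the standard formula is:

```latex
d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}
```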
Exteroception: Sensory information from the external environment, as opposed to interoception which comes from inside the body.
F
Fiducial Marker: A visual marker with a known geometry and identity used for camera pose estimation and localization.
Field of View (FoV): The extent of the observable world that is seen at any given moment by a camera or sensor.
Focal Length: The distance over which a lens or mirror brings parallel rays of light to focus, important in camera calibration.
Forward Kinematics: The process of calculating the position and orientation of a robot's end-effector based on its joint angles.
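As an illustration, here is a minimal sketch of forward kinematics for a hypothetical two-link planar arm; the link lengths are assumed example values, not parameters from any particular robot.

```python
import math

def forward_kinematics_2link(theta1, theta2, l1=1.0, l2=0.8):
    """Planar 2-link arm: return the (x, y) position of the end-effector
    given joint angles in radians and link lengths in meters."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

# Example: both joints at 45 degrees
print(forward_kinematics_2link(math.pi / 4, math.pi / 4))
```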
Forward Dynamics: The calculation of motion resulting from applied forces, important in physics simulation.
Foveated Vision: A type of vision that mimics human vision with high resolution in the center and lower resolution in the periphery.
Friction Cone: A mathematical representation of the friction forces that can be applied at a contact point between two objects.
G
Gait: The pattern of movement of the limbs in locomotion, particularly important in bipedal robots.
Gaussian Process: A non-parametric method used for regression and classification tasks in machine learning.
General-Purpose Robot: A robot designed to perform a wide variety of tasks rather than being specialized for a specific function.
Geometric Reasoning: The ability to reason about shapes, sizes, positions, and spatial relationships.
GPU (Graphics Processing Unit): A specialized processor designed for massively parallel computation, originally built to render images for display and now widely used to accelerate deep learning and other numerical workloads.
Ground Truth: The gold standard or accurate information used as a reference for evaluating the accuracy of other measurements or models.
Gyroscope: A device used for measuring or maintaining orientation and angular velocity.
H
Haar Cascade: A machine learning object detection method that uses Haar-like features to detect objects in images.
Heuristic: A problem-solving technique that trades optimality or completeness for speed, producing an approximate but workable solution when exact methods are too slow or infeasible.
Humanoid Robot: A robot with human-like characteristics, especially in terms of appearance and behavior.
Hyperparameter: A parameter of a learning algorithm that is not learned from data but set before the learning process begins.
I
IMU (Inertial Measurement Unit): A device that measures and reports a body's specific force, angular rate, and sometimes the magnetic field surrounding the body.
Inception Network: A convolutional neural network architecture whose inception modules apply convolutions of several filter sizes in parallel and concatenate the results.
Inference: The process of using a trained model to make predictions on new data.
Intrinsic Motivation: Motivation that comes from internal factors rather than external rewards, relevant in developmental robotics.
Inverse Kinematics: The process of determining the joint parameters that achieve a desired position of the end-effector.
Isaac ROS: NVIDIA's collection of hardware-accelerated, perception-focused packages designed to seamlessly integrate NVIDIA's AI and robotics technologies with ROS 2.
Isaac Sim: NVIDIA's reference simulation application and synthetic data generation tool for robotics.
J
Jacobian Matrix: The matrix of all first-order partial derivatives of a vector-valued function, used in robotics to relate joint velocities to end-effector velocities.
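In standard robotics notation, where f(q) is the forward kinematics map and x the end-effector pose, the defining relation is:

```latex
\dot{x} = J(q)\,\dot{q}, \qquad J_{ij}(q) = \frac{\partial f_i(q)}{\partial q_j}
```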
Joint Space: The space defined by the joint angles of a robot, as opposed to Cartesian space.
Jumping to Conclusions: A cognitive bias in which hasty decisions are made without sufficient evidence, not directly relevant to robotics.
K
Kalman Filter: An algorithm that uses a series of measurements observed over time to estimate unknown variables, widely used in robotics for state estimation.
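A minimal one-dimensional sketch of the predict/update cycle, assuming a random-walk state model and illustrative noise values (not tied to any particular robot):

```python
def kalman_1d(measurements, q=1e-3, r=0.1):
    """Estimate a scalar state from noisy measurements.
    q: process noise variance, r: measurement noise variance."""
    x, p = 0.0, 1.0  # initial state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: random-walk model, so the state stays put and variance grows
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates

print(kalman_1d([1.1, 0.9, 1.05, 0.98]))
```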
Keras: An open-source neural-network library written in Python, often used for deep learning applications.
Kinematics: The study of motion without considering the forces that cause the motion.
Kinesthetic Teaching: Teaching by physical demonstration where the teacher guides the learner's movements.
L
Lagrangian Dynamics: A reformulation of classical mechanics that describes the evolution of a physical system in terms of kinetic and potential energies.
Lander Model: A model of a spacecraft for landing, not relevant to general robotics.
Language Model: A probability distribution over sequences of words, used in natural language processing.
Latent Space: A representation space in machine learning where data points are represented by vectors that capture the essential features of the data.
Leaky Integrate-and-Fire: A simplified model of spiking neuronal activity used in computational neuroscience and neuromorphic computing.
Legitimate Peripheral Participation: A concept from social learning theory, not directly relevant to robotics.
LiDAR (Light Detection and Ranging): A remote sensing method that uses light in the form of a pulsed laser to measure distances.
Linearization: The process of finding a linear approximation to a function at a given point, used in control systems.
Logit: The log-odds function, used in logistic regression and neural networks.
Loss Function: A function that maps a model's predictions and the true targets to a real number representing the "cost" of the error; training seeks to minimize this value.
M
Manipulability Ellipsoid: A geometric representation of a robot manipulator's dexterity at a particular configuration.
Manipulator: A robot arm designed to manipulate objects in the environment.
Map: A representation of the environment used by a robot for navigation and planning.
Mask R-CNN: A deep neural network architecture that extends Faster R-CNN with a mask prediction branch for object detection and instance segmentation.
Matrix Exponential: A matrix function that extends the scalar exponential function to matrices, used in robotics for rotation representations.
Max Pooling: A pooling operation in neural networks that takes the maximum value from a set of values in a defined window.
Mean Average Precision (mAP): A metric used to evaluate object detection models, computed as the mean of the per-class average precision.
Micro-Batching: Processing multiple samples together in small batches to improve computational efficiency.
Mimetic Faculty: A philosophical concept about imitation and learning, not directly relevant to robotics.
Minimax: A decision rule used in decision theory, game theory, statistics, and philosophy for minimizing the possible loss for a worst case scenario.
Mirror Neuron: A neuron that fires both when an animal acts and when it observes the same action performed by another, often cited in work on imitation learning and developmental robotics.
Model Predictive Control (MPC): An advanced control method that uses a model of the system to predict its future behavior and optimize control inputs over a receding horizon, re-solving the optimization at each time step.
Monte Carlo Method: A broad class of computational algorithms that rely on repeated random sampling to obtain numerical results.
Motion Primitives: Basic building blocks of movement that can be combined to create complex motions.
Motor Babbling: Random motor exploration used by infants to learn about their bodies, relevant in developmental robotics.
Multi-Agent System: A system composed of multiple interacting intelligent agents.
N
Naive Bayes: A family of simple probabilistic classifiers based on applying Bayes' theorem with strong independence assumptions between the features.
Navigation Mesh: A data structure and algorithm used in pathfinding to describe the walkable areas of an environment.
Neural Architecture Search (NAS): The process of automating the design of artificial neural networks.
Neuromorphic Computing: Computing architecture that mimics the neural structure of the human brain.
Newton's Method: An iterative method for finding successively better approximations to the roots of a real-valued function.
Non-Holonomic Constraint: A velocity constraint that cannot be integrated into a constraint on the configuration space alone, common in wheeled robots (e.g., a car cannot translate sideways).
Normalized Device Coordinate: A coordinate system used in computer graphics where the visible region is mapped to a standard range.
Nyquist Rate: The minimum sampling rate, equal to twice the highest frequency present in a signal, required to reconstruct the signal without aliasing.
O
Object Detection: A computer vision technique for identifying and locating objects within an image or video.
Odometry: The use of data from motion sensors to estimate change in position over time.
Omnidirectional Vision: A type of vision system that can see in all directions simultaneously.
Ontology: A formal representation of knowledge as a set of concepts within a domain and the relationships between those concepts.
OpenCV: An open-source computer vision and machine learning software library.
Optical Flow: The pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer and a scene.
Optimization: The selection of the best element from some set of available alternatives, often used in robotics for path planning and control.
Ordinal Optimization: A method for optimization that focuses on comparing and ranking alternatives rather than measuring exact values.
Orthogonal: Perpendicular or independent, used in mathematics and robotics to describe relationships between different dimensions or variables.
Oscillator: A device or system that produces periodic oscillations, used in robotics for generating rhythmic movements.
P
Path Planning: The computational problem of finding a path from a start point to a goal point while avoiding obstacles.
PCA (Principal Component Analysis): A statistical procedure that transforms a set of possibly correlated variables into a smaller set of uncorrelated principal components ordered by explained variance, widely used for dimensionality reduction.
PDDL (Planning Domain Definition Language): A formal language for describing planning domains in artificial intelligence.
Perception-Action Coupling: The tight integration between perception and action in embodied systems.
Perceptual Crossing: An experimental paradigm in embodied cognitive science in which two agents interact by mutually perceiving each other's perceptual activity.
Perspective Projection: The way three-dimensional objects appear to the eye on a two-dimensional surface.
Phase Portrait: A geometric representation of the trajectories of a dynamical system in the phase plane.
PID Controller: A control loop feedback mechanism widely used in industrial control systems.
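A minimal discrete-time sketch of the idea; the gains kp, ki, and kd here are placeholders, not tuned values for any real system.

```python
class PID:
    """Discrete-time PID controller: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement, dt):
        error = setpoint - measurement
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

controller = PID(kp=1.0, ki=0.1, kd=0.05)
print(controller.update(setpoint=1.0, measurement=0.8, dt=0.01))
```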
Point Cloud: A set of data points in space, typically representing the external surface of an object.
Polar Decomposition: A factorization of a matrix into a product of a unitary matrix and a positive-semidefinite Hermitian matrix.
Pose: The position and orientation of an object in space.
Prepared Mind: A concept from scientific discovery about being ready to recognize important findings, not directly relevant to robotics.
Primal-Dual: Methods in optimization that consider both primal and dual formulations of an optimization problem.
Proactive Control: Control strategies that anticipate and prepare for future events rather than just reacting to current events.
Probabilistic Robotics: The application of probabilistic methods to problems in robotics, particularly in dealing with uncertainty.
Projection Matrix: A matrix that projects points from a higher-dimensional space to a lower-dimensional space.
Proprioception: The sense of the relative position of one's own parts of the body and strength of effort being employed in movement.
Q
Q-Learning: A model-free reinforcement learning algorithm that learns the value (quality) of taking each action in each state, from which a control policy can be derived.
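The core update, sketched as a single tabular step; alpha and gamma are assumed example hyperparameters.

```python
def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q[state][action] toward the
    bootstrapped target reward + gamma * max_a Q[next_state][a]."""
    best_next = max(Q[next_state].values())
    target = reward + gamma * best_next
    Q[state][action] += alpha * (target - Q[state][action])

# Tiny example with two states and two actions
Q = {"s0": {"a0": 0.0, "a1": 0.0}, "s1": {"a0": 0.0, "a1": 0.0}}
q_update(Q, "s0", "a1", reward=1.0, next_state="s1")
print(Q)
```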
Quaternion: A number system that extends the complex numbers, commonly used in robotics for representing rotations.
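For reference, a unit quaternion encoding a rotation by angle θ about a unit axis n has the standard form:

```latex
q = \left(\cos\tfrac{\theta}{2},\; \mathbf{n}\sin\tfrac{\theta}{2}\right), \qquad \lVert q \rVert = 1
```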
R
RANSAC (Random Sample Consensus): An iterative method to estimate parameters of a mathematical model from a set of observed data that contains outliers.
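A compact sketch of the idea applied to 2D line fitting; the iteration count and inlier threshold are illustrative choices, not recommendations.

```python
import random

def ransac_line(points, iterations=100, threshold=0.1):
    """Fit a line y = m*x + b to points with outliers by repeatedly
    sampling two points and keeping the model with the most inliers."""
    best_model, best_inliers = None, []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = random.sample(points, 2)
        if x2 == x1:
            continue  # skip vertical sample pairs in this simple sketch
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        inliers = [(x, y) for x, y in points if abs(y - (m * x + b)) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (m, b), inliers
    return best_model, best_inliers

# Example: points on y = 2x + 1 plus one outlier
print(ransac_line([(x, 2 * x + 1) for x in range(10)] + [(3.0, 40.0)]))
```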
Reactive Control: Control strategies that respond directly to sensory input without planning or prediction.
Receptive Field: The region of space in which stimuli will alter the firing of a neuron, important in neural networks and vision systems.
ReLU (Rectified Linear Unit): An activation function defined as the positive part of its argument.
Reinforcement Learning: A type of machine learning where an agent learns to make decisions by performing actions and receiving rewards or penalties.
ResNet (Residual Network): A deep convolutional neural network architecture that uses skip connections to address the vanishing gradient problem.
RGB-D: Color (Red, Green, Blue) and Depth information combined in a single data stream.
Rigid Body Dynamics: The computation of the motion of systems of interconnected rigid bodies under the action of external forces.
Robotics Middleware: Software that provides standard services and capabilities for robotic applications, like ROS 2.
ROS (Robot Operating System): A flexible framework for writing robot software, providing operating system-like functionality on a cluster of heterogeneous computers.
Rotation Matrix: A matrix used to perform a rotation in Euclidean space.
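The familiar planar case, rotating a vector counterclockwise by angle θ; rotation matrices are orthogonal with unit determinant:

```latex
R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \qquad R^{\top} R = I, \quad \det R = 1
```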
S
Saliency Map: A topographic map of the visual field that represents the likelihood that a given location will attract attention.
SARSA (State-Action-Reward-State-Action): An on-policy reinforcement learning algorithm that updates its action-value estimates from the tuple of current state, action, reward, next state, and next action.
Saturated Controller: A controller whose output is limited (saturated) to respect actuator bounds, preventing excessive actuator commands.
Scene Graph: A collection of nodes in a graph and the connections between them, used in computer graphics and simulation.
Semantic Segmentation: The task of associating each pixel in an image with a class label.
Sensor Fusion: The process of combining information from multiple sensors to achieve better accuracy and robustness than possible with a single sensor.
SGD (Stochastic Gradient Descent): An iterative method for optimizing an objective function with suitable smoothness properties.
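The basic parameter update on a mini-batch, where η is the learning rate and L the loss:

```latex
\theta \leftarrow \theta - \eta \, \nabla_{\theta}\, \mathcal{L}\bigl(\theta;\, x_{\text{batch}}\bigr)
```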
Sigmoid Function: A mathematical function having an "S" shaped curve, commonly used as an activation function in neural networks.
SLAM (Simultaneous Localization and Mapping): The computational problem of constructing or updating a map of an unknown environment while simultaneously keeping track of an agent's location within it.
Social Cognition: The study of how people process, store, and apply information about other people and social situations.
Spatial Reasoning: The process of thinking about objects in three dimensions and drawing conclusions about those objects.
State Estimation: The process of estimating the state of a system from noisy or incomplete measurements.
Stochastic Process: A collection of random variables representing the evolution of some random value over time.
Support Vector Machine (SVM): A supervised machine learning model used for classification and regression analysis.
Surprise Minimization: A principle from predictive coding and active inference in which an agent acts to reduce prediction error, explored in some cognitive robotics work.
T
Tensor: A mathematical object that generalizes scalars, vectors, and matrices to higher dimensions, fundamental in deep learning.
TensorFlow: An open-source platform for machine learning developed by Google.
TF (Transforms): The ROS package (tf2 in ROS 2) that keeps track of multiple coordinate frames over time in a tree structure.
TF-IDF (Term Frequency-Inverse Document Frequency): A numerical statistic used in information retrieval and text mining.
Torque: The tendency of a force to cause or change rotational motion of a body.
Trajectory: The path that a moving object follows through space as a function of time.
Transformer Architecture: A deep learning model architecture that uses attention mechanisms to weigh the importance of input data.
Tricycle Kinematics: The study of motion for vehicles with three wheels, relevant in mobile robotics.
Trolley Problem: A thought experiment in ethics, not directly relevant to robotics.
Truth Discovery: The process of identifying trustworthy information from conflicting sources, not directly relevant to robotics.
U
UKF (Unscented Kalman Filter): A variant of the Kalman filter designed for nonlinear systems.
Uncertainty Quantification: The science of quantitative characterization and reduction of uncertainties in both computational and real-world applications.
Unity: A cross-platform game engine that can be used for robotics simulation and visualization.
URDF (Unified Robot Description Format): An XML format for representing a robot model in ROS.
V
Vanishing Gradient: A problem that occurs during the training of deep neural networks where gradients become increasingly small as they propagate backward.
VGGNet: A convolutional neural network architecture known for its simplicity and depth.
ViT (Vision Transformer): A transformer-based architecture for image recognition tasks.
VLA (Vision-Language-Action): A system that integrates visual perception, language understanding, and physical action.
VO (Visual Odometry): The process of incrementally estimating the pose of a vehicle using visual data.
VR (Virtual Reality): A simulated experience that can be similar to or completely different from the real world.
VSLAM (Visual Simultaneous Localization and Mapping): SLAM that uses visual sensors as the primary input.
W
Wheel Odometry: The use of sensors to estimate the change in position of a wheeled robot based on wheel rotation.
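A minimal sketch of differential-drive dead reckoning from wheel travel distances; the wheel-base value is an assumed example parameter.

```python
import math

def diff_drive_odometry(x, y, theta, d_left, d_right, wheel_base=0.5):
    """Update a planar pose (x, y, theta) from left/right wheel travel (meters)."""
    d_center = (d_left + d_right) / 2.0          # distance traveled by the midpoint
    d_theta = (d_right - d_left) / wheel_base    # change in heading
    x += d_center * math.cos(theta + d_theta / 2.0)
    y += d_center * math.sin(theta + d_theta / 2.0)
    theta += d_theta
    return x, y, theta

print(diff_drive_odometry(0.0, 0.0, 0.0, d_left=0.10, d_right=0.12))
```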
Whole-Body Control: A control approach that considers the entire robot body when computing control actions.
World Coordinate System: A fixed reference frame used to describe the position and orientation of objects in the environment.
X
X-bar Theory: A theory of syntactic category formation, not relevant to robotics.
Xeniality: The quality of being friendly to strangers, not relevant to robotics.
Y
Yaw: The rotation of an object around its vertical axis.
Z
ZMP (Zero Moment Point): The point on the ground where the net moment of the ground reaction forces has no horizontal component; keeping it inside the support polygon is a standard stability criterion in bipedal robotics.
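Under the common cart-table (linear inverted pendulum) simplification with constant CoM height z_c, the ZMP in one horizontal direction reduces to:

```latex
x_{\text{ZMP}} = x_{\text{CoM}} - \frac{z_c}{g}\,\ddot{x}_{\text{CoM}}
```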
Z-score: A statistical measurement that describes a value's relationship to the mean of a group of values.