Human Following Robot Using Arduino and Ultrasonic Sensor

Submitted by Gourav Tak

Working of Human Following Robot Using Arduino

In recent years, robotics has witnessed significant advancements, enabling the creation of intelligent machines that can interact with the environment. One exciting application of robotics is the development of human-following robots. These robots can track and follow a person autonomously, making them useful in various scenarios like assistance in crowded areas, navigation support, or even as companions. In this article, we will explore in detail how to build a human following robot using Arduino and three ultrasonic sensors, complete with circuit diagrams and working code. Also, check all the Arduino-based Robotics projects by following the link.

A human-following robot built with Arduino code and three ultrasonic sensors makes for an interesting project. What sets this build apart is the use of not just one, but three ultrasonic sensors. Most human-following robots you see are built with one ultrasonic sensor, two IR sensors, and a servo motor; the servo plays no real role in the operation and only adds unnecessary complication. So I removed the servo and the IR sensors and used three ultrasonic sensors instead. With ultrasonic sensors, you can measure distance and use that information to navigate toward and follow a human target. Here is a general outline of the steps involved in creating such a robot.

 

 

Components Needed for Human Following Robot Using Arduino

  • Arduino UNO board ×1

  • Ultrasonic sensor ×3

  • L298N motor driver ×1

  • Robot chassis

  • BO motors ×2

  • Wheels ×2

  • Li-ion battery 3.7V ×2

  • Battery holder ×1

  • Breadboard

  • Ultrasonic sensor holder ×3

  • Switch and jumper wires

Human Following Robot Using Arduino Circuit Diagram

Here is the schematic diagram of a Human-following robot circuit.

Arduino Human Following Robot Circuit Diagram

This design incorporates three ultrasonic sensors, allowing distance measurements in three directions: front, right, and left. These sensors are connected to the Arduino board through their respective digital pins. The circuit also includes two DC motors for movement, which are driven by an L298N motor driver module. The motor driver module is, in turn, connected to the Arduino board through its corresponding digital pins. To power the entire setup, two 3.7V Li-ion cells are used, connected to the motor driver module via a switch.

Overall, this circuit diagram showcases the essential components and connections necessary for the Human-following robot to operate effectively.

Arduino Human Following Robot Circuit

Circuit Connection:

Arduino and HC-SR04 Ultrasonic Sensor Module:

HC-SR04 Ultrasonic sensor Module

  • Connect the VCC pin of each ultrasonic sensor to the 5V pin on the Arduino board.

  • Connect the GND pin of each ultrasonic sensor to the GND pin on the Arduino board.

  • Connect the trigger pin (TRIG) of each ultrasonic sensor to separate digital pins (2, 4, and 6) on the Arduino board.

  • Connect the echo pin (ECHO) of each ultrasonic sensor to separate digital pins (3, 5, and 7) on the Arduino board.

Arduino and Motor Driver Module:

  • Connect the digital output pins of the Arduino (digital pins 8, 9, 10, and 11) to the appropriate input pins (IN1, IN2, IN3, and IN4) on the motor driver module.

  • Connect the ENA and ENB pins of the motor driver module to the adjacent onboard 5V (high) pins using the jumper caps (female headers), so both motors run at full speed.

  • Connect the OUT1, OUT2, OUT3, and OUT4 pins of the motor driver module to the appropriate terminals of the motors.

  • Connect the VCC (+5V) and GND pins of the motor driver module to the appropriate power (Vin) and ground (GND) connections on the Arduino.

Power Supply:

  • Connect the positive terminal of the power supply to the +12V input of the motor driver module.

  • Connect the negative terminal of the power supply to the GND pin of the motor driver module.

  • Connect the GND pin of the Arduino to the GND pin of the motor driver module.

Human Following Robot Using Arduino Code

Here is a simple 3 Ultrasonic sensor-based Human following robot using Arduino Uno code that you can use for your project.

Ultrasonic Sensors on Robot

This code reads the distances from three ultrasonic sensors (‘frontDistance’, ‘leftDistance’, and ‘rightDistance’). It then compares these distances to determine the sensor with the smallest distance. If the smallest distance is below the threshold, it moves the car accordingly using the appropriate motor control function (‘moveForward()’, ‘turnLeft()’, ‘turnRight()’). If none of the distances are below the threshold, it stops the motor using ‘stop()’.

In this section, we define the pin connections for the ultrasonic sensors and motor control. The S1Trig, S2Trig, and S3Trig variables represent the trigger pins of the three ultrasonic sensors, while S1Echo, S2Echo, and S3Echo represent their respective echo pins.

The LEFT_MOTOR_PIN1, LEFT_MOTOR_PIN2, RIGHT_MOTOR_PIN1, and RIGHT_MOTOR_PIN2 variables define the pins for controlling the motors.

The MAX_DISTANCE and MIN_DISTANCE_BACK variables set the thresholds for obstacle detection.

// Ultrasonic sensor pins
#define S1Trig 2
#define S2Trig 4
#define S3Trig 6
#define S1Echo 3
#define S2Echo 5
#define S3Echo 7
// Motor control pins
#define LEFT_MOTOR_PIN1 8
#define LEFT_MOTOR_PIN2 9
#define RIGHT_MOTOR_PIN1 10
#define RIGHT_MOTOR_PIN2 11
// Distance thresholds for obstacle detection
#define MAX_DISTANCE 40
#define MIN_DISTANCE_BACK 5

Make sure to adjust the values of ‘MIN_DISTANCE_BACK’ and ‘MAX_DISTANCE’ according to your specific requirements and the characteristics of your robot.

The suitable values for ‘MIN_DISTANCE_BACK’ and ‘MAX_DISTANCE’ depend on the specific requirements and characteristics of your human-following robot. You will need to consider factors such as the speed of your robot, the response time of the sensors, and the desired safety margin

Here are some general guidelines to help you choose suitable values.

‘MIN_DISTANCE_BACK’: This value represents the distance below which the robot moves backward because an obstacle or hand has come too close directly in front. It should be set so the robot can back away safely without colliding with the obstacle or hand. A typical value could be around 5-10 cm.

‘MAX_DISTANCE’: This value represents the maximum distance at which the robot considers a target to be present and keeps following it. It should be set to a distance that provides enough room for the robot to move without colliding with any obstacles or hands. If your hand or the obstacle moves out of this range, the robot will stop. A typical value could be around 30-50 cm.

These values are just suggestions, and you may need to adjust them based on the specific characteristics of your robot and the environment in which it operates.

These lines set the motor speed limits. ‘MAX_SPEED’ denotes the upper limit for motor speed, while ‘MIN_SPEED’ is a lower value used for a slight left bias. The speed values are typically within the range of 0 to 255 and can be adjusted to suit your specific requirements.

// Maximum and minimum motor speeds
#define MAX_SPEED 150
#define MIN_SPEED 75

The ‘setup()’ function is called once at the start of the program. In the setup() function, we set the motor control pins (LEFT_MOTOR_PIN1, LEFT_MOTOR_PIN2, RIGHT_MOTOR_PIN1, RIGHT_MOTOR_PIN2) as output pins using ‘pinMode()’ . We also set the trigger pins (S1Trig, S2Trig, S3Trig) of the ultrasonic sensors as output pins and the echo pins (S1Echo, S2Echo, S3Echo) as input pins. Lastly, we initialize the serial communication at a baud rate of 9600 for debugging purposes.

void setup() {
  // Set motor control pins as outputs
  pinMode(LEFT_MOTOR_PIN1, OUTPUT);
  pinMode(LEFT_MOTOR_PIN2, OUTPUT);
  pinMode(RIGHT_MOTOR_PIN1, OUTPUT);
  pinMode(RIGHT_MOTOR_PIN2, OUTPUT);
  //Set the Trig pins as output pins
  pinMode(S1Trig, OUTPUT);
  pinMode(S2Trig, OUTPUT);
  pinMode(S3Trig, OUTPUT);
  //Set the Echo pins as input pins
  pinMode(S1Echo, INPUT);
  pinMode(S2Echo, INPUT);
  pinMode(S3Echo, INPUT);
  // Initialize the serial communication for debugging
  Serial.begin(9600);
}

This block of code consists of three functions (‘sensorOne()’, ‘sensorTwo()’, ‘sensorThree()’) responsible for measuring the distance using ultrasonic sensors.

The ‘sensorOne()’ function measures the distance using the first ultrasonic sensor. Note that the conversion of the pulse duration to distance assumes the speed of sound is approximately 343 meters per second, i.e. sound travels roughly 1 cm every 29 microseconds. Dividing the pulse duration by 29 and then halving the result (to account for the round trip to the target and back) gives an approximate distance in centimeters; for example, an echo pulse of 580 µs corresponds to 580 / 29 / 2 = 10 cm.

The ‘sensorTwo()’ and ‘sensorThree()’ functions work similarly, but for the second and third ultrasonic sensors, respectively.

// Function to measure the distance using an ultrasonic sensor
int sensorOne() {
  //pulse output
  digitalWrite(S1Trig, LOW);
  delayMicroseconds(2);
  digitalWrite(S1Trig, HIGH);
  delayMicroseconds(10);
  digitalWrite(S1Trig, LOW);
  long t = pulseIn(S1Echo, HIGH);//Get the pulse
  int cm = t / 29 / 2; //Convert time to the distance
  return cm; // Return the values from the sensor
}
//Get the sensor values
int sensorTwo() {
  //pulse output
  digitalWrite(S2Trig, LOW);
  delayMicroseconds(2);
  digitalWrite(S2Trig, HIGH);
  delayMicroseconds(10);
  digitalWrite(S2Trig, LOW);
  long t = pulseIn(S2Echo, HIGH);//Get the pulse
  int cm = t / 29 / 2; //Convert time to the distance
  return cm; // Return the values from the sensor
}
//Get the sensor values
int sensorThree() {
  //pulse output
  digitalWrite(S3Trig, LOW);
  delayMicroseconds(2);
  digitalWrite(S3Trig, HIGH);
  delayMicroseconds(10);
  digitalWrite(S3Trig, LOW);
  long t = pulseIn(S3Echo, HIGH);//Get the pulse
  int cm = t / 29 / 2; //Convert time to the distance
  return cm; // Return the values from the sensor
}
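
Since ‘sensorOne()’, ‘sensorTwo()’, and ‘sensorThree()’ differ only in the pins they use, they can optionally be collapsed into a single parameterised helper. The sketch below is not part of the original code, but it shortens the program and adds a pulseIn() timeout so a missing echo cannot stall the loop.

// Optional refactor (not in the original sketch): one helper for any HC-SR04 sensor
int readDistance(int trigPin, int echoPin) {
  digitalWrite(trigPin, LOW);
  delayMicroseconds(2);
  digitalWrite(trigPin, HIGH);
  delayMicroseconds(10);
  digitalWrite(trigPin, LOW);
  long t = pulseIn(echoPin, HIGH, 30000UL); // give up after 30 ms if no echo returns
  if (t == 0) return MAX_DISTANCE + 1;      // treat a timeout as "nothing in range"
  return t / 29 / 2;                        // convert microseconds to centimeters
}

With this helper, the readings in ‘loop()’ would simply become readDistance(S1Trig, S1Echo), readDistance(S2Trig, S2Echo), and readDistance(S3Trig, S3Echo).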

In this section, the ‘loop()’ function begins by calling the ‘sensorOne()’, ‘sensorTwo()’, and ‘sensorThree()’ functions to measure the distances from the ultrasonic sensors. The distances are then stored in the variables ‘frontDistance’, ‘leftDistance’, and ‘rightDistance’.

Next, the code utilizes the ‘Serial’ object to print the distance values to the serial monitor for debugging and monitoring purposes.

void loop() {
  int frontDistance = sensorOne();
  int leftDistance = sensorTwo();
  int rightDistance = sensorThree();
  Serial.print("Front: ");
  Serial.print(frontDistance);
  Serial.print(" cm, Left: ");
  Serial.print(leftDistance);
  Serial.print(" cm, Right: ");
  Serial.print(rightDistance);
  Serial.println(" cm");

In this section of code, the condition checks whether the front distance is less than the threshold value ‘MIN_DISTANCE_BACK’, which indicates that something is very close in front. If this condition is true, the robot should move backward to avoid a collision, so the ‘moveBackward()’ function is called.

  if (frontDistance < MIN_DISTANCE_BACK) {
    moveBackward();
    Serial.println("backward");

If the previous condition is false, the next condition is checked: whether the front distance is less than the left distance, less than the right distance, and less than the ‘MAX_DISTANCE’ threshold. If this condition is true, the front distance is the smallest of the three and below the maximum distance threshold, so the ‘moveForward()’ function is called to make the robot move forward.

  } else if (frontDistance < leftDistance && frontDistance < rightDistance && frontDistance < MAX_DISTANCE) {
    moveForward();
    Serial.println("forward");

If the previous condition is false, this condition is checked. It verifies whether the left distance is less than the right distance and less than the ‘MAX_DISTANCE’ threshold. This indicates that the left distance is the smallest among the three distances and below the maximum distance threshold, so the ‘turnLeft()’ function is called to make the robot turn left.

  } else if (leftDistance < rightDistance && leftDistance < MAX_DISTANCE) {
    turnLeft();
    Serial.println("left");

If neither of the previous conditions is met, this condition is checked. It verifies that the right distance is less than the ‘MAX_DISTANCE’ threshold, which means the right distance is the smallest among the three distances and below the maximum distance threshold. The ‘turnRight()’ function is called to make the robot turn right.

  } else if (rightDistance < MAX_DISTANCE) {
    turnRight();
    Serial.println("right");

If none of the previous conditions are true, it means that none of the distances satisfy the conditions for movement. Therefore, the ‘stop()’ function is called to stop the car.

  } else {
    stop();
    Serial.println("stop");
  }
}

In summary, the code reads the distances from the three ultrasonic sensors and steers the robot toward whichever sensor reports the smallest distance, as long as that distance is within the allowed range.
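
The motor control functions called above (‘moveForward()’, ‘moveBackward()’, ‘turnLeft()’, ‘turnRight()’, and ‘stop()’) are not shown in the fragments; a minimal sketch of how they can be written for the L298N wiring described earlier is given below. Since ENA and ENB are jumpered high in this circuit, only the direction pins are switched and the motors run at full speed; the implementations in the GitHub repository may differ, for example by using analogWrite() with MAX_SPEED and MIN_SPEED.

// Minimal motor helpers for the L298N wiring used here (direction control only)
void moveForward() {
  digitalWrite(LEFT_MOTOR_PIN1, HIGH);
  digitalWrite(LEFT_MOTOR_PIN2, LOW);
  digitalWrite(RIGHT_MOTOR_PIN1, HIGH);
  digitalWrite(RIGHT_MOTOR_PIN2, LOW);
}
void moveBackward() {
  digitalWrite(LEFT_MOTOR_PIN1, LOW);
  digitalWrite(LEFT_MOTOR_PIN2, HIGH);
  digitalWrite(RIGHT_MOTOR_PIN1, LOW);
  digitalWrite(RIGHT_MOTOR_PIN2, HIGH);
}
void turnLeft() {
  // Left wheel reverses while the right wheel drives forward, pivoting the robot left
  digitalWrite(LEFT_MOTOR_PIN1, LOW);
  digitalWrite(LEFT_MOTOR_PIN2, HIGH);
  digitalWrite(RIGHT_MOTOR_PIN1, HIGH);
  digitalWrite(RIGHT_MOTOR_PIN2, LOW);
}
void turnRight() {
  // Right wheel reverses while the left wheel drives forward, pivoting the robot right
  digitalWrite(LEFT_MOTOR_PIN1, HIGH);
  digitalWrite(LEFT_MOTOR_PIN2, LOW);
  digitalWrite(RIGHT_MOTOR_PIN1, LOW);
  digitalWrite(RIGHT_MOTOR_PIN2, HIGH);
}
void stop() {
  digitalWrite(LEFT_MOTOR_PIN1, LOW);
  digitalWrite(LEFT_MOTOR_PIN2, LOW);
  digitalWrite(RIGHT_MOTOR_PIN1, LOW);
  digitalWrite(RIGHT_MOTOR_PIN2, LOW);
}

If your robot turns the wrong way or drives backwards when it should move forward, swap the HIGH/LOW pair for the affected motor or swap its wires on the L298N output terminals.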

 

Important aspects of this Arduino-powered human-following robot project include:

  • Three-sensor setup for detecting a person in front, to the left, and to the right
  • Distance measurement and decision-making in real-time
  • Navigation that operates automatically without human assistance
  • Avoiding collisions and maintaining a safe following distance

 

 

Technical Summary and GitHub Repository 

Using three HC-SR04 ultrasonic sensors and an L298N motor driver for directional control, this Arduino project demonstrates autonomous human tracking. The full source code, circuit schematics, and assembly guidelines are available in our GitHub repository for easy replication and modification. To download the Arduino code, view the complete wiring schematics, and participate in the open-source robotics community, visit our GitHub page.


 

Frequently Asked Questions

⇥ How does an Arduino-powered human-following robot operate?
Three ultrasonic sensors are used by the Arduino-powered human following robot to determine a person's distance and presence. After processing this data, the Arduino manages motors to follow the identified individual while keeping a safe distance.

⇥ Which motor driver is ideal for an Arduino human-following robot?
The most widely used motor driver for Arduino human-following robots is the L298N. Additionally, some builders use the L293D motor driver shield, which connects to the Arduino Uno directly. Both can supply enough current for small robot applications and manage 2-4 DC motors.

⇥ Is it possible to create a human-following robot without soldering?
Yes, you can use motor driver shields that connect straight to an Arduino, breadboards, and jumper wires to construct a human-following robot. For novices and prototyping, this method is ideal.

⇥ What uses do human-following robots have in the real world?
Shopping cart robots in malls, luggage-carrying robots in airports, security patrol robots, elderly care assistance robots, educational demonstration robots, and companion robots that behave like pets are a few examples of applications.

 

Conclusion

This human following robot using Arduino project and three ultrasonic sensors is an exciting and rewarding project that combines programming, electronics, and mechanics. With Arduino’s versatility and the availability of affordable components, creating your own human-following robot is within reach.

Human-following robots have a wide range of applications in fields such as retail stores, malls, and hotels, where they can provide personalized assistance to customers. They can also be employed in security and surveillance systems to track and monitor individuals in public spaces, as well as in entertainment and events, elderly care, guided tours, research and development, education, and personal robotics.

These are just a few examples of the applications of human-following robots. As technology advances and robotics continues to evolve, we can expect even more diverse and innovative applications in the future.

Explore Practical Projects Similar To Robots Using Arduino

Explore a range of hands-on robotics projects powered by Arduino, from line-following bots to obstacle-avoiding vehicles. These practical builds help you understand sensor integration, motor control, and real-world automation techniques. Ideal for beginners and hobbyists, these projects bring theory to life through interactive learning.

Simple Light Following Robot using Arduino UNO

Today, we are building a simple Arduino-based project: a light-following robot. This project is perfect for beginners, and we'll use LDR sensor modules to detect light and an MX1508 motor driver module for control. By building this simple light following robot you will learn the basics of robotics and how to use a microcontroller like Arduino to read sensor data and control motors.

Line Follower Robot using Arduino UNO: How to Build (Step-by-Step Guide)

This step-by-step guide will show you how to build a professional-grade line follower robot using Arduino UNO, with complete code explanations and troubleshooting tips. Perfect for beginners and intermediate makers alike, this project combines hardware interfacing, sensor calibration, and motor control fundamentals.


Voice Activated LED Controller with Touch Interface Using ESP32S3 Box 3

Voice control has become an integral part of modern smart home automation. In this tutorial, we build a voice-controlled LED system using the ESP32-S3-BOX-3 development board, combining wake word detection, speech recognition, touch interface, and audio feedback to create an intelligent control system. The code is based on the factory example provided by Espressif, modified as needed to suit our project.

The ESP32-S3-BOX-3 is a powerful development platform from Espressif that integrates a 320×240 touchscreen display, dual microphones for voice input, stereo speakers, and WiFi/Bluetooth connectivity. This project demonstrates how to leverage these features using the ESP-IDF (Espressif IoT Development Framework) and ESP-SR (Speech Recognition) library.

Voice Activated LED Controller

For a detailed hands-on review and getting-started walkthrough of the ESP32-S3-BOX-3 board, check out our previous articles on the same board:
Getting Started with ESP32-S3-BOX-3 - CircuitDigest Review
Programming ESP32-S3-BOX-3 with Arduino IDE - RGB LED Control

What You'll Learn

  • Implementing wake word detection using WakeNet
  • Building command recognition with MultiNet
  • Creating a touch-based GUI using the LVGL library
  • Playing audio feedback through the I2S interface
  • Controlling hardware (LED) through GPIO

Components Required

  • ESP32-S3-BOX-3 Development Board ×1
  • RGB LED Module ×1
  • Jumper Wires (as needed)
  • USB-C Cable (for programming and power) ×1

Software Requirements

  • ESP-IDF v5.5.2 - Espressif IoT Development Framework
  • Python 3.12+ - Required for ESP-IDF tools

Circuit Diagram and Connections

The circuit connection is straightforward. We connect an external LED to GPIO 40 of the ESP32-S3-BOX-3 board through a current-limiting resistor. For ease of demonstration, we have used the RGB LED module that came with the ESP32-S3-BOX-3, together with the DOCK accessory. Insert the ESP32-S3-BOX-3 into the dock, connect the GND pin of the RGB module to any of the ground points on the dock, and connect any one of the anode pins to the G40 port on the dock. If you are using a single external LED instead, connect its cathode to ground and its anode to G40 through a current-limiting resistor. The image below shows the connection.

Voice Activated LED Controller with Touch Interface Using ESP32S3 Box 3 Circuit Diagram

Here is the ESP32S3-Box-3 with the LED attached.

 

ESP32 S3 Box Hardware Setup

Project Setup Beginner's Guide

ESP-IDF Installation

This project requires ESP-IDF v5.5.2. For full installation and configuration instructions, refer to the official Espressif Getting Started Guide:
ESP-IDF Getting Started Guide (Official)
Then get the project files from our repo, either with git clone or by manually downloading and extracting them to your preferred location.

git clone https://github.com/Circuit-Digest/Voice-Activated-LED-Controller-with-Touch-Interface-Using-ESP32S3-Box-3

Project Configuration

1. Set up the ESP-IDF environment: Once you have installed and set up ESP-IDF following Espressif's guide, open a terminal on Mac or Linux and run the following command to set up the ESP-IDF environment. Do not close this terminal once done; any subsequent idf.py command must be executed from the same terminal or command prompt. If you close the terminal, or when you reopen the project later, run this command first to set up the environment again. This has to be done in each new session.

. $HOME/esp/esp-idf/export.sh

On Windows PCs, you can directly run the ESP-IDF command prompt shortcut in the Start menu, created by the ESP-IDF installer.

2. Navigate to the project directory. The path you provide must be to the root folder of your project directory.

cd /path_to_your_project_directory

3. Configure the project: The menuconfig option is used to change or reconfigure the project parameters. This step is completely optional, since all required properties are already configured, but if needed you can use the following command to access the menuconfig options.

idf.py menuconfig

4. Build the project: Use the following command to build the project. When it is executed, the IDF will copy any required managed components into the project folder and build the project. If an error occurs that is not related to the code itself, it is highly recommended to do a full clean and then rebuild.

idf.py build

5. Flash and monitor: The following command flashes the code to the ESP32-S3-BOX-3 and monitors the serial log. Make sure to connect the board to the computer before running the command. If the board is not detected even after connecting it, press and hold the BOOT button, press the RESET button, and then release the BOOT button and try uploading again. When uploading with this method, remember to reset the board manually once the code has been flashed.

idf.py flash monitor

Project Structure Overview

For your reference, this is the file structure of our project. The main folder contains all the source code, the components folder contains unmanaged component libraries, and the spiffs folder contains all the image and audio files.

Voice Activated LED Controller with Touch Interface Using ESP32S3 Box 3 File Structure

How Wake Word Detection Works

Wake word detection uses ESP-SR WakeNet, a low-power neural network engine that runs continuously in the background. The Audio Front-End (AFE) preprocesses audio from the microphone array (sample rate: 16 kHz, 16-bit signed, 2 channels / stereo). The WakeNet engine then performs CNN-based wake word detection, continuously monitoring the audio stream with low power consumption, and supports up to 5 wake words simultaneously. The wake word detection flow is given below.

Microphone   ->  I2S   ->  AFE   ->  WakeNet   ->  Wake Detection Event
Wake Word Detection Flow

Detection Events

  • WAKENET_DETECTED - Wake word detected; start listening for commands.
  • WAKENET_CHANNEL_VERIFIED - Channel verified; ready for command recognition.

The following key functions are used for wake word detection and are called from main/app/app_sr.c.

  • audio_feed_task() - Reads audio from I2S and feeds it to AFE
  • audio_detect_task() - Processes AFE output and detects wake words
  • app_sr_start() - Initialises AFE and WakeNet models
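
For reference, a much simplified sketch of what the detection loop inside audio_detect_task() can look like is shown below. It assumes the generic ESP-SR AFE interface, where fetch() returns an afe_fetch_result_t whose wakeup_state field carries the events listed above; the actual task in app_sr.c is more elaborate and also hands the audio over to MultiNet for command recognition.

#include "esp_afe_sr_iface.h"
#include "esp_afe_sr_models.h"
static const esp_afe_sr_iface_t *afe_handle = &ESP_AFE_SR_HANDLE;
// Simplified detection-loop sketch; error handling and MultiNet processing are omitted
static void detect_task(void *arg)
{
    esp_afe_sr_data_t *afe_data = (esp_afe_sr_data_t *)arg;    // AFE instance created during app_sr_start()
    while (true) {
        afe_fetch_result_t *res = afe_handle->fetch(afe_data); // blocks until processed audio is available
        if (res == NULL) {
            continue;
        }
        if (res->wakeup_state == WAKENET_DETECTED) {
            // Wake word heard: play the feedback tone and start listening for commands
        } else if (res->wakeup_state == WAKENET_CHANNEL_VERIFIED) {
            // Channel verified: pass res->data to MultiNet for command recognition
        }
    }
}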

Available Wake Words

The project supports multiple pre-trained wake words. Configure them via idf.py menuconfig.
Navigation: idf.py menuconfig  -> ESP Speech Recognition  -> Load Multiple Wake Words

  • Hi ESP (English): CONFIG_SR_WN_WN9_HIESP_MULTI=y
  • Hi Lexin (Chinese): CONFIG_SR_WN_WN9_HILEXIN_MULTI=y
  • Alexa (English): CONFIG_SR_WN_WN9_ALEXA_MULTI=y
  • Xiao Ai Tong Xue (Chinese): CONFIG_SR_WN_WN9_XIAOAITONGXUE_MULTI=y
  • Ni Hao Xiao Zhi (Chinese): CONFIG_SR_WN_WN9_NIHAOXIAOZHI_MULTI=y

How to Change Wake Words

Method 1 - Using menuconfig
1. Run idf.py menuconfig
2. Navigate to: ESP Speech Recognition -> Load Multiple Wake Words
3. Enable or disable the desired wake words.
4. Save and rebuild: idf.py build flash

Method 2 - Modify Code

Wake word selection happens in app_sr.c:

// In app_sr_set_language() function (line ~235)
char *wn_name = esp_srmodel_filter(models, ESP_WN_PREFIX,
   (SR_LANG_EN == g_sr_data->lang ? "hiesp" : "hilexin"));

To switch the English wake word to "Alexa":

char *wn_name = esp_srmodel_filter(models, ESP_WN_PREFIX,
   (SR_LANG_EN == g_sr_data->lang ? "alexa" : "hilexin"));

Using Custom Wake Words

Requirements: A custom wake word model trained with ESP-SR tools, in ESP-SR compatible format, with sufficient model partition space.
1. Train a custom wake word using ESP-SR training tools (see ESP-SR documentation).

2. Place the generated model file (.bin) in spiffs/ or the model partition.

3. Enable the custom word in menuconfig: for example, ESP Speech Recognition -> CONFIG_SR_WN_WN9_CUSTOMWORD

4. Update code in app_sr.c:

char *wn_name = esp_srmodel_filter(models, ESP_WN_PREFIX, "customword");

5. Rebuild and flash: idf.py build flash

How Speech Recognition Works

Speech recognition uses ESP-SR MultiNet, an offline command recognition engine that supports up to 200 commands without requiring cloud connectivity. Both English and Chinese are supported in the ESP-SR engine.

Wake Word Detected   ->  AFE Processing   ->  MultiNet   ->  Command ID   ->  Handler Action
Command Recognition Flow

Recognition States

  • ESP_MN_STATE_DETECTING  - Listening for a command
  • ESP_MN_STATE_DETECTED   - Command recognised
  • ESP_MN_STATE_TIMEOUT    - No command detected within timeout

Key Components

  • Command Definition (app_sr.c) - defines the text and phoneme for each command
  • Command Structure (app_sr.h) - struct holding cmd ID, language, text, and phoneme
  • Recognition Process (audio_detect_task) - AFE processes audio, MultiNet analyses chunks, returns command ID via queue to handler
// Command definition array  (app_sr.c)
static const sr_cmd_t g_default_cmd_info[] = {
   {SR_CMD_LIGHT_ON,  SR_LANG_EN, 0, "turn on light",  "TkN nN LiT", {NULL}},
   {SR_CMD_LIGHT_OFF, SR_LANG_EN, 0, "turn off light", "TkN eF LiT", {NULL}},
};

How to Modify Commands

⇒ Step 1 - Add Command Enum (app_sr.h)

typedef enum {
   SR_CMD_LIGHT_ON,
   SR_CMD_LIGHT_OFF,
   SR_CMD_MY_NEW_CMD,    //  Add your command enum
   SR_CMD_MAX,
} sr_user_cmd_t;

⇒ Step 2 - Add Command Definition (app_sr.c)

static const sr_cmd_t g_default_cmd_info[] = {
   {SR_CMD_LIGHT_ON,     SR_LANG_EN, 0, "turn on light",  "TkN nN LiT", {NULL}},
   {SR_CMD_LIGHT_OFF,    SR_LANG_EN, 0, "turn off light", "TkN eF LiT", {NULL}},
   {SR_CMD_MY_NEW_CMD,   SR_LANG_EN, 2, "my new command", "mI nU kMnd", {NULL}},  //   Add
};

⇒ Step 3 - Add Handler Action (app_sr_handler.c)

case SR_CMD_MY_NEW_CMD:        //   Add your handler
   ESP_LOGI(TAG, "My new command executed!");
   // Your action here
   break;

⇒ Step 4 - Rebuild and Flash

idf.py build flash monitor

Adding Multiple Commands

// app_sr.h - enum
SR_CMD_FAN_ON,
SR_CMD_FAN_OFF,
SR_CMD_SET_BRIGHTNESS_HIGH,
SR_CMD_SET_BRIGHTNESS_LOW,
// app_sr.c - command definitions
{SR_CMD_FAN_ON,                  SR_LANG_EN, 2, "turn on fan",      "TkN nN fN",    {NULL}},
{SR_CMD_FAN_OFF,                 SR_LANG_EN, 3, "turn off fan",     "TkN eF fN",    {NULL}},
{SR_CMD_SET_BRIGHTNESS_HIGH,     SR_LANG_EN, 4, "brightness high",  "brItns hI",    {NULL}},
{SR_CMD_SET_BRIGHTNESS_LOW,      SR_LANG_EN, 5, "brightness low",   "brItns lO",    {NULL}},

Dynamic Command Addition (Runtime)

sr_cmd_t new_cmd = {
   .cmd     = SR_CMD_MY_NEW_CMD,
   .lang    = SR_LANG_EN,
   .id      = 10,
   .str     = "my command",
   .phoneme = "mI kMnd"
};
app_sr_add_cmd(&new_cmd);
app_sr_update_cmds();   // Update MultiNet command list

API Functions (app_sr.h)

  • app_sr_add_cmd()    - Add a new command
  • app_sr_modify_cmd() - Modify an existing command
  • app_sr_remove_cmd() - Remove a command
  • app_sr_remove_all_cmd() - Clear all commands
  • app_sr_update_cmds()    - Update MultiNet with the current command list

How Display and Touch Work

The project uses LVGL (Light and Versatile Graphics Library) for GUI rendering and touch input.

  • Display Driver - ILI9341 LCD controller (320×240), SPI interface, RGB565 colour format, hardware-accelerated rendering.
  • Touch Driver - GT911 capacitive touch controller via I2C, with multi-touch support (single touch used in this project).
  • LVGL Integration - LVGL runs in a dedicated task with double buffering for smooth rendering. Touch events are handled via the LVGL input driver.

Initialisation (main.c)

bsp_display_cfg_t cfg = {
   .lvgl_port_cfg  = ESP_LVGL_PORT_INIT_CONFIG(),
   .buffer_size    = BSP_LCD_H_RES * CONFIG_BSP_LCD_DRAW_BUF_HEIGHT,
   .double_buffer  = 0,
   .flags          = { .buff_dma = true }
};
bsp_display_start_with_config(&cfg);
bsp_board_init();

Creating GUI Elements

#include "lvgl.h"
#include "bsp/esp-bsp.h"
bsp_display_lock(0);            // Lock for thread safety
lv_obj_t *scr  = lv_scr_act();  // Get current screen
// Create a button
lv_obj_t *btn  = lv_btn_create(scr);
lv_obj_set_size(btn, 100, 50);
lv_obj_align(btn, LV_ALIGN_CENTER, 0, 0);
// Add label
lv_obj_t *label = lv_label_create(btn);
lv_label_set_text(label, "Click Me");
// Add click callback
lv_obj_add_event_cb(btn, on_button_click, LV_EVENT_CLICKED, NULL);
bsp_display_unlock();

Touch Event Handling

static void on_touch_event(lv_event_t *e)
{
   lv_event_code_t code = lv_event_get_code(e);
   lv_obj_t       *obj  = lv_event_get_target(e);
   switch (code) {
   case LV_EVENT_PRESSED:
       lv_obj_set_style_bg_color(obj, lv_color_hex(0x0000FF), 0);
       break;
   case LV_EVENT_RELEASED:
       lv_obj_set_style_bg_color(obj, lv_color_hex(0x00FF00), 0);
       break;
   case LV_EVENT_CLICKED:
       light_ctrl_toggle();   // Perform action
       break;
   default: break;
   }
}

Supported Event Types

  • LV_EVENT_CLICKED      - Touch released after press
  • LV_EVENT_PRESSED      - Touch pressed
  • LV_EVENT_RELEASED     - Touch released
  • LV_EVENT_LONG_PRESSED - Long press detected

Using Images in the GUI

The project stores the BMP images as C arrays (generated with the image_to_c tool by bitbank2) and converts them to LVGL-compatible RGB565 format at runtime using bmp_to_lv_img() in light_ui.c. If you want, you can also use the LVGL image converter tool to convert the images to C arrays. Another option is to store the image files in the file system and load them from there:

lv_img_set_src(img_obj, "/spiffs/image.bin");
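
If you go the C-array route instead, a minimal example could look like the snippet below. The descriptor name my_icon is hypothetical; it stands for whatever the LVGL image converter generates for your image.

#include "lvgl.h"
#include "bsp/esp-bsp.h"
LV_IMG_DECLARE(my_icon);   // descriptor generated by the LVGL image converter (hypothetical name)
void show_icon(void)
{
    bsp_display_lock(0);                           // lock LVGL for thread safety, as in the button example above
    lv_obj_t *img = lv_img_create(lv_scr_act());   // create an image widget on the active screen
    lv_img_set_src(img, &my_icon);                 // point it at the converted C-array image
    lv_obj_align(img, LV_ALIGN_CENTER, 0, 0);
    bsp_display_unlock();
}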

Creating Custom GUI Screens

Here is an example code snippet showing how to create a new screen for the GUI. Each screen is created as a top-level object with lv_obj_create(NULL), and lv_scr_load() is used to switch between screens.

// Screen 1: Main
lv_obj_t *main_screen = lv_obj_create(NULL);
// ... add widgets ...
// Screen 2: Settings
lv_obj_t *settings_screen = lv_obj_create(NULL);
// ... add widgets ...
// Navigate
void goto_settings(lv_event_t *e) { lv_scr_load(settings_screen); }
void goto_main(lv_event_t *e)     { lv_scr_load(main_screen);     }

Warning: Each RGB565 pixel = 2 bytes. A 320×240 screen buffer = ~150 KB. Double buffering doubles that. Consider using PSRAM for large buffers.
For more details on how to use the LVGL library, please check out the official LVGL documentation.

How Audio Output Works

Audio output uses the I2S interface with an ES8311 codec chip for digital-to-analog conversion. The I2S driver handles the audio data transfer (16 kHz default sample rate for SR feedback, 16-bit, 2-channel stereo), while the ES8311 codec takes the I2S stream, drives the speaker, and provides volume and mute control.

Audio Playback Flow

WAV File   ->  Memory Buffer   ->  I2S Write   ->  Codec   ->  Speaker

Key Functions (app_sr_handler.c)

  • sr_echo_init()  - Loads WAV files from SPIFFS to memory
  • sr_echo_play()  - Plays an audio segment via I2S
  • bsp_i2s_write() - Writes audio data to I2S (BSP function)

Audio Playback Implementation

typedef enum {
   AUDIO_WAKE,   // Wake word detected tone
   AUDIO_OK,     // Command recognised tone
   AUDIO_END,    // Timeout / end tone
   AUDIO_MAX,
} audio_segment_t;
// Load WAV from SPIFFS  -> PSRAM
static esp_err_t load_wav_to_mem(audio_segment_t seg, const char *path)
{
   FILE *fp = fopen(path, "rb");
   if (!fp) return ESP_ERR_NOT_FOUND;
   fseek(fp, 0, SEEK_END);
   long sz = ftell(fp);
   fseek(fp, 0, SEEK_SET);
   s_audio[seg].buf = heap_caps_malloc(sz, MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT);
   s_audio[seg].len = (size_t)sz;
   fread(s_audio[seg].buf, 1, sz, fp);
   fclose(fp);
   return ESP_OK;
}

Adding More Audio Playbacks

⇒ Step 1 - Add Audio Segment Enum

typedef enum {
   AUDIO_WAKE,
   AUDIO_OK,
   AUDIO_END,
   AUDIO_CUSTOM_1,   //   Add your segment
   AUDIO_CUSTOM_2,
   AUDIO_MAX,
} audio_segment_t;

⇒ Step 2 - Add WAV File to SPIFFS
Place your WAV file in the spiffs/ directory. WAV requirements: uncompressed PCM, 16 kHz recommended, 16-bit, mono or stereo.

spiffs/
├── echo_en_wake.wav
├── echo_en_ok.wav
├── echo_en_end.wav
├── custom_sound_1.wav     Add here
└── custom_sound_2.wav

⇒ Step 3 - Load in Initialisation

ESP_RETURN_ON_ERROR(
   load_wav_to_mem(AUDIO_CUSTOM_1, "/spiffs/custom_sound_1.wav"),
   TAG, "load custom1 wav failed");

⇒ Step 4 - Play When Needed

sr_echo_play(AUDIO_CUSTOM_1);

⇒ Step 5 - Rebuild

idf.py build flash

The SPIFFS partition is automatically rebuilt with files from the spiffs/ directory.

Audio Format Requirements

  • Sample rates: 8, 16, 22.05, 44.1, 48 kHz
  • Bit depth: 16-bit (recommended)
  • Channels: Mono or Stereo
  • Format: Uncompressed PCM WAV

Converting Audio with FFmpeg

# Convert to 16 kHz, 16-bit, mono WAV
ffmpeg -i input.mp3 -ar 16000 -acodec pcm_s16le -ac 1 output.wav
# Convert to 16 kHz, 16-bit, stereo WAV
ffmpeg -i input.mp3 -ar 16000 -acodec pcm_s16le -ac 2 output.wav

BSP Audio API Reference

The following BSP functions (bsp_board.h) control the audio codec:

  • bsp_codec_set_fs() - Set codec sample rate, bit depth, and channel mode
  • bsp_codec_volume_set() - Set volume level (0-100)
  • bsp_codec_mute_set() - Mute or unmute the audio codec
  • bsp_i2s_write() - Write audio data buffer to I2S output
  • bsp_codec_dev_stop() - Stop the codec device
  • bsp_codec_dev_resume() - Resume the codec device

Memory Considerations

  • 16 kHz, 16-bit, mono   ->  ~32 KB per second
  • 16 kHz, 16-bit, stereo  -> ~64 KB per second
  • 44.1 kHz, 16-bit, stereo  -> ~176 KB per second
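
These figures follow directly from sample rate × bytes per sample × channels: for example, 16,000 samples/s × 2 bytes × 1 channel = 32,000 bytes, or roughly 32 KB per second of mono audio.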

Recommendations

  • Use PSRAM for audio buffers (MALLOC_CAP_SPIRAM)
  • Pre-load frequently used sounds into memory
  • Stream long audio files from SPIFFS in 4 KB chunks

Streaming Long Audio

void play_long_audio_stream(const char *wav_path)
{
   FILE *fp = fopen(wav_path, "rb");
   if (!fp) return;
   fseek(fp, 44, SEEK_SET);     // Skip WAV header
   uint8_t chunk[4096];
   size_t  bytes_read;
   while ((bytes_read = fread(chunk, 1, sizeof(chunk), fp)) > 0) {
       size_t bytes_written = 0;
       bsp_i2s_write((char *)chunk, bytes_read, &bytes_written, portMAX_DELAY);
   }
   fclose(fp);
}

Changing the LED Pin

1. Open the file: main/app/app_led.c

2. Find this line (around line 15):

#define SINGLE_LED_GPIO  GPIO_NUM_40

3. Change it to a different pin (e.g. GPIO 38):

#define SINGLE_LED_GPIO  GPIO_NUM_38

4. Save the file.

5. Rebuild and flash:

idf.py build flash monitor

6. Test: Connect your LED to GPIO 38 instead of GPIO 40.
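
For context, driving the LED from the chosen pin only needs the standard ESP-IDF GPIO driver. The snippet below is a hypothetical minimal version of what app_led.c does (the function names app_led_init() and app_led_set() are illustrative); the actual file in the repository may be structured differently.

#include <stdbool.h>
#include "driver/gpio.h"
#define SINGLE_LED_GPIO  GPIO_NUM_40             // change to GPIO_NUM_38 if you rewired the LED
void app_led_init(void)                          // hypothetical helper, for illustration only
{
    gpio_config_t io_conf = {
        .pin_bit_mask = 1ULL << SINGLE_LED_GPIO, // select the LED pin
        .mode = GPIO_MODE_OUTPUT,                // plain push-pull output, no pull-ups or interrupts
    };
    gpio_config(&io_conf);
}
void app_led_set(bool on)                        // hypothetical helper, for illustration only
{
    gpio_set_level(SINGLE_LED_GPIO, on ? 1 : 0); // drive the LED pin high or low
}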

Building & Flashing

Once the hardware is connected and the software is set up, follow these steps to compile and upload the code.
⇒ Step 1 - Navigate to the Project Directory

cd /path/to/esp32-box3-voice-led-project

⇒ Step 2 - Activate ESP-IDF Environment

. $HOME/esp/esp-idf/export.sh

⇒ Step 3 - Configure (Optional)

idf.py menuconfig

⇒ Step 4 - Build

idf.py build

This compiles all source files and creates the firmware binary. The first build may take several minutes as dependencies are downloaded.
⇒ Step 5 - Flash and Monitor

idf.py flash monitor

*Tip: Press Ctrl+] to exit the serial monitor.

Final Result

After successfully flashing the firmware, the ESP32-S3-BOX-3 boots and displays the light control screen. We can now control the LED in two different ways.
The first method is to use voice commands. To use it:

1. Say the wake word: "Hi ESP" (speak clearly, about 1 metre from the device).

2. Wait for audio feedback - you'll hear a confirmation sound.

3. Speak the command: "Turn on light" or "Turn off light".

4. Observe: the LED changes state, the screen updates, and audio feedback plays.

5. Once the wake word is detected, you can give commands continuously without repeating the wake word. If no command is given for a certain time (a few seconds), the ESP-SR engine times out; to use it again, simply say the wake word once more.

The second method is to use the touch screen. For that:
1. Touch the on-screen toggle button.
2. Observe: the LED toggles and the button image changes.
Here is the final result:

Troubleshooting

Wake Word Not Detected

  • Speak louder and clearer, at 0.5-1 metre from the device.
  • Reduce background noise.
  • Check the serial monitor for AFE initialisation errors.

LED Doesn't Light Up

  • Verify the LED polarity.
  • Verify the GPIO 40 connection.
  • Test with a multimeter: GPIO should read 3.3 V when ON.

Build Errors

  • Ensure ESP-IDF v5.5.2 is correctly installed.
  • Run . $HOME/esp/esp-idf/export.sh before building.
  • Do a full clean rebuild: idf.py fullclean && idf.py build.

Touch Screen Not Responding

  • Check the serial monitor for LVGL initialisation messages.

No Audio Feedback

  • Ensure WAV files are in the spiffs/ directory before building.
  • Check speaker volume (may need physical adjustment).
  • Verify I2S initialisation in serial logs.

GitHub Link

Find the project’s codebase and documentation here. Explore, fork, and contribute on GitHub.

Voice Activated LED Controller with Touch Interface Using ESP32S3 Box 3 File
