FlaminGO-Light Python SDK
1. Getting Started
1.1 Move the Robot
Press the robot’s power button and wait until you hear the second buzzer sound. The second buzzer means the robot has finished booting and is ready to be controlled.
Once the robot is ready, press and hold LT on the joystick to stand it up.
You can control the robot as follows:
Hold LT: Stand up
Hold RT: Sit down
Left joystick: Move the robot forward, backward, or sideways
Press LT + RT once: Hold the motors at their current positions. This can be used as an emergency stop.
Press LT + RT once again: Set all motors to idle
1.2 Connect via SSH
When the robot is powered on, it automatically starts a Wi-Fi hotspot. Connect to the robot through this Wi-Fi network, then open an SSH session.
XXXX is the unique 4-character tag assigned to each robot.
# 1) Connect to Wi-Fi
# Join the COCELO_ROBOT_XXXX network.
# 2) Connect via SSH
ssh cocelo@XXXX # XXXX = the same tag shown in the Wi-Fi SSID
# Password: 00000000
1.3 Upgrade Robot Firmware
This SDK requires robot firmware version 0.1.4 or later. Before using the SDK, upgrade the robot firmware using the steps below. If the robot has an additional LAN port, connect that port to an external internet connection using a LAN cable, then run the firmware upgrade. If a wired LAN connection is not available, see 10. Updating the SDK for the Wi-Fi setup steps.
Download
Install
1.4 Install the SDK
If the robot has an additional LAN port, connect that port to an external internet connection using a LAN cable, then install the SDK.
To update an existing installation to the latest version, run:
If the robot does not have an additional LAN port, or if a wired LAN connection is not available, see 10. Updating the SDK for the Wi-Fi setup steps.
1.5 View Logs
2. Basic Usage
The example below shows the typical structure for running a policy with the SDK. In your own project, update the policy path and RlConfig values to match your trained policy.
The control loop follows this order:
3. Core Components
The main SDK components used for policy execution are:
Robot: Reads robot observations, sets motor gains, sends motor commands, and exits control mode safely
Joystick: Converts joystick input into a command vector
RlConfig: Defines observation order, scaling, history length, policy path, and other RL settings
RL: Builds the policy input state from obs and cmd, then selects an action
control_rate: Runs the control loop at a fixed frequency
4. Robot
Robot is the main interface for reading sensor observations and sending motor commands.
Constructor
set_gains(kp, kd)
Sets the PD gains for the four motors.
Both kp and kd must have length 4. The indices map to the motors as follows:
Index 0: Left joint motor
Index 1: Right joint motor
Index 2: Left wheel motor
Index 3: Right wheel motor
Call set_gains() before calling do_action().
get_obs()
Reads the current robot observation.
The returned value is a dict[str, list[float]]. The commonly used observation keys are:
dof_pos (length 2): Joint positions
dof_vel (length 4): Joint and wheel velocities
dof_tau (length 4): Motor torque readings
ang_vel (length 3): Angular velocity
proj_grav (length 3): Projected gravity
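A quick length check on the observation dict can catch mismatches before they reach the policy. The sketch below is a hypothetical helper, not part of the SDK. It uses only the keys and lengths listed above, and sample_obs is a made-up stand-in for the dict returned by robot.get_obs().

```python
# Expected lengths for the commonly used keys, taken from the list above.
EXPECTED_OBS_LENGTHS = {
    "dof_pos": 2,
    "dof_vel": 4,
    "dof_tau": 4,
    "ang_vel": 3,
    "proj_grav": 3,
}


def check_obs(obs):
    """Return a list of problems found in an observation dict (hypothetical helper)."""
    problems = []
    for key, expected in EXPECTED_OBS_LENGTHS.items():
        if key not in obs:
            problems.append(f"missing key: {key}")
        elif len(obs[key]) != expected:
            problems.append(f"{key}: expected length {expected}, got {len(obs[key])}")
    return problems


# Stand-in for the dict returned by robot.get_obs(); the values are made up.
sample_obs = {
    "dof_pos": [0.0, 0.0],
    "dof_vel": [0.0, 0.0, 0.0, 0.0],
    "dof_tau": [0.0, 0.0, 0.0, 0.0],
    "ang_vel": [0.0, 0.0, 0.0],
    "proj_grav": [0.0, 0.0, -1.0],
}
print(check_obs(sample_obs))
```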
do_action(action, torque_ctrl=False)
Sends a command to the four motors.
action must have length 4. By default, the indices are interpreted as follows:
Index 0: Left joint motor position command
Index 1: Right joint motor position command
Index 2: Left wheel motor velocity command
Index 3: Right wheel motor velocity command
When torque_ctrl=True, the action values are sent as torque commands. Make sure this matches how the policy was trained.
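Packing the four values by name makes the index layout harder to get wrong. The helper below is a hypothetical illustration, not part of the SDK, and assumes the default layout with torque_ctrl=False. The resulting list is what would be passed to robot.do_action().

```python
def make_action(left_joint_pos, right_joint_pos, left_wheel_vel, right_wheel_vel):
    """Pack named commands into the 4-element action layout (hypothetical helper)."""
    return [
        float(left_joint_pos),   # index 0: left joint position command
        float(right_joint_pos),  # index 1: right joint position command
        float(left_wheel_vel),   # index 2: left wheel velocity command
        float(right_wheel_vel),  # index 3: right wheel velocity command
    ]


# Example values are arbitrary.
action = make_action(0.1, 0.1, 2.0, 2.0)
print(action)
```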
enable_user_mode()
Explicitly enables user control mode.
In most cases, you do not need to call this manually. It is enabled automatically when needed during normal control.
finish()
Exits user control mode cleanly.
When you use control_rate(), finish() is called automatically when the loop stops.
5. Joystick
Joystick converts joystick input into the cmd_vector used by the RL policy.
Constructor Arguments
cmd_mapping (default []): The joystick keys to place into the command vector, in order
cmd_range (default None): Output range for each joystick key
Available Command Keys
The joystick command slots are:
left_x: Left stick X axis
left_y: Left stick Y axis
right_x: Right stick X axis
right_y: Right stick Y axis
dpad_x: D-pad left/right
dpad_y: D-pad up/down
left_btn: Left button input
right_btn: Right button input
Default Command Settings
If cmd_mapping and cmd_range are omitted, the following settings are used.
cmd_vector[0]: left_y, range (-0.5, 0.5)
cmd_vector[1]: right_x, range (-1.5, 1.5)
cmd_vector[2]: left_x, range (-1.0, 1.0)
cmd_vector[3]: right_y, range (-1.0, 1.0)
cmd_vector[4]: dpad_x, range (-1.0, 1.0)
cmd_vector[5]: dpad_y, range (-1.0, 1.0)
cmd_vector[6]: left_btn, range (0.0, 1.0)
cmd_vector[7]: right_btn, range (0.0, 1.0)
cmd_mapping defines the order of the command vector. cmd_range defines the output range for each key, such as left_y or right_x. In other words, ranges are applied by key name, not by index.
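The split between ordering (cmd_mapping) and ranges keyed by name (cmd_range) can be sketched as follows. This is an illustration only. The linear mapping from a raw stick value in [-1, 1] to the configured range is an assumption, and build_cmd_vector and scale_axis are hypothetical names, not SDK functions.

```python
# Ranges are looked up by key name, so reordering cmd_mapping does not change them.
CMD_RANGE = {
    "left_y": (-0.5, 0.5),
    "right_x": (-1.5, 1.5),
    "left_x": (-1.0, 1.0),
    "right_y": (-1.0, 1.0),
}


def scale_axis(raw, lo, hi):
    """Map a raw stick value in [-1, 1] linearly onto [lo, hi] (assumed mapping)."""
    raw = max(-1.0, min(1.0, raw))
    return lo + (raw + 1.0) * 0.5 * (hi - lo)


def build_cmd_vector(raw_inputs, cmd_mapping, cmd_range):
    """Arrange scaled values in cmd_mapping order (hypothetical helper)."""
    return [scale_axis(raw_inputs.get(key, 0.0), *cmd_range[key]) for key in cmd_mapping]


raw = {"left_y": 1.0, "right_x": -1.0}  # made-up raw stick readings
print(build_cmd_vector(raw, ["left_y", "right_x"], CMD_RANGE))
print(build_cmd_vector(raw, ["right_x", "left_y"], CMD_RANGE))
```

Swapping the order in cmd_mapping swaps the positions in the output, while each key keeps its own range.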
Custom Mapping
With this mapping, the command vector is arranged as follows:
cmd_vector[0]: left_y, range (-0.5, 0.5)
cmd_vector[1]: right_x, range (-1.5, 1.5)
cmd_vector[2]: left_x, range (-1.0, 1.0)
cmd_vector[3]: right_y, range (-1.0, 1.0)
The order in cmd_mapping becomes the order of cmd_vector. Make sure it matches the command order used when training the policy.
Keys not included in cmd_mapping may still be assigned internally, but get_cmd() returns a cmd_vector whose length matches the user-defined cmd_mapping.
Mapping Rules
cmd_mapping must satisfy the following rules:
It can contain up to 8 keys.
Each key must be one of the supported command keys.
The same key cannot be used more than once.
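The three rules above can be expressed as a small pre-flight check. This is a sketch under the stated rules, not the SDK's own validation. validate_cmd_mapping is a hypothetical name, and the exception type is an assumption.

```python
SUPPORTED_KEYS = (
    "left_x", "left_y", "right_x", "right_y",
    "dpad_x", "dpad_y", "left_btn", "right_btn",
)


def validate_cmd_mapping(cmd_mapping):
    """Check a cmd_mapping against the documented rules (hypothetical helper)."""
    if len(cmd_mapping) > 8:
        raise ValueError("cmd_mapping can contain up to 8 keys")
    for key in cmd_mapping:
        if key not in SUPPORTED_KEYS:
            raise ValueError(f"unsupported command key: {key!r}")
    if len(set(cmd_mapping)) != len(cmd_mapping):
        raise ValueError("the same key cannot be used more than once")
    return list(cmd_mapping)


print(validate_cmd_mapping(["left_y", "right_x", "left_x", "right_y"]))
```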
Custom Range
For regular command keys, use a (min, max) tuple or list.
dpad_step_x and dpad_step_y are special keys that use a single numeric value.
Range rules:
cmd_range keys must be supported command keys, dpad_step_x, or dpad_step_y.
Regular command keys must use a tuple or list of length 2.
The min and max values must be finite numbers.
The min value must not be greater than the max value.
dpad_step_x and dpad_step_y must be finite numbers greater than or equal to 0.
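The range rules can be checked the same way before constructing a Joystick. The sketch below mirrors the rules listed above. validate_cmd_range is a hypothetical helper, and the choice of ValueError is an assumption.

```python
import math
from numbers import Real

SUPPORTED_KEYS = (
    "left_x", "left_y", "right_x", "right_y",
    "dpad_x", "dpad_y", "left_btn", "right_btn",
)
STEP_KEYS = ("dpad_step_x", "dpad_step_y")


def _is_finite_number(value):
    # bool is a subclass of int, so it is excluded explicitly.
    return isinstance(value, Real) and not isinstance(value, bool) and math.isfinite(value)


def validate_cmd_range(cmd_range):
    """Check a cmd_range dict against the documented rules (hypothetical helper)."""
    for key, value in cmd_range.items():
        if key in STEP_KEYS:
            if not _is_finite_number(value) or value < 0:
                raise ValueError(f"{key} must be a finite number >= 0")
        elif key in SUPPORTED_KEYS:
            if not isinstance(value, (tuple, list)) or len(value) != 2:
                raise ValueError(f"{key} must be a (min, max) tuple or list")
            lo, hi = value
            if not (_is_finite_number(lo) and _is_finite_number(hi)):
                raise ValueError(f"{key} min and max must be finite numbers")
            if lo > hi:
                raise ValueError(f"{key} min must not be greater than max")
        else:
            raise ValueError(f"unsupported cmd_range key: {key!r}")
    return True


print(validate_cmd_range({"left_y": (-0.5, 0.5), "dpad_step_x": 0.1}))
```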
get_cmd()
You can access the command vector as follows:
If no joystick command is received within the timeout, a zero vector with the same length as cmd_mapping is returned.
6. RlConfig
RlConfig defines and validates the settings required for policy execution, including the policy file path, observation order, observation lengths, history length, scaling, and action clipping.
Basic Example
Config Keys
stacked_obs_order (optional): Observation keys to stack over history
non_stacked_obs_order (optional): Observation keys used only from the current step
obs_scale (optional): Scale applied to each observation
cmd_scale (optional): Convenience key handled as obs_scale["command"]
action_scale (optional): Scale applied to the policy output action. Defaults to 1.0 for each action dimension
history_length (optional): Number of stacked observation frames. Default is 2
stack_size (optional): Legacy key. Converted to history_length = stack_size + 1
policy_path (required): Path to the .onnx policy file
policy_type (optional): "MLP" or "LSTM". Default is "MLP"
cmd_vector_length (optional): Length of the command observation. Default is 0
clip_actions (optional): Disabled with None or False; use a positive number to clip the policy output
extra_obs (optional): User-defined observation keys and lengths
Default Observation Keys and Lengths
RlConfig supports the following default observation keys.
dof_pos: 2
dof_vel: 4
lin_vel: 3
ang_vel: 3
proj_grav: 3
last_action: 4
command: cmd_vector_length
The length of command is determined by cmd_vector_length.
extra_obs
Use extra_obs to add custom observation values.
Rules for extra_obs:
The key must be a non-empty string.
The length must be an integer greater than or equal to 1.
The name must not conflict with a default observation key.
Add the same keys to obs inside the control loop.
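The extra_obs rules can be sketched as a standalone check. The snippet assumes extra_obs is a dict that maps each custom key to its length, which follows from the description "user-defined observation keys and lengths" but is not confirmed here. validate_extra_obs and the key "battery_voltage" are hypothetical.

```python
DEFAULT_OBS_KEYS = {
    "dof_pos", "dof_vel", "lin_vel", "ang_vel",
    "proj_grav", "last_action", "command",
}


def validate_extra_obs(extra_obs):
    """Check an extra_obs mapping of name -> length (hypothetical helper)."""
    for name, length in extra_obs.items():
        if not isinstance(name, str) or not name:
            raise ValueError("extra_obs keys must be non-empty strings")
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(length, bool) or not isinstance(length, int) or length < 1:
            raise ValueError(f"{name}: length must be an integer >= 1")
        if name in DEFAULT_OBS_KEYS:
            raise ValueError(f"{name}: conflicts with a default observation key")
    return True


extra_obs = {"battery_voltage": 1}  # made-up custom observation
print(validate_extra_obs(extra_obs))

# Inside the control loop, the same key is then added to obs before building the state.
obs = {"ang_vel": [0.0, 0.0, 0.0]}
obs["battery_voltage"] = [24.0]
```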
history_length and stack_size
Use history_length for new configurations.
history_length must be at least 1.
The legacy key stack_size is also accepted and is converted as follows:
Do not set both history_length and stack_size at the same time.
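The relationship between the two keys can be sketched as a small resolver. It uses the documented conversion history_length = stack_size + 1 and the documented default of 2. Raising an error when both keys are set is an assumption about how the conflict is handled, and resolve_history_length is a hypothetical name.

```python
def resolve_history_length(history_length=None, stack_size=None):
    """Resolve the effective history length from either key (hypothetical helper)."""
    if history_length is not None and stack_size is not None:
        # Assumption: setting both keys is treated as an error.
        raise ValueError("set either history_length or stack_size, not both")
    if stack_size is not None:
        history_length = stack_size + 1  # legacy conversion
    if history_length is None:
        history_length = 2  # documented default
    if history_length < 1:
        raise ValueError("history_length must be at least 1")
    return history_length


print(resolve_history_length())               # default
print(resolve_history_length(stack_size=2))   # legacy key
print(resolve_history_length(history_length=5))
```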
policy_path
policy_path is required.
The path must point to an existing .onnx file.
policy_type
Supported values are:
"MLP"
"LSTM"
Most examples can omit this field. The default is "MLP".
obs_scale
Use obs_scale to scale observation values.
A scalar applies the same scale to every element of that observation.
A list applies per-element scaling. The list length must match the observation length.
Observation keys not listed in obs_scale use a default scale of 1.0.
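Scalar and per-element scaling can be illustrated with plain Python. This is a sketch of the behaviour described above, not the SDK's implementation. apply_obs_scale and scale_obs are hypothetical names, and the sample values are made up.

```python
def scale_obs(values, scale=1.0):
    """Scale one observation by a scalar or by a per-element list."""
    if isinstance(scale, (list, tuple)):
        if len(scale) != len(values):
            raise ValueError("scale list length must match the observation length")
        return [v * s for v, s in zip(values, scale)]
    return [v * scale for v in values]


def apply_obs_scale(obs, obs_scale):
    """Apply obs_scale by key; keys not listed use a scale of 1.0."""
    return {key: scale_obs(values, obs_scale.get(key, 1.0)) for key, values in obs.items()}


obs = {"ang_vel": [1.0, 2.0, 3.0], "dof_pos": [0.5, -0.5]}
obs_scale = {"ang_vel": 0.25}  # dof_pos is not listed, so it keeps scale 1.0
print(apply_obs_scale(obs, obs_scale))
```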
action_scale
action_scale is multiplied by the policy output action.
action_scale can be either a scalar or a list of length 4. If omitted, [1.0, 1.0, 1.0, 1.0] is used.
RL.select_action() applies it as follows:
The scaled action is the value passed to robot.do_action().
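The multiplication can be sketched in plain Python as an element-wise product of the raw policy output and action_scale. scale_action is a hypothetical helper that only illustrates the arithmetic, and the numbers are arbitrary.

```python
def scale_action(raw_action, action_scale=1.0):
    """Multiply a 4-element raw action by a scalar or 4-element scale."""
    if isinstance(action_scale, (int, float)):
        scales = [float(action_scale)] * 4
    else:
        scales = list(action_scale)
    if len(raw_action) != 4 or len(scales) != 4:
        raise ValueError("raw_action and action_scale must both have length 4")
    return [a * s for a, s in zip(raw_action, scales)]


raw_action = [1.0, -1.0, 0.5, 2.0]  # made-up policy output
print(scale_action(raw_action, [0.5, 0.5, 10.0, 10.0]))
print(scale_action(raw_action))  # omitted scale behaves like [1.0, 1.0, 1.0, 1.0]
```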
clip_actions
clip_actions controls whether the policy output is clipped.
To enable clipping, set a positive number.
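One common interpretation of a single positive clipping value is a symmetric bound, where each element is limited to [-clip_actions, clip_actions]. The sketch below assumes that interpretation, which is not confirmed here. clip_action is a hypothetical helper.

```python
def clip_action(action, clip_actions=None):
    """Clip each element to [-clip_actions, clip_actions] (assumed symmetric bound)."""
    if clip_actions is None or clip_actions is False:
        return list(action)  # clipping disabled
    if isinstance(clip_actions, bool) or clip_actions <= 0:
        raise ValueError("clip_actions must be None, False, or a positive number")
    limit = float(clip_actions)
    return [max(-limit, min(limit, a)) for a in action]


print(clip_action([3.0, -3.0, 0.5, 1.0], 1.0))
print(clip_action([3.0, -3.0, 0.5, 1.0], None))
```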
Read-only Attributes
After creating an RlConfig, you can inspect the following attributes.
obs_lengths: Observation length by key
history_length: History length
state_length: Final state vector length
action_length: Action length, usually 4
policy_type: Policy type
policy_path: Policy file path
stacked_obs_order: Stacked observation order
non_stacked_obs_order: Non-stacked observation order
action_scale: Normalized action scale
clip_actions: Action clipping setting
obs_scale: Normalized observation scale
cmd_vector_length: Command vector length
7. RL
RL uses RlConfig to build the policy input state from obs and cmd, then selects an action using the configured policy.
set_config(config)
Call set_config() before using build_state() or select_action().
build_state(obs, cmd, last_action=None)
Converts obs and cmd into a 1D state vector.
Input Format
You can pass the joystick output directly.
Command Length
If "command" is included in stacked_obs_order or non_stacked_obs_order, cmd["cmd_vector"] must be at least as long as cmd_vector_length.
For example, this configuration requires a command vector of length 4 or more:
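A minimal sketch of that length check follows, written as a plain dictionary and a hypothetical helper rather than the SDK's own RlConfig call. How these keys are passed to RlConfig is not shown here, and using only the first cmd_vector_length entries of a longer vector is an assumption.

```python
# Written as a plain dict for illustration only.
config = {
    "non_stacked_obs_order": ["ang_vel", "proj_grav", "command"],
    "cmd_vector_length": 4,
}


def command_obs(cmd, config):
    """Return the command observation, or raise if cmd_vector is too short."""
    needed = config["cmd_vector_length"]
    vec = cmd["cmd_vector"]
    if len(vec) < needed:
        raise ValueError(f"cmd_vector has length {len(vec)}, need at least {needed}")
    # Assumption: extra entries beyond cmd_vector_length are ignored.
    return list(vec[:needed])


cmd = {"cmd_vector": [0.1, 0.2, 0.3, 0.4, 0.5]}  # made-up joystick output
print(command_obs(cmd, config))
```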
State Layout
The state is built in the following order:
For example:
Conceptually, the state is arranged as:
On the first call to build_state(), the current frame is copied into the full history. On later calls, older frames are shifted back and the newest frame is placed at the front.
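The history behaviour described above can be sketched with a small buffer class. FrameHistory is a hypothetical name, and flattening the history frame by frame with the newest frame first is an assumption about the exact layout. Non-stacked observations are left out of this sketch.

```python
class FrameHistory:
    """Keeps the last N stacked frames, newest first (illustrative only)."""

    def __init__(self, history_length):
        if history_length < 1:
            raise ValueError("history_length must be at least 1")
        self.history_length = history_length
        self.frames = []

    def push(self, frame):
        frame = list(frame)
        if not self.frames:
            # First call: copy the current frame into the full history.
            self.frames = [list(frame) for _ in range(self.history_length)]
        else:
            # Later calls: shift older frames back, newest frame at the front.
            self.frames = [frame] + self.frames[:-1]
        # Flatten to a 1D list, newest frame first.
        return [value for f in self.frames for value in f]


history = FrameHistory(history_length=3)
first = history.push([1.0, 2.0])
second = history.push([3.0, 4.0])
third = history.push([5.0, 6.0])
print(first)
print(second)
print(third)
```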
last_action
You can provide last_action manually.
When provided, it must have length 4.
If last_action is omitted, RL manages it internally. The value used for last_action is the raw policy output from the previous step, before action_scale is applied.
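The distinction between the stored raw output and the scaled action sent to the robot can be sketched as follows. LastActionTracker is a hypothetical class, and starting last_action at zeros before the first step is an assumption.

```python
class LastActionTracker:
    """Stores the raw policy output while returning the scaled action (illustrative)."""

    def __init__(self, action_scale):
        self.action_scale = list(action_scale)
        self.last_action = [0.0, 0.0, 0.0, 0.0]  # assumed initial value

    def step(self, raw_action):
        scaled = [a * s for a, s in zip(raw_action, self.action_scale)]
        # The raw output is kept for the next state, before action_scale is applied.
        self.last_action = list(raw_action)
        return scaled


tracker = LastActionTracker(action_scale=[2.0, 2.0, 2.0, 2.0])
scaled = tracker.step([1.0, -1.0, 0.5, 0.0])  # made-up raw policy output
print(scaled)               # value that would go to robot.do_action()
print(tracker.last_action)  # value that would feed the next state
```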
select_action(state)
Runs the policy and returns the scaled action.
The returned action has length 4 and is typically passed directly to robot.do_action().
8. control_rate
control_rate() is a decorator that runs the control loop at a fixed frequency.
Arguments
robot: The Robot instance to control
hz: Control loop frequency
Example:
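The general idea of a fixed-rate loop decorator can be sketched without the SDK. The fixed_rate decorator below is a hypothetical stand-in, not control_rate() itself. The max_steps and on_stop parameters are invented for this illustration, with on_stop playing the role that finish() has when the real loop stops.

```python
import functools
import time


def fixed_rate(hz, max_steps, on_stop=None):
    """Run the decorated step function at roughly `hz` for `max_steps` iterations."""
    period = 1.0 / hz

    def decorator(step):
        @functools.wraps(step)
        def runner():
            next_tick = time.monotonic()
            try:
                for i in range(max_steps):
                    step(i)
                    # Sleep until the next scheduled tick to hold the rate.
                    next_tick += period
                    delay = next_tick - time.monotonic()
                    if delay > 0:
                        time.sleep(delay)
            finally:
                # Cleanup always runs, even if step() raises.
                if on_stop is not None:
                    on_stop()
        return runner

    return decorator


calls = []


@fixed_rate(hz=200, max_steps=5, on_stop=lambda: calls.append("finish"))
def loop(i):
    calls.append(i)  # a real loop would read obs, build the state, and send an action


loop()
print(calls)
```

Scheduling against an absolute next_tick, rather than sleeping a fixed period after each step, keeps the average rate steady when individual steps take different amounts of time.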
9. Examples
9.1 Read Observations
9.2 Read Joystick Commands
9.3 Run an RL Policy
9.4 Add Custom Observations
10. Updating the SDK
Updating the SDK requires an internet connection. If the robot has a LAN port, connect it to an external internet connection using a LAN cable, then run:
If the robot does not have a LAN port, or if wired LAN is not available, connect to the internet over Wi-Fi before updating.