Oriented Bounding Boxes Object Detection

1. Model introduction
2. Oriented bounding box object detection: image
3. Oriented bounding box object detection: video
4. Oriented bounding box object detection: real-time detection
    4.1. USB camera
    4.2. CSI camera
This tutorial uses Python to demonstrate Ultralytics Oriented Bounding Boxes (OBB) object detection on images, video, and real-time camera streams.
1. Model introduction

Oriented bounding box (OBB) object detection, also known as oriented object detection, goes a step further than standard object detection by introducing an additional angle that locates objects in the image more accurately.
The output of an oriented object detector is a set of rotated bounding boxes that tightly enclose the objects in the image, together with a class label and confidence score for each box. Oriented bounding boxes are particularly useful when objects appear at arbitrary angles, such as in aerial imagery, where traditional axis-aligned bounding boxes may include unnecessary background.
In short, oriented object detection encloses tilted objects with tilted boxes, reducing unnecessary background area and improving detection accuracy.
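To make the role of the extra angle concrete, the sketch below (a hypothetical helper, not part of the tutorial code) computes the four corner points of a rotated box from its center, size, and angle, which is essentially the geometry an OBB detector predicts:

import numpy as np

def obb_corners(cx, cy, w, h, angle_rad):
    """Return the 4 corner points of a rotated box given center, size, and angle."""
    # Half-extents of the box before rotation
    dx, dy = w / 2.0, h / 2.0
    # Corners of the axis-aligned box, relative to the center
    corners = np.array([[-dx, -dy], [dx, -dy], [dx, dy], [-dx, dy]])
    # 2D rotation matrix for the box angle
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, -s], [s, c]])
    # Rotate the corners, then translate them to the box center
    return corners @ rot.T + np.array([cx, cy])

# A box centered at (100, 50), 40x20 pixels, rotated 30 degrees
print(obb_corners(100, 50, 40, 20, np.deg2rad(30)))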
2. Oriented bounding box object detection: image

Use yolo11n-obb.pt to run prediction on an image stored in the ultralytics project (not one of the images bundled with ultralytics).
Enter the code folder:
cd /home/pi/ultralytics/ultralytics/yahboom_demo
Run the code:
python3 05.obb_image.py
The annotated image output by YOLO is saved to: /home/pi/ultralytics/ultralytics/output/
Sample code:
from ultralytics import YOLO

# Load a model
model = YOLO("/home/pi/ultralytics/ultralytics/yolo11n-obb.pt")

# Run batched inference on a list of images
results = model("/home/pi/ultralytics/ultralytics/assets/car.jpg")  # returns a list of Results objects

# Process results list
for result in results:
    # boxes = result.boxes          # Boxes object for bounding box outputs
    # masks = result.masks          # Masks object for segmentation mask outputs
    # keypoints = result.keypoints  # Keypoints object for pose outputs
    # probs = result.probs          # Probs object for classification outputs
    obb = result.obb  # Oriented boxes object for OBB outputs
    result.show()  # display to screen
    result.save(filename="/home/pi/ultralytics/ultralytics/output/car_output.jpg")  # save to disk
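Beyond saving the annotated image, the numeric detections can be read from result.obb. A minimal sketch, assuming the Ultralytics Results API (obb.xywhr holds center x/y, width, height, and rotation in radians; obb.cls and obb.conf hold class indices and confidences):

from ultralytics import YOLO

model = YOLO("/home/pi/ultralytics/ultralytics/yolo11n-obb.pt")
results = model("/home/pi/ultralytics/ultralytics/assets/car.jpg")

for result in results:
    obb = result.obb
    # Each row of xywhr is (center_x, center_y, width, height, rotation_in_radians)
    for xywhr, cls, conf in zip(obb.xywhr, obb.cls, obb.conf):
        name = result.names[int(cls)]  # map the class index to a class name
        print(f"{name}: conf={float(conf):.2f}, box={xywhr.tolist()}")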
3. Oriented bounding box object detection: video

Use yolo11n-obb.pt to run prediction on a video stored in the ultralytics project (not one of the videos bundled with ultralytics).
Enter the code folder:
cd /home/pi/ultralytics/ultralytics/yahboom_demo
Run the code:
python3 05.obb_video.py
The annotated video output by YOLO is saved to: /home/pi/ultralytics/ultralytics/output/
Sample code:
import cv2
from ultralytics import YOLO

# Load the YOLO model
model = YOLO("/home/pi/ultralytics/ultralytics/yolo11n-obb.pt")

# Open the video file
video_path = "/home/pi/ultralytics/ultralytics/videos/street.mp4"
cap = cv2.VideoCapture(video_path)

# Get the video frame size and frame rate
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Define the codec and create a VideoWriter object to output the processed video
output_path = "/home/pi/ultralytics/ultralytics/output/05.street_output.mp4"
fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # Use 'XVID' or 'mp4v' depending on your platform
out = cv2.VideoWriter(output_path, fourcc, fps, (frame_width, frame_height))

# Loop through the video frames
while cap.isOpened():
    # Read a frame from the video
    success, frame = cap.read()
    if success:
        # Run YOLO inference on the frame
        results = model(frame)
        # Visualize the results on the frame
        annotated_frame = results[0].plot()
        # Write the annotated frame to the output video file
        out.write(annotated_frame)
        # Display the annotated frame
        cv2.imshow("YOLO Inference", cv2.resize(annotated_frame, (640, 480)))
        # Break the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        # Break the loop if the end of the video is reached
        break

# Release the video capture and writer objects, and close the display window
cap.release()
out.release()
cv2.destroyAllWindows()
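Note that cv2.CAP_PROP_FPS can return 0 for some files or OpenCV backends, which would make the VideoWriter produce an unplayable output file. A minimal guard, assuming a 30 fps fallback is acceptable:

fps = int(cap.get(cv2.CAP_PROP_FPS))
if fps <= 0:
    fps = 30  # fall back to an assumed 30 fps when the source does not report a rate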
4. Oriented bounding box object detection: real-time detection

4.1. USB camera

Use yolo11n-obb.pt to run prediction on the USB camera feed.
Enter the code folder:
cd /home/pi/ultralytics/ultralytics/yahboom_demo
Run the code (click the preview window and press the q key to terminate the program):
python3 05.obb_camera_usb.py
The annotated video output by YOLO is saved to: /home/pi/ultralytics/ultralytics/output/
Sample code:
import cv2
from ultralytics import YOLO

# Load the YOLO model
model = YOLO("/home/pi/ultralytics/ultralytics/yolo11n-obb.pt")

# Open the camera
cap = cv2.VideoCapture(0)

# Get the video frame size and frame rate
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Define the codec and create a VideoWriter object to output the processed video
output_path = "/home/pi/ultralytics/ultralytics/output/05.obb_camera_usb.mp4"
fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # Use 'XVID' or 'mp4v' depending on your platform
out = cv2.VideoWriter(output_path, fourcc, fps, (frame_width, frame_height))

# Loop through the camera frames
while cap.isOpened():
    # Read a frame from the camera
    success, frame = cap.read()
    if success:
        # Run YOLO inference on the frame
        results = model(frame)
        # Visualize the results on the frame
        annotated_frame = results[0].plot()
        # Write the annotated frame to the output video file
        out.write(annotated_frame)
        # Display the annotated frame
        cv2.imshow("YOLO Inference", cv2.resize(annotated_frame, (640, 480)))
        # Break the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        # Break the loop if no frame could be read from the camera
        break

# Release the video capture and writer objects, and close the display window
cap.release()
out.release()
cv2.destroyAllWindows()
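The default capture resolution varies by webcam, and many USB cameras also report 0 for CAP_PROP_FPS (the same fallback guard shown in the video section applies). If needed, the capture properties can be requested explicitly before reading frames; a minimal sketch, where the 640x480 values are assumptions and a camera may silently ignore unsupported settings:

cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)   # request a 640-pixel-wide frame
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)  # request a 480-pixel-high frame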
4.2. CSI camera

Use yolo11n-obb.pt to run prediction on the CSI camera feed.
Enter the code folder:
cd /home/pi/ultralytics/ultralytics/yahboom_demo
Run the code (click the preview window and press the q key to terminate the program):
python3 05.obb_camera_csi.py
The annotated video output by YOLO is saved to: /home/pi/ultralytics/ultralytics/output/
Sample code:
import cv2
from ultralytics import YOLO
from jetcam.csi_camera import CSICamera

# Load the YOLO model
model = YOLO("/home/pi/ultralytics/ultralytics/yolo11n-obb.pt")

# Open the camera (CSI camera)
cap = CSICamera(capture_device=0, width=640, height=480)

# Set the video frame size and frame rate
frame_width = 640
frame_height = 480
fps = 30

# Define the codec and create a VideoWriter object to output the processed video
output_path = "/home/pi/ultralytics/ultralytics/output/05.obb_camera_csi.mp4"
fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # Use 'XVID' or 'mp4v' depending on your platform
out = cv2.VideoWriter(output_path, fourcc, fps, (frame_width, frame_height))

# Loop through the camera frames
while True:
    # Read a frame from the camera
    frame = cap.read()
    if frame is not None:
        # Run YOLO inference on the frame
        results = model(frame)
        # Visualize the results on the frame
        annotated_frame = results[0].plot()
        # Write the annotated frame to the output video file
        out.write(annotated_frame)
        # Display the annotated frame
        cv2.imshow("YOLO Inference", cv2.resize(annotated_frame, (640, 480)))
        # Break the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        # Break the loop if no frame is received (camera error or end of stream)
        print("No frame received, breaking the loop.")
        break

# Release the camera and writer objects, and close the display window
cap.release()
out.release()
cv2.destroyAllWindows()
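If jetcam is not available, a CSI camera on a Jetson can usually also be opened through OpenCV's GStreamer backend instead. A minimal sketch, assuming a Jetson platform with nvarguscamerasrc and an OpenCV build that includes GStreamer support (the pipeline parameters are illustrative):

import cv2

# GStreamer pipeline for a CSI camera on Jetson (sensor 0, 640x480 @ 30 fps)
pipeline = (
    "nvarguscamerasrc sensor-id=0 ! "
    "video/x-raw(memory:NVMM), width=640, height=480, framerate=30/1 ! "
    "nvvidconv ! video/x-raw, format=BGRx ! "
    "videoconvert ! video/x-raw, format=BGR ! appsink"
)
cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)
success, frame = cap.read()  # frames can then be fed to model(frame) as above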