Build an artificial intelligence camera with Python

Today we will build our own deep-learning camera. When a bird appears in the frame, the camera will detect it and take a picture automatically.

Making a dumb camera smart

We are not going to integrate a deep-learning module into the camera itself. Instead, we will hook a camera up to a Raspberry Pi and send the photos over WiFi. With "keep it simple" (read: cheap) as the guiding principle, today we will only build a conceptual prototype in the spirit of Amazon's DeepLens; interested readers can try it out for themselves.

Next, we will write a small web server in Python. The Raspberry Pi will use it to serve photos to a desktop computer, which performs the inference and image detection.

The desktop computer has far more processing power. It will run a neural network architecture called YOLO on the incoming images to determine whether a bird appears in the camera frame.

We start with the YOLO architecture because it is currently one of the fastest detection models. There is a TensorFlow port of the model (TensorFlow being Google's second-generation machine-learning framework, the successor to DistBelief), so it is easy to install and run on different platforms. A friendly reminder: if you use the tiny model we use in this article, you can run detection on a CPU instead of relying on an expensive GPU.

Back to our conceptual prototype: if a bird is detected in the frame, we save the picture and move on to analyzing the next one.

Detection and photo capture

For a device like the Raspberry Pi, though, we don't really want to rely on its limited computing power for real-time inference. Instead, we will use another computer to infer what appears in the image.

I am using a simple Linux machine with a camera and a WiFi adapter (a Raspberry Pi 3 plus a camera module) as the capture device, with my desktop acting as the deep-learning machine that performs the image inference. For me this is currently the ideal setup: it greatly reduces the cost while letting all the heavy computation happen on the desktop.

We will use Flask to build the web server so that we can fetch images from the camera.

from importlib import import_module
import os
from flask import Flask, render_template, Response

# Uncomment below to use the Raspberry Pi camera module instead
# from camera_pi import Camera

# Comment this out if you're not using a USB webcam
from camera_opencv import Camera

app = Flask(__name__)

@app.route('/')
def index():
    return "hello world!"

def gen2(camera):
    """Returns a single image frame"""
    frame = camera.get_frame()
    yield frame

@app.route('/image.jpg')
def image():
    """Returns a single current image from the webcam"""
    return Response(gen2(Camera()), mimetype='image/jpeg')

if __name__ == '__main__':
    app.run(host='0.0.0.0', threaded=True)

If you are using the Raspberry Pi camera module, make sure the from camera_pi import Camera line in the code above is uncommented, and comment out the from camera_opencv import Camera line instead.

You can run the server directly with python3 app.py, or with gunicorn, just as Miguel describes in his documentation. If we later use multiple computers for image inference, we can also use the camera management code Miguel developed to manage the cameras and computation threads.
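Miguel's write-up streams video continuously rather than serving one frame at a time; our /image.jpg route returns only a single frame per request. If you also want a live view in the browser, a streaming route can be sketched as follows. This is a sketch: the Camera class here is a stub standing in for the article's camera_opencv / camera_pi module, and the route name video_feed is an assumption.

```python
from flask import Flask, Response

app = Flask(__name__)

class Camera:
    """Stub standing in for the article's camera_opencv/camera_pi Camera;
    replace with the real import in the actual server."""
    def get_frame(self):
        return b'\xff\xd8\xff\xd9'  # placeholder bytes shaped like a JPEG

def gen(camera):
    """Yield frames forever as a multipart MJPEG stream."""
    while True:
        frame = camera.get_frame()
        yield (b'--frame\r\n'
               b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n')

@app.route('/video_feed')
def video_feed():
    """Point a browser <img> tag at this URL for a live view."""
    return Response(gen(Camera()),
                    mimetype='multipart/x-mixed-replace; boundary=frame')
```

The multipart/x-mixed-replace mimetype tells the browser to keep replacing the image with each new part, which is what makes the stream appear live.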

When we start the Raspberry Pi, we first find its IP address to confirm the server is reachable, and then try to access it through a web browser.
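The check can also be scripted. The helper below is hypothetical (not part of the article's code), and the base URL must be replaced with your Pi's actual address; it simply confirms that both routes answer and that /image.jpg really returns JPEG bytes.

```python
def looks_like_jpeg(data: bytes) -> bool:
    """JPEG files begin with the SOI marker bytes FF D8."""
    return data[:2] == b'\xff\xd8'

def server_ok(base_url):
    """Return True if the Flask server answers and /image.jpg serves a JPEG."""
    import requests  # same library we use later to fetch frames
    index = requests.get(base_url + '/', timeout=5)
    image = requests.get(base_url + '/image.jpg', timeout=5)
    return (index.status_code == 200
            and image.status_code == 200
            and looks_like_jpeg(image.content))

# Example (with your Pi's IP): server_ok('http://192.168.1.11:5000')
```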

Loading the page and the image from the Raspberry Pi confirms that the server is working properly.

Image import and inference

Now that we have an endpoint serving the camera's current image, we can write a script to fetch that image and run inference on its contents.

Here we use the requests library (an excellent Python library for fetching resources from URLs) and Darkflow (an implementation of the YOLO model on top of TensorFlow).

Unfortunately, Darkflow cannot be installed with pip or similar tools, so we need to clone the repository and build and install the project ourselves. After installing Darkflow, we also need to download the YOLO model weights.

Because I am using a slower machine with an onboard CPU (rather than a fast GPU), I chose the Tiny YOLO v2 network. It is, of course, nowhere near as accurate as the full YOLO v2 model!

Once that is configured, we also need to install Pillow, numpy, and OpenCV on the computer. Then we can finally complete our code and run image detection.

The final code is as follows:

from darkflow.net.build import TFNet
import cv2
from io import BytesIO
import time
import requests
from PIL import Image
import numpy as np

options = {"model": "cfg/tiny-yolo-voc.cfg", "load": "bin/tiny-yolo-voc.weights", "threshold": 0.1}
tfnet = TFNet(options)

birdsSeen = 0

def handleBird():
    pass

while True:
    r = requests.get('http://192.168.1.11:5000/image.jpg')  # a bird yo
    curr_img = Image.open(BytesIO(r.content))
    curr_img_cv2 = cv2.cvtColor(np.array(curr_img), cv2.COLOR_RGB2BGR)

    result = tfnet.return_predict(curr_img_cv2)
    print(result)

    for detection in result:
        if detection['label'] == 'bird':
            print("bird detected")
            birdsSeen += 1
            curr_img.save('birds/%i.jpg' % birdsSeen)

    print('running again')
    time.sleep(4)

At this point we can not only watch the detections scroll by in the console, but also find the saved bird photos directly on disk. Next, we can use YOLO's output to mark the birds in each picture.

The balance between false positives and false negatives

We set a threshold key in the options dictionary in the code. This threshold is the minimum confidence a detection needs in order to count. During testing we set it to 0.1, but such a low threshold gives us a high false positive rate. To make matters worse, the Tiny YOLO model we use is far less accurate than the full YOLO model; that loss of accuracy is the other side of the trade-off.

Lowering the threshold means the model produces more output (more photos). In my test environment I set the threshold low because I wanted as many bird photos as possible, but you can adjust the threshold parameter to suit your own needs.
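The trade-off can be made concrete: raising the threshold discards low-confidence hits (fewer false positives, but more missed birds), while lowering it keeps them (more photos, but more junk). A toy illustration with made-up confidences, not real model output:

```python
def filter_detections(detections, threshold):
    """Keep only detections whose confidence meets the threshold,
    mirroring what the threshold option does inside darkflow."""
    return [d for d in detections if d['confidence'] >= threshold]

# Illustrative confidences:
candidates = [
    {'label': 'bird', 'confidence': 0.85},  # clear bird
    {'label': 'bird', 'confidence': 0.15},  # distant and blurry; maybe a leaf
    {'label': 'bird', 'confidence': 0.05},  # almost certainly noise
]

print(len(filter_detections(candidates, 0.1)))  # low threshold: 2 photos
print(len(filter_detections(candidates, 0.5)))  # high threshold: 1 photo
```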


Reference: https://cloud.tencent.com/developer/article/1537100