OpenCV & Python
OpenCV is the go-to library for many image and video projects in Python. I remember struggling a bit with the documentation when starting out, so here’s a guided coding tour that I hope helps you.
Install OpenCV: search for instructions for your particular environment, but this usually works:
pip install opencv-python
What is it?
OpenCV is a lot of things, mostly dealing with processing images and video on your computer. My current interests are AI and real-time applications, so this tour is biased towards that. For starters, let’s look at simply capturing video from your webcam:
The above script creates a video capture and feeds it into a loop where frames are read and displayed one by one with imshow. The conditional checks for the exit command (press q to exit); cap.release and cv2.destroyAllWindows, outside of the loop, then deal with final cleanup. The result should be a video feed with the default OpenCV UI:
If you look at the bottom part of the window you will see a readout with the mouse coordinates (x=629, y=112) and 3 changing numbers, R: G: B:. This means that at those coordinates the image is coded in 3 channels (RED, GREEN, BLUE), with values in each channel that go from 0–255, and this constitutes a colorspace called RGB. (In memory, OpenCV stores the channels in B, G, R order, which matters once you index pixels or convert colorspaces.)
There are a number of color spaces and transformations available, and the subject is worthy of its own post. For now, let’s just transform the previous example to another colorspace:
The colorspace transformation happens in the line gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) (BGR rather than RGB, because OpenCV delivers webcam frames in B, G, R order), and the result then gets shown with imshow('frame', gray).
Note that we now have one single value or channel, L, that represents grayscale values from 0–255.
Transformations and Operations.
You will find that most examples out there use a cascade (progressive steps) of transformations and operations within the main loop. For instance, here’s a more complex set:
Note how we operate on each frame and then pass it to the next operation: gray->denoised->thresholded->mirrored->imshow. The docs are a good place to start if you want to know what other operations you can do, but most likely you'll want to experiment on your own by stacking them or making your own. In any case, here's what the above gives you:
Here the values are binary: 0 or 255, for black and white.
Getting to those pixels.
Depending on what colorspace you are using, image data will be represented as a multidimensional matrix. That sounds very complicated, but since images are numpy arrays you can first check the shape of your images and then extract the information you want. For the previous 3 examples the shapes are as follows:
print(frame.shape) gives the matrix shape:
Color: (480, 640, 3)
GRAY: (480, 640)
B/W: (480, 640)
Note that the color capture has 3 dimensions, the last one holding the 3 channels, while the gray and black & white images have 2 dimensions and a single channel. Here's a zoomed-in comparison with the channels overlaid and pixel values:
The rest is accessing regions of interest depending on your use case. The documentation does a good job explaining some basic cases, so I’ll point you there. We will skip ahead to some issues you are bound to encounter before moving on to the AI part.
Out of the box, OpenCV provides very simple UI components you can use to prototype. Here’s an example:
Here we simply add an empty callback function (you can move the loop logic here if you want to), create trackbars and then tie those values to an OpenCV function. For a full explanation see here: Changing the contrast and brightness of an image.
There are a few other things you can add, like buttons, annotations and drawings on top of your images/videos (more on this later). While awesome and practical, the included UI features might not be enough for your project, so you will need to use another UI package like PyQt/PySide (see this post or the page repo for an example) or something else. You can even use a game engine like pygame:
This might look complicated, but it is basically the same structure as all the previous examples. The only difference (beyond having 4 mirrored images) is that the OpenCV loop and GUI bits have been replaced by pygame ones...
Use in AI
Beyond basic image and video manipulation, OpenCV is a popular (the only?) gateway to Machine Learning and Computer Vision in Python. Once more, there is a lot on offer out of the box. Take object detection, for instance:
This is still the same structure we've been using all along. There are 2 new things though: the first is the use of a circle detector, cv2.HoughCircles, and the second is the logic block that draws on top of the found circles with cv2.circle. For a more detailed explanation see here: Hough Circle Transform.
After this you might want to try a more robust or complex Machine Learning or AI technique. For instance, you could pair OpenCV with a Neural Network framework like Keras.
I hope this serves as an overview to the existing documentation and helps you get started with OpenCV and Python.
Thanks for reading!