PDF Version of This Proposal:
http://meiyou.org/test2011/GSoC/GSoC_2011_Proposal_Multiple_Camera_Support.pdf

Multiple Camera Support for CCV

Google Summer of Code 2011 Proposal

Yishi Guo

ABSTRACT

This proposal introduces a multi-camera support module for CCV. By integrating OpenCV's Camera Calibration and 3D Reconstruction methods into CCV, we can make CCV track blobs using the relationship between the cameras and the screen, and then track objects and fiducials by stitching together the undistorted and rectified images captured by the cameras.

PROBLEM DESCRIPTION

CCV does not currently have fundamental support for multiple cameras.[1] So when confronted with a large screen, users have to turn to a PS3 multi-cam license or choose a wide-angle lens, which results in a fish-eye effect. Neither is a good choice for most of us.[2]

Multi-camera support for CCV is therefore badly needed.

THEORY

Camera Calibration and 3D Reconstruction from OpenCV[3]

The functions from OpenCV can be applied to:

PROCESS

Step 1

Use a new multi-cam user interface to let users configure the number of cameras.

A "Multi-cam settings" button will be added to the main CCV interface. When the user clicks the button, a new GUI is displayed, which may look like this:

First, configure the number of cameras:

The upper-left area then changes dynamically:

Step 2

Select cameras in the optional device list and put them in order.

You can select a camera with the horizontal scroll bar or the arrow buttons:

And then configure each one of them:

Step 3

Run the calibration process and follow the simple directions by touching the calibration points.

Once all the cameras have been aligned, the program runs the calibration process.

In Figure 6, the cyan area represents the view of camera A, the yellow area represents the view of camera B, and the fuchsia area represents the view of camera C. Each overlapping area, for instance the green area and the red area, is visible to two or more cameras.

As seen in Figure 6, three cameras cover the screen. Blobs captured by more than one of these cameras lie in the overlapping areas.

After the calibration process has been started (by pressing "C"), the program prompts the user to touch the circle (e.g. the white circle on the 1st touch-point in Figure 7).

At this moment, camera A can capture the blob while camera B and camera C cannot, because the blob is outside their views (as shown in Figure 8).

The program saves the coordinates of the first blob, point (0, 0), as seen by camera A into the coordinate data, and repeats the previous steps for the second point, which again only camera A can capture.

The 3rd point from the left is slightly different. Now both camera A and camera B can capture the blob (as shown in Figure 9), and each camera's reading is saved into the coordinate data separately. (The program dynamically allocates memory of length GRID_POINTS1 to store the blob data for each camera.)

The point (0, 2) on the screen is now stored in the memory space of both camera A and camera B.
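The per-camera storage described above can be sketched as follows. This is only an illustration in Python, not CCV code: the class name, the grid size and the blob coordinates are hypothetical, and GRID_POINTS here is simply taken to be the number of touch-points on the calibration grid. Each camera gets one slot per touch-point, and a slot stays empty when that camera cannot see the point.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]

# Hypothetical grid size: (GRID_X + 1) x (GRID_Y + 1) touch-points.
GRID_X, GRID_Y = 4, 3
GRID_POINTS = (GRID_X + 1) * (GRID_Y + 1)


class CameraCalibration:
    """Blob positions seen by one camera, one slot per screen touch-point."""

    def __init__(self, name: str) -> None:
        self.name = name
        # None marks a touch-point that this camera could not see.
        self.blobs: List[Optional[Point]] = [None] * GRID_POINTS

    def record(self, row: int, col: int, blob: Point) -> None:
        """Store the camera-space blob for screen touch-point (row, col)."""
        self.blobs[row * (GRID_X + 1) + col] = blob

    def sees(self, row: int, col: int) -> bool:
        return self.blobs[row * (GRID_X + 1) + col] is not None


def cameras_seeing(cameras, row, col):
    """Names of all cameras that captured touch-point (row, col)."""
    return [c.name for c in cameras if c.sees(row, col)]


cam_a, cam_b, cam_c = (CameraCalibration(n) for n in "ABC")
cam_a.record(0, 0, (12.0, 9.5))    # only camera A sees point (0, 0)
cam_a.record(0, 2, (301.0, 10.0))  # cameras A and B both see point (0, 2)
cam_b.record(0, 2, (18.5, 11.0))
```

With this layout, a touch-point in an overlapping area simply has a filled slot in more than one camera's array.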

All the touch-points are calibrated in a specific order (S order)[4].

The result is shown in figure 10.

The relationship between the cameras and the screen has now been established.
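The S order used in this step could be generated as in the sketch below. It assumes that "S order" means a serpentine traversal of the touch-point grid (left to right on one row, right to left on the next); the function name s_order is hypothetical.

```python
def s_order(rows, cols):
    """Return grid points (row, col) in serpentine ("S") order."""
    order = []
    for r in range(rows):
        if r % 2 == 0:
            columns = range(cols)              # even rows: left to right
        else:
            columns = range(cols - 1, -1, -1)  # odd rows: right to left
        order.extend((r, c) for c in columns)
    return order


for point in s_order(2, 3):
    print(point)
```

Under this assumption, consecutive touch-points are always neighbours on the screen, so the user never has to jump back to the start of a row.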

Step 4

Get parameters (camera intrinsics matrix and distortion coefficients) from previous calibration steps.

After the calibration process has finished, the program begins to iterate over the image data of each camera.

When a blob exists, the program uses the calibration data obtained earlier to map the blob to screen coordinates according to “the relationship of triangle”[5], and records this mapping as temporary data.

If the same screen coordinate is also mapped by other cameras before the iteration completes, the program applies an algorithm (e.g. taking the median value) to determine the coordinate of the point relative to the screen.
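One way to read this triangle-based mapping is as barycentric interpolation, which is the assumption made in the sketch below: the blob is expressed as a weighted combination of the three calibration points of a camera-space triangle, and the same weights are applied to the matching screen-space triangle. The function names and sample coordinates are hypothetical.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    w_a = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w_b = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return w_a, w_b, 1.0 - w_a - w_b


def camera_to_screen(p, cam_tri, screen_tri):
    """Map camera-space point p into screen space via matching triangles."""
    weights = barycentric(p, *cam_tri)
    x = sum(w * s[0] for w, s in zip(weights, screen_tri))
    y = sum(w * s[1] for w, s in zip(weights, screen_tri))
    return x, y


cam_tri = ((0.0, 0.0), (10.0, 0.0), (0.0, 10.0))       # calibration blobs
screen_tri = ((0.0, 0.0), (100.0, 0.0), (0.0, 50.0))   # matching screen points
print(camera_to_screen((5.0, 5.0), cam_tri, screen_tri))
```

A blob lies inside the triangle when all three weights are between 0 and 1, which also gives a simple test for choosing the right triangle.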

As shown in figure 11:

At this point, blob tracking across multiple cameras is implemented.
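The merging of estimates from several cameras mentioned in this step could look like the following sketch. It takes the median of the x and y values separately, which is one possible reading of "obtaining the median value"; the function name merge_estimates and the sample values are hypothetical.

```python
from statistics import median


def merge_estimates(estimates):
    """Combine screen-space estimates of one blob from several cameras.

    estimates is a list of (x, y) tuples, one per camera that saw the blob.
    The median is taken per axis, so a single outlier camera has limited
    influence on the result.
    """
    if not estimates:
        raise ValueError("no estimates to merge")
    xs = [x for x, _ in estimates]
    ys = [y for _, y in estimates]
    return median(xs), median(ys)


print(merge_estimates([(100, 200), (104, 198), (101, 205)]))
```

With only two cameras the median is the same as the mean, so the choice of median only matters where three or more views overlap.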

Step 5

Use these parameters to undistort and rectify the images captured by cameras. Then stitch and align the multi-cam's images into one image.

So far we have only established the relationship between the cameras and the screen. Until the images captured by all the cameras are stitched into one integrated image, we cannot meet the need for object and fiducial recognition.

By using the Camera Calibration and 3D Reconstruction methods in OpenCV, as shown below in Figure 12, we can use a chessboard to undistort the camera images:

OpenCV uses the chessboard corners to get the distortion parameters of the cameras. In our case, the data available for the moment is the relationship between the cameras and the screen.

In order to calibrate the image of each camera and get each camera's intrinsic parameters and distortion coefficients, we should switch the data source from chessboard corner data to matrix data. We can then get undistorted images from each camera through the OpenCV functions (cvCalibrateCamera2, cvInitUndistortMap, cvRemap, etc.).
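To make the role of the distortion coefficients concrete, here is a small standard-library sketch of the radial part of the lens model, working on a single point in normalized image coordinates. It is a simplification and not the OpenCV implementation: only the coefficients k1 and k2 are used, tangential distortion is ignored, and the function names and sample values are hypothetical.

```python
def distort(x, y, k1, k2):
    """Apply radial distortion to an ideal (undistorted) normalized point."""
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * factor, y * factor


def undistort(xd, yd, k1, k2, iterations=20):
    """Invert the radial model by fixed-point iteration.

    Starts from the distorted point and repeatedly divides by the
    distortion factor evaluated at the current estimate.
    """
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        factor = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / factor, yd / factor
    return x, y


xd, yd = distort(0.3, -0.2, k1=-0.2, k2=0.05)
print(undistort(xd, yd, k1=-0.2, k2=0.05))
```

In the proposed module the coefficients would come from the calibration step, and the correction would be applied to whole images rather than to single points.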

In the following example, an undistorted image is transformed into a rectified image by using an algorithm similar to the "bird's-eye view transform" method.

As shown below:

With the calibrated images of each camera and the blob data obtained before, we are now able to stitch all the images into one integrated image.

In the end, blob tracking as well as object and fiducial tracking are performed on this integrated image.
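A very reduced sketch of such a stitching step is given below. It assumes that every camera image has already been undistorted and rectified to a common scale, and that the position of each image on the shared canvas is known from calibration. Averaging the pixels in overlapping areas is our own simplification, and the function name stitch is hypothetical. Images are plain lists of rows of grey values.

```python
def stitch(images, offsets, width, height):
    """Paste rectified images onto one canvas, averaging overlaps.

    images  : list of 2D lists (rows of grey values)
    offsets : list of (x, y) canvas positions, one per image
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for image, (ox, oy) in zip(images, offsets):
        for y, row in enumerate(image):
            for x, value in enumerate(row):
                cx, cy = ox + x, oy + y
                if 0 <= cx < width and 0 <= cy < height:
                    total[cy][cx] += value
                    count[cy][cx] += 1
    # Pixels that no camera covers stay at 0.
    return [
        [total[y][x] / count[y][x] if count[y][x] else 0.0
         for x in range(width)]
        for y in range(height)
    ]


camera_a = [[10, 10, 10]]
camera_b = [[20, 20, 20]]
print(stitch([camera_a, camera_b], [(0, 0), (2, 0)], width=5, height=1))
```

In this toy case the two one-row images overlap in a single column, which receives the average of both cameras.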

Step 6

Deliver the source image, which was generated from the stitching and alignment process, to CCV.

FUTURE WORK

ADVANTAGES FOR NUIGROUP/DELIVERABLES

TIMELINE

PERSONAL INFORMATION

General Information

Name: Yishi Guo

Email: guoyishi@gmail.com

Location/Timezone: Changchun, Jilin, China / UTC+8

Website: http://meiyou.org/ (Simplified Chinese)

Age: 23

Education/Qualifications

Academic and Industry Background

I have written more than 42,034 lines of code since entering college. I specialize in C/C++, PHP, OpenCV and other image-processing programming.

Because of my specialization in OpenCV, I recently obtained an internship position at an interactive media company. I am currently studying multi-touch projects and developing multi-touch based products there, and we have just brought out an 88'' LLP multi-touch table.

Open Source Development Experience

REFERENCE

APPENDIX

I have tested multi-camera capture on Windows XP using the vidGrabber class of openFrameworks. On the Windows platform, vidGrabber uses DirectShow to capture from the cameras.

As shown below, we can capture from 4 (or more) cameras at the same time.

I would be glad to receive your feedback. To leave a comment, please go to this post:
http://nuigroup.com/forums/viewthread/12301/