Panoramas (Alignment, Stitching, Blending)
The idea behind this project was to take a set of images (about 15-20) and create a panorama of a full 360-degree scene. The feature detection and description code written previously (described fully elsewhere on the website) serves as the foundation of the panorama generator. The code shown here is just an outline of the panorama-generating portions of the pipeline.
Overview + Major Design Choices:
The general process is as follows:
- Warp each image into spherical coordinates.
- Determine the features for each warped image. (click HERE to see Feature Detection and HERE to see Feature Description)
- Match features between every pair of adjacent images. (click HERE to see Feature Matching)
- Align every pair of adjacent images using RANSAC.
- Blend the aligned images to create a panorama.
A few different panoramas were generated, and the results are shown below. The first was generated from images available in similar project packages around the web; I did not take those source images. The 2nd and 3rd panoramas are of the main entrance to Wash U, with and without me in the photos. The 4th and 5th panoramas are of the Kemper art museum on the Wash U campus, with and without my friend Robert in the photos. The alignment and corrections are working well, but this could use a more robust mechanism to account for changes in illumination between adjacent photos.
Removing Radial Distortion & Warping Images to Spherical Coordinates:
The following method contains all the logic for warping an image to spherical coordinates as well as applying a correction for radial distortion.
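The warp can be sketched roughly as follows. This is a minimal NumPy version of the standard inverse-mapping approach, assuming a known focal length `f` (in pixels) and radial distortion coefficients `k1`, `k2`; the function name and parameters are my illustrative stand-ins, not the project's actual code.

```python
import numpy as np

def warp_spherical(image, f, k1=0.0, k2=0.0):
    """Sketch: inverse-warp an image into spherical coordinates while
    undoing radial distortion. f is the focal length in pixels; k1, k2
    are radial distortion coefficients (all assumed inputs)."""
    h, w = image.shape[:2]
    yc, xc = (h - 1) / 2.0, (w - 1) / 2.0

    # Spherical angles for every output pixel.
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    theta = (xs - xc) / f          # azimuth
    phi = (ys - yc) / f            # inclination

    # Point on the unit sphere -> normalized image-plane coordinates.
    xhat = np.sin(theta) * np.cos(phi)
    yhat = np.sin(phi)
    zhat = np.cos(theta) * np.cos(phi)
    xn, yn = xhat / zhat, yhat / zhat

    # Apply the radial distortion model so we sample the distorted source.
    r2 = xn ** 2 + yn ** 2
    scale = 1.0 + k1 * r2 + k2 * r2 ** 2
    xd, yd = xn * scale, yn * scale

    # Back to pixel coordinates, nearest-neighbor sampling.
    xi = np.round(xd * f + xc).astype(int)
    yi = np.round(yd * f + yc).astype(int)
    valid = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    out = np.zeros_like(image)
    out[ys[valid], xs[valid]] = image[yi[valid], xi[valid]]
    return out
```

Inverse mapping (computing, for each output pixel, where to sample the source) avoids the holes that a forward warp would leave.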
Aligning Warped Images using RANSAC:
The following code assumes the images warped by the previous step have undergone feature detection and description. That process could be accomplished using code I've written in previous projects, or using state-of-the-art approaches like SIFT. The controller code repeatedly selects a random matching pair of features and computes the translation between the two feature locations. It then calls countInliers to count how many matches agree with the estimated translation from f1 to f2. Repeating this nRANSAC times and keeping the estimate with the most inliers yields a good estimate. To refine the result beyond integer-pixel precision, I pass it to leastSquaresFit, which performs a least squares fit to produce an even better estimate.
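Assuming the matched feature locations are available as two (N, 2) NumPy arrays, the controller logic might look like this sketch (function names, parameters, and defaults are my assumptions, not the project's actual code):

```python
import numpy as np

def align_pair_ransac(pts1, pts2, n_ransac=500, thresh=2.0, rng=None):
    """Sketch of translation-only RANSAC. pts1/pts2 are (N, 2) arrays
    of matched feature locations in two adjacent warped images."""
    rng = np.random.default_rng(rng)
    best_inliers = np.array([], dtype=int)
    for _ in range(n_ransac):
        i = rng.integers(len(pts1))        # one match fixes a translation
        t = pts2[i] - pts1[i]
        # Count the matches that agree with this candidate translation.
        err = np.linalg.norm(pts1 + t - pts2, axis=1)
        inliers = np.nonzero(err < thresh)[0]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refine: least-squares fit over the best inlier set
    # (for pure translation, this is the mean inlier offset).
    t = (pts2[best_inliers] - pts1[best_inliers]).mean(axis=0)
    return t, best_inliers
```

Because a translation has only two degrees of freedom, a single random match is enough to propose a model each iteration, which keeps the loop very cheap.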
Counting inliers is fairly straightforward: for a given estimate, each match is checked against a threshold, and a count is incremented if the match is an inlier.
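In isolation, that check might be sketched like this (the name `count_inliers` and its signature are illustrative, not the original code):

```python
import numpy as np

def count_inliers(pts1, pts2, t, thresh=2.0):
    """Sketch: count the matches consistent with translation t,
    i.e. those whose residual after translating pts1 is under thresh."""
    err = np.linalg.norm(pts1 + t - pts2, axis=1)
    return int(np.count_nonzero(err < thresh))
```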
Performing a least squares fit here is simple because of the restricted degrees of freedom: the average of the inlier translation vectors yields the result.
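For a translation-only model, the least-squares solution reduces to a one-liner (names are again my stand-ins):

```python
import numpy as np

def least_squares_translation(pts1, pts2, inliers):
    """Sketch: with only translational degrees of freedom, the
    least-squares estimate is the mean inlier translation vector."""
    return (pts2[inliers] - pts1[inliers]).mean(axis=0)
```

This follows because minimizing the sum of squared residuals ||p1 + t - p2||^2 over a constant vector t is exactly the mean of the per-match offsets.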
Stitch and Crop the Resulting Aligned Image:
The basic outline here is to stitch the aligned images. Given the warped images and their relative displacements, I find the max and min corners to determine the overall dimensions of the panorama, and I convert each image's relative displacement into an absolute displacement in the final panorama's coordinate space. This is all accomplished in blendImages. Then I blend each image with its adjacent images in the panorama in accumulateBlend and normalizeBlend. Here I use a feathering function that is simply a ratio equal to the slope of a line defined by the blend width; technically this is a 1-D version of a distance map. Normalizing the blend occurs after the feathering. Finally, the resulting image is transformed using an affine shear matrix: in this case the only drift is in the y direction, so a shear can revert things back to normal.
Determining the Absolute Bounds of the Panorama + Blending Images:
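The bounds computation can be sketched as follows. It accumulates each pairwise translation into an absolute offset, then takes the min/max over every image's corners; the function name and calling convention are my assumptions, not the project's actual blendImages code.

```python
import numpy as np

def panorama_bounds(image_shapes, rel_translations):
    """Sketch: given each image's (h, w) and the relative (x, y)
    translation from image i to image i+1, accumulate absolute offsets
    and find the bounding box of the whole panorama."""
    offsets = [np.zeros(2)]
    for t in rel_translations:
        offsets.append(offsets[-1] + t)   # absolute (x, y) of each image
    offsets = np.array(offsets)

    # Min corner over all image origins; max corner over all far corners.
    mins = offsets.min(axis=0)
    maxs = np.array([
        off + (w, h) for off, (h, w) in zip(offsets, image_shapes)
    ]).max(axis=0)

    width = int(np.ceil(maxs[0] - mins[0]))
    height = int(np.ceil(maxs[1] - mins[1]))
    # Shift offsets so the panorama's min corner lands at (0, 0).
    return (height, width), offsets - mins
```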
Accumulating the Blend:
Here I blend each image with its adjacent images in the panorama using the feathering function outlined previously.
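A minimal version of that accumulation might look like this, assuming the accumulator carries an extra fourth channel for the total feather weight (the name `accumulate_blend` and the channel layout are my assumptions):

```python
import numpy as np

def accumulate_blend(img, acc, offset, blend_width):
    """Sketch: add one warped image into the accumulator with linear
    feathering near its left/right edges (a 1-D distance map).
    acc has shape (H, W, 4); channel 3 stores the total weight."""
    h, w = img.shape[:2]
    x0, y0 = int(offset[0]), int(offset[1])

    # Feather weight ramps 0 -> 1 over blend_width pixels at each side.
    xs = np.arange(w)
    weight = np.minimum(1.0, np.minimum(xs, w - 1 - xs) / blend_width)

    acc[y0:y0 + h, x0:x0 + w, :3] += img * weight[None, :, None]
    acc[y0:y0 + h, x0:x0 + w, 3] += weight[None, :]
    return acc
```

The ramp is exactly the "slope of a line defined by the blend width" described above: pixels deep inside an image get full weight, while pixels near a seam fade out linearly.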
Normalizing the Blend:
Here I normalize the blend, which is accomplished by simply dividing each color channel by the weight calculated in accumulateBlend.
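Continuing the same assumed accumulator layout, the normalization step is a guarded per-pixel division (guarded so pixels no image touched stay black rather than dividing by zero):

```python
import numpy as np

def normalize_blend(acc):
    """Sketch: divide accumulated color by accumulated weight,
    leaving zero where no image contributed."""
    weight = acc[..., 3]
    out = np.zeros_like(acc[..., :3])
    np.divide(acc[..., :3], weight[..., None], out=out,
              where=weight[..., None] > 0)
    return out
```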
What worked well and what did not?
While testing the affine transformation, the best results were obtained by using a shear matrix to deform the image, since it was only shifted in the y direction. What did not work well was attempting to apply a vertical translation as a function of the x coordinate directly, because it was extremely difficult to figure out how to express that in a matrix. The affine shear matrix accomplishes exactly the same thing, but with more mathematical intuition than trying to determine it geometrically. RANSAC also worked extremely well even before applying the least squares fit, but there was no harm in making it even better.
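To make the shear-versus-translation point concrete: an x-dependent vertical shift is precisely what the off-diagonal term of an affine shear encodes. A sketch, where `drift` is the total vertical offset accumulated after the full 360 degrees (names are illustrative):

```python
import numpy as np

def drift_shear_matrix(width, drift):
    """Sketch: a 3x3 affine shear that removes vertical drift across
    a panorama of the given width. Each column x is shifted by
    -drift * x / width, so the accumulated drift cancels at the seam."""
    A = np.eye(3)
    A[1, 0] = -drift / width   # y' = y - (drift / width) * x
    return A
```

Applied to homogeneous points, a pixel at the far edge that had drifted down by `drift` lands back at y = 0, which is exactly the geometric correction described above.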