Feature Detection, Description and Matching
Project Overview
The goal of this project was to detect discriminating features in an image and find the best-matching features in other images. The detected features are designed to be invariant to basic transformations, including translation, rotation, changes in illumination, and scale.
Feature Detection
It can be rather difficult to describe qualitatively what makes a feature worthy of detection in a computer vision project. One of the simplest methods is corner detection. Most grayscale photos can be thought of as a combination of uniform patches, lines and corners. Of those three features, the corner is by far the most detectable and unique. In a uniform patch, the pixels adjacent to a given pixel do not vary much. Along a line, the variation occurs in only one direction. With a corner, however, shifting the feature in any direction produces a change in the intensity of the adjacent pixels. Selecting the pixels that present the maximum change when shifted in any direction is the basis of a very basic feature detection scheme.
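This "shift and compare" intuition is commonly formalized (standard background for Harris-style detectors, stated here for reference rather than taken from the project code) as an auto-correlation error that measures how much a weighted window W changes when it is shifted by (u, v):

E(u, v) = \sum_{(x,y) \in W} w(x,y) \, \bigl[ I(x+u,\, y+v) - I(x,y) \bigr]^2

For a uniform patch E stays small for every shift, for a line it stays small along the line direction, and for a corner it grows in every direction, which is exactly the property the detector looks for.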
To identify points of interest in the image, I use the Harris corner detection method. That is to say, for each point in the image I consider a window of pixels around that point. I then compute the Harris matrix H for the point by summing the products of the image gradients (computed with a Sobel filter) over that window, weighted by a Gaussian filter. The Gaussian weights were chosen to be circularly symmetric so that the measure is invariant to rotation. To find interest points, I define a corner strength function c(H) = determinant(H)/trace(H) and normalize by the global maximum in the photo. Once this corner strength value has been computed for every pixel, I choose points where c is above a threshold and where c is a local maximum among the surrounding pixels.
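Written out in standard notation (this restates the description above rather than quoting project code), the Harris matrix at a pixel p and the corner strength are

H = \sum_{(x,y) \in W(p)} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}, \qquad c(H) = \frac{\det(H)}{\operatorname{trace}(H)} = \frac{\lambda_1 \lambda_2}{\lambda_1 + \lambda_2},

where I_x and I_y are the Sobel gradients, w is the Gaussian weight, and \lambda_1, \lambda_2 are the eigenvalues of H. c(H) is large only when both eigenvalues are large, i.e., when the intensity changes in every shift direction, which is the corner case described above.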
Feature Detector Overview:
void ComputeHarrisFeatures(CFloatImage &image, FeatureSet &features)
{
    // Create grayscale image used for Harris detection
    CFloatImage grayImage = ConvertToGray(image);

    // Create image to store Harris values
    CFloatImage harrisImage(image.Shape().width, image.Shape().height, 1);

    // Create image to store local maximum harris values as 1, other pixels 0
    CByteImage harrisMaxImage(image.Shape().width, image.Shape().height, 1);

    // computeHarrisValues puts a Harris value at each pixel position in harrisImage
    computeHarrisValues(grayImage, harrisImage);

    // Threshold the harris image and compute local maxima
    computeLocalMaxima(harrisImage, harrisMaxImage);

    // Print out the harris image for debugging purposes
    CByteImage tmp(harrisImage.Shape());
    convertToByteImage(harrisImage, tmp);
    WriteFile(tmp, "harris.tga");

    // ...will then proceed to describe each feature (see feature descriptor)
}
Computing Harris Values:
/* Loop through the image to compute the harris corner values
   srcImage: grayscale of original image
   harrisImage: populate the harris values per pixel in this image */
void computeHarrisValues(CFloatImage &srcImage, CFloatImage &harrisImage)
{
    // Everything up to calculating c(H) is done in this method.
    // harrisImage is a 1-channel (grayscale) image; each pixel is populated with its c(H) value.
    int w = srcImage.Shape().width;
    int h = srcImage.Shape().height;

    // First compute Ix, Iy and the 2x2 matrix comprised of them
    CFloatImage gradientMatrices(srcImage.Shape().width, srcImage.Shape().height, 4);
    computeGradientMatrices(srcImage, gradientMatrices);

    float c_H_max = 0;

    // For every pixel (not at the border)
    for (int y = 2; y < h - 2; y++) {
        for (int x = 2; x < w - 2; x++) {
            // Use an n x n window around the pixel: multiply each 2x2 matrix from
            // gradientMatrices in that window by the corresponding (by index) weight
            // from the gaussian filter, then sum over all matrices in the window.
            float H_matrix[4] = { 0, 0, 0, 0 };

            // Assume the gaussian window is 5x5
            int startY = y - 2;
            int startX = x - 2;
            for (int i = startY; i < y + 3; i++) {
                for (int j = startX; j < x + 3; j++) {
                    int matchingGaussianIndex = 5 * (i - y + 2) + (j - x + 2);
                    H_matrix[0] += gradientMatrices.Pixel(j, i, 0) * gaussian5x5[matchingGaussianIndex];
                    H_matrix[1] += gradientMatrices.Pixel(j, i, 1) * gaussian5x5[matchingGaussianIndex];
                    H_matrix[2] += gradientMatrices.Pixel(j, i, 2) * gaussian5x5[matchingGaussianIndex];
                    H_matrix[3] += gradientMatrices.Pixel(j, i, 3) * gaussian5x5[matchingGaussianIndex];
                }
            }

            // Calculate c(H) = det(H)/trace(H); guard against division by zero in flat regions
            float trace = H_matrix[0] + H_matrix[3];
            float c_H = (trace != 0) ? (H_matrix[0] * H_matrix[3] - H_matrix[2] * H_matrix[1]) / trace : 0;
            harrisImage.Pixel(x, y, 0) = c_H;

            // Keep track of the maximum c_H to help normalize later
            if (c_H > c_H_max) {
                c_H_max = c_H;
            }
        }
    }

    // Normalize the intensity
    for (int y = 2; y < h - 2; y++) {
        for (int x = 2; x < w - 2; x++) {
            harrisImage.Pixel(x, y, 0) = harrisImage.Pixel(x, y, 0) / c_H_max;
        }
    }
}
/* Computes Ix^2, IxIy; IyIx, Iy^2 ... the intermediate gradient products used in defining H */
void computeGradientMatrices(CFloatImage &srcImage, CFloatImage &gradientMatrices)
{
    int w = srcImage.Shape().width;
    int h = srcImage.Shape().height;

    // For every pixel that's not on the border
    for (int y = 2; y < h - 2; y++) {
        for (int x = 2; x < w - 2; x++) {
            // Convolve a 5x5 box centered on the pixel with the Sobel kernels
            float Ix = 0;
            float Iy = 0;
            int startY = y - 2;
            int startX = x - 2;
            for (int i = startY; i < y + 3; i++) {
                for (int j = startX; j < x + 3; j++) {
                    int matchingSobelIndex = 5 * (i - y + 2) + (j - x + 2);
                    Ix += srcImage.Pixel(j, i, 0) * sobelX[matchingSobelIndex];
                    Iy += srcImage.Pixel(j, i, 0) * sobelY[matchingSobelIndex];
                }
            }
            gradientMatrices.Pixel(x, y, 0) = Ix * Ix;
            gradientMatrices.Pixel(x, y, 1) = Ix * Iy;
            gradientMatrices.Pixel(x, y, 2) = Iy * Ix;
            gradientMatrices.Pixel(x, y, 3) = Iy * Iy;
        }
    }
}
Computing Local Maxima:
/* Loop through the harrisImage to threshold and compute the local maxima in a neighborhood
   srcImage: image with Harris values
   destImage: assign 1 to a pixel if it is above a threshold and is the local maximum in a 3x3 window, 0 otherwise. */
void computeLocalMaxima(CFloatImage &srcImage, CByteImage &destImage)
{
    // Initialize every output pixel to 0
    for (int y = 0; y < destImage.Shape().height; y++) {
        for (int x = 0; x < destImage.Shape().width; x++) {
            destImage.Pixel(x, y, 0) = 0;
        }
    }

    float strengthThreshold = 0.017;
    int w = srcImage.Shape().width;
    int h = srcImage.Shape().height;
    float currFeatureStrength = 0;
    float featureStrengthLocalMax = 0;
    float comparedPixelFeatureStrength = 0;

    /* By excluding features within 15 pixels of the edges, we avoid boundary conditions
       with the rotated 20x20 grid used in interpolation. */
    // For every pixel (not within 15 pixels of the border)
    for (int y = 15; y < h - 15; y++) {
        for (int x = 15; x < w - 15; x++) {
            // Check the corner strength value of that pixel;
            // if it is above the threshold, check whether it is a local max in a 3x3 window
            currFeatureStrength = srcImage.Pixel(x, y, 0);
            if (currFeatureStrength >= strengthThreshold) {
                // Find the strongest neighbor to determine if (x, y) is the local max
                featureStrengthLocalMax = 0;
                for (int i = x - 1; i < x + 2; i++) {
                    for (int j = y - 1; j < y + 2; j++) {
                        if (i == x && j == y) {
                            // Skip the center pixel itself
                        }
                        else {
                            comparedPixelFeatureStrength = srcImage.Pixel(i, j, 0);
                            if (comparedPixelFeatureStrength > featureStrengthLocalMax) {
                                featureStrengthLocalMax = comparedPixelFeatureStrength;
                            }
                        }
                    }
                }

                // If (x, y) is a local max, set that pixel to 1 and all adjacent pixels to 0
                if (currFeatureStrength > featureStrengthLocalMax) {
                    for (int i = x - 1; i < x + 2; i++) {
                        for (int j = y - 1; j < y + 2; j++) {
                            if (i == x && j == y) {
                                destImage.Pixel(x, y, 0) = 1;
                            }
                            else {
                                destImage.Pixel(i, j, 0) = 0;
                            }
                        }
                    }
                }
                // Possible optimization: skip the rest of the column if the feature is a local max
            }
        }
    }
}
Performance:
To demonstrate the operation of the feature detector, two example images have been included along with their detected features. The feature pixels can be hard to see; you may need to turn up the brightness on your display to discern them.
Feature Description (Using Image Pyramids)
After unique features have been detected, they must be described thoroughly and efficiently so that a matching algorithm can easily compare the descriptions of two features from different images. A good feature descriptor must account for changes in position, orientation and illumination.
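As an illustration only (this is not the descriptor used in this project, which is linked below), a minimal patch descriptor could sample a small window around a feature and normalize it for bias and gain, which gives some robustness to illumination but none to rotation or scale. The function name describePatchSketch is hypothetical; the sketch assumes the project's CFloatImage interface seen in the detector code above.

#include <cmath>
#include <vector>

/* Illustrative sketch only -- not this project's descriptor.
   Samples a 5x5 intensity patch around a feature from the grayscale image and
   normalizes it for bias (mean) and gain (standard deviation). */
std::vector<float> describePatchSketch(CFloatImage &grayImage, int fx, int fy)
{
    std::vector<float> desc;
    float sum = 0;
    for (int y = fy - 2; y <= fy + 2; y++) {
        for (int x = fx - 2; x <= fx + 2; x++) {
            float v = grayImage.Pixel(x, y, 0);
            desc.push_back(v);
            sum += v;
        }
    }
    float mean = sum / desc.size();
    float var = 0;
    for (size_t i = 0; i < desc.size(); i++) {
        desc[i] -= mean;                      // bias (brightness) normalization
        var += desc[i] * desc[i];
    }
    float stdev = std::sqrt(var / desc.size());
    if (stdev > 0) {
        for (size_t i = 0; i < desc.size(); i++) {
            desc[i] /= stdev;                 // gain (contrast) normalization
        }
    }
    return desc;
}

Handling rotation and scale requires more machinery (for example, orienting the patch and sampling it from an image pyramid, as the section heading suggests); see the actual descriptor linked below.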
See my Feature Descriptor Here
Feature Matching
Once the noteworthy features in an image have been described, it is straightforward to compare those descriptions with the features from other images and to define a match strength between them.
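As a sketch of what such a comparison can look like (a generic SSD distance with a ratio test, not the matching code linked below; descriptors are treated as plain std::vector<float> values):

#include <limits>
#include <vector>

// Illustrative sketch, not this project's matching code.
// Distance between two equal-length descriptors: sum of squared differences (SSD).
float ssdDistance(const std::vector<float> &a, const std::vector<float> &b)
{
    float ssd = 0;
    for (size_t i = 0; i < a.size(); i++) {
        float d = a[i] - b[i];
        ssd += d * d;
    }
    return ssd;
}

// For one descriptor from the first image, find the index of its best match among the
// candidate descriptors from the second image. The match strength is reported through
// the ratio bestSSD / secondBestSSD: small ratios mean a distinctive match, while a
// ratio near 1 means the best match is barely better than the runner-up (ambiguous).
int bestMatchSketch(const std::vector<float> &query,
                    const std::vector< std::vector<float> > &candidates, float &ratio)
{
    int bestIndex = -1;
    float best = std::numeric_limits<float>::max();
    float secondBest = std::numeric_limits<float>::max();
    for (size_t i = 0; i < candidates.size(); i++) {
        float d = ssdDistance(query, candidates[i]);
        if (d < best)            { secondBest = best; best = d; bestIndex = (int)i; }
        else if (d < secondBest) { secondBest = d; }
    }
    ratio = (secondBest > 0 && secondBest != std::numeric_limits<float>::max())
                ? best / secondBest : 0;
    return bestIndex;
}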
See my Feature Matching Here