6.869 Advances in Computer Vision: Learning and Interfaces
Problem Set 1
Assigned: 02/15/2005
Due: 02/24/2005

Problem 1 Camera Calibration

Figure 1: Input image and detected features. (a) Original image. (b) Detected features.

The goal of this problem is to implement a linear calibration algorithm in MATLAB based on the method described in Section 3.2 of Forsyth and Ponce. Given an input image, we want to extract the intrinsic (focal length and center of image) and extrinsic (rotation and translation) parameters of the camera used to grab this image. We assume no radial distortion.

A typical way to calibrate a camera is to take a picture of a calibration object, find 2D features in the picture, and derive the calibration from the 2D features and their corresponding positions in 3D. In our case, we use a 2m-wide cube textured with a checkerboard pattern as the calibration object (see Figure 1(a)). We search the image (of size 600x600) for the 2D features corresponding to corners of the checkerboard; Figure 1(b) shows these features. Since we know the cube's exact size (2m), we can find the exact 3D position of each 2D feature relative to the center of the cube. This process of finding correspondences is simple but time consuming, so we did this part of the work for you: Features2D.mat and Features3D.mat contain the 2D corner features and the corresponding 3D positions.

(a) Your first task is to write a MATLAB function which takes these two lists as input and returns the calibration parameters as output. Your function should have the following syntax:

function [α, β, θ, u0, v0, R, t] = calibrate(f2D, f3D)

where α and β are the horizontal and vertical scale factors of the camera CCD (in pixels), θ is the camera skew (in radians), u0 and v0 are the center of the image (in pixels), and [R t] is the relative rigid transformation between the center of the cube and the camera. Also compute the camera focal length, f, in meters. The size of the camera CCD is (1 x 1) square inches and its pixels are square (i.e. α = β).

(b) To check your solution, create the function

function f2D = pcamview(f3D, α, β, θ, u0, v0, R, t)

where f2D are the 2D projections of the points f3D, and the camera intrinsic and extrinsic parameters are as defined above. Using this function, project the provided 3D features onto the camera focal plane. Check your solution from part (a) by superimposing the projected points onto the original image checkercube.bmp.

Please submit the files calibration-last name.m and pcamview-last name.m by email to 6869-submit@csail.mit.edu. In your problem set solution, write the values (α, β, θ, f, u0, v0, R and t) you found using the cube data set.
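The assignment calls for MATLAB; purely as an illustration of the linear estimation at the heart of part (a), the direct linear transform (DLT) step of Forsyth and Ponce, Section 3.2, can be sketched in Python with numpy. The array names are hypothetical stand-ins for the contents of Features2D.mat and Features3D.mat, and the sketch stops at the 3x4 projection matrix (decomposing it into α, β, θ, u0, v0, R, t is the remaining step of the method).

```python
import numpy as np

def calibrate_dlt(f2D, f3D):
    """Estimate the 3x4 projection matrix M from n >= 6 noiseless 2D-3D
    correspondences by solving the homogeneous system P m = 0 with an SVD."""
    n = f2D.shape[0]
    P = np.zeros((2 * n, 12))
    for i in range(n):
        X = np.append(f3D[i], 1.0)       # homogeneous 3D point
        u, v = f2D[i]
        P[2 * i, 0:4] = X                # u = (m1 . X) / (m3 . X)
        P[2 * i, 8:12] = -u * X
        P[2 * i + 1, 4:8] = X            # v = (m2 . X) / (m3 . X)
        P[2 * i + 1, 8:12] = -v * X
    # The solution (up to scale) is the right singular vector of P with
    # the smallest singular value.
    _, _, Vt = np.linalg.svd(P)
    return Vt[-1].reshape(3, 4)

def pcamview(f3D, M):
    """Project 3D points through M and return 2D pixel coordinates."""
    Xh = np.hstack([f3D, np.ones((f3D.shape[0], 1))])
    x = (M @ Xh.T).T
    return x[:, :2] / x[:, 2:3]
```

A quick self-check, in the spirit of part (b), is to reproject the 3D features through the estimated matrix and superimpose them on the input points.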
Also include the result image of part (b) and a printout of your code.

Problem 2 Image Pyramids

This problem uses pyramid image processing. Download and install the matlabPyrTools from http://www.cns.nyu.edu/~eero/software.html. When forming pyramid decompositions for this problem, you may always use the default decomposition filters. For this problem, submit your MATLAB code and include a printout.

Subjectively, our visual world appears to us to be high resolution everywhere. However, we have much higher spatial resolution in the center of our field of view than in the periphery. In this problem, we will synthesize an image approximating our visual resolution as a function of eccentricity.

Figure 2: (a) Measures of acuity. (b) Plot of eccentricity versus acuity.

Figure 2 shows a plot of the minimum resolvable angle as a function of the visual eccentricity. The visual eccentricity is measured in degrees away from the center of fixation. (From Rodieck, "The First Steps in Seeing", Sinauer, 1998.)

Approximate acuity, a, in minutes of arc (60 minutes to a degree) as a function of eccentricity, e, in degrees, by the expression

a = 0.23 e + 0.7.   (1)
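Equation (1) is easy to tabulate. The small Python helpers below (hypothetical names, not part of the assignment) evaluate it and convert the acuity limit to radians, the form needed when relating it to pixel spacing at a given viewing distance:

```python
import math

def acuity_arcmin(e_deg):
    """Minimum resolvable angle (minutes of arc) at eccentricity e (degrees),
    using the linear fit of equation (1): a = 0.23 e + 0.7."""
    return 0.23 * e_deg + 0.7

def acuity_rad(e_deg):
    """The same acuity limit in radians (60 arcmin to a degree)."""
    return math.radians(acuity_arcmin(e_deg) / 60.0)
```

For instance, the limit is 0.7 arcmin at fixation and 3.0 arcmin at 10 degrees of eccentricity, so effective resolution has already fallen by roughly a factor of four there.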
We will create an image with the effective spacing of the pixels equal to the angular size of the acuity limit. In the figure, that limit is defined as the white space between the two ends of a circle. Adjacent black, white, black pixels could approximately represent that circle opening if the pixel spacing were equal to the angular size of the acuity limit.

Assume that the image (or monitor) is square, and that you view it from a distance of two times the length of one side of the image. Where convenient, you may assume angles are small enough that tan(θ) ≈ θ.

(a) How many evenly spaced pixels per side does the image need to have in order that the highest resolution part of the image has one pixel per length of finest acuity? Assume that the highest resolution image point lies at half the maximum acuity, where maximum acuity is as specified by (1).

(b) Let the upper left corner of the image be (0, 0), and the right and bottom edges of the picture be at a distance 1 from this corner. Assume that the upper left corner is the center of fixation. What effective pixel spacing, as a function of these units, causes the pixel spacing to equal the spatial acuity for the corresponding eccentricity?

(c) We can approximate images of this resolution by using a Gaussian pyramid, which generates images at different numbers of pixel samples, dividing the number of pixels by two at each level of the Gaussian pyramid. Start from an image at the full resolution of part (a). Each pyramid level increases the effective size of its pixels by a factor of two in each dimension. As a function of the coordinate system used in (b), by how many factors of two should the resolution of the original image be reduced, as a function of position in the image, in order to simulate the human visual acuity, assuming the viewer stares at the upper left corner of the image?

(d) The expression in (c) involves fractional pyramid levels. We can visually approximate images at those intermediate resolution levels by linearly interpolating between our Gaussian pyramid levels. On the class web site is a 2000x2000 image, which should be more than enough pixels for you. Crop that image to the desired resolution such that the upper left corner will be at half the maximum visual acuity when viewed from 2 picture lengths away. Use the Gaussian pyramid to create an image that simulates the fall-off in visual acuity, assuming the fixation point is at the upper left corner. At any given pixel, determine the coefficients for interpolating between images by linearly interpolating the corresponding pixel dimensions.

Hint: You will want to use the upBlur function to transform the Gaussian pyramid levels to all have the same number of pixels. Assume that a pyramid level after upBlur has effectively the same number of pixels (in terms of picture content) as the original pyramid band before the upBlur operation. That is a reasonable approximation (take 6.341 for the details that we're glossing over here).
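The assignment asks for MATLAB with matlabPyrTools; purely to illustrate the per-pixel blending scheme of parts (c) and (d), here is a numpy sketch. `blur_downsample` and `upsample` are crude stand-ins for the pyrTools decomposition filters and `upBlur`, and `level_map` plays the role of the fractional pyramid level from part (c):

```python
import numpy as np

def blur_downsample(im):
    """One Gaussian-pyramid step: separable [1 4 6 4 1]/16 blur, then
    decimation by two (a stand-in for the default pyrTools filter)."""
    k = np.array([1., 4., 6., 4., 1.]) / 16.0
    p = np.pad(im, 2, mode='edge')
    v = sum(k[i] * p[i:i + im.shape[0], :] for i in range(5))   # vertical
    h = sum(k[j] * v[:, j:j + im.shape[1]] for j in range(5))   # horizontal
    return h[::2, ::2]

def upsample(im, shape):
    """Nearest-neighbour upsampling back to full resolution
    (a crude stand-in for upBlur)."""
    ry = (np.arange(shape[0]) * im.shape[0]) // shape[0]
    rx = (np.arange(shape[1]) * im.shape[1]) // shape[1]
    return im[np.ix_(ry, rx)]

def foveate(im, levels, level_map):
    """Per pixel, linearly interpolate between the two pyramid levels that
    bracket the (possibly fractional) level requested in level_map."""
    pyr = [im]
    for _ in range(levels - 1):
        pyr.append(blur_downsample(pyr[-1]))
    stack = np.stack([upsample(p, im.shape) for p in pyr])  # (levels, H, W)
    lm = np.clip(level_map, 0, levels - 1)
    lo = np.floor(lm).astype(int)
    hi = np.minimum(lo + 1, levels - 1)
    frac = lm - lo
    rows, cols = np.indices(im.shape)
    return (1 - frac) * stack[lo, rows, cols] + frac * stack[hi, rows, cols]
```

With a level map that grows with distance from the upper left corner (as derived in parts (b) and (c)), this produces the desired fall-off in resolution away from the fixation point.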
Problem 3 Texture Synthesis

In this problem you will implement the Efros and Leung algorithm for texture synthesis discussed in Section 9.3 of Forsyth and Ponce. In addition to reading the textbook, you may also find it helpful to visit Efros' texture synthesis web site, http://www.cs.berkeley.edu/~efros/research/synthesis.html, where many of the implementation details described below can be found.

As discussed in class, the Efros and Leung algorithm synthesizes a new texture by performing an exhaustive search of a source texture for each synthesized pixel in the target image, in which the sum of squared differences (SSD) is used to associate similar image patches in the source image with that of the target. The algorithm is initialized by randomly selecting a 3x3 patch from the source texture and placing it in the center of the target texture. The boundaries of this patch are then recursively filled until all pixels in the target image have been considered.

Implement the Efros and Leung algorithm as the following MATLAB function:

synthIm = SynthTexture(sample, w, s)

where sample is the source texture image, w is the width of the search window, and s = [ht wt] specifies the height and width of the target image synthIm. As described above, this algorithm will create a new target texture image, initialized with a 3x3 patch from the source image. It will then grow this patch to fill the entire image. As discussed in the textbook, when growing the image, the un-filled pixels along the boundary of the block of synthesized values are considered at each iteration of the algorithm. A useful technique for recovering the location of these pixels in MATLAB is dilation, a morphological operation that expands image regions (it performs the opposite function of the erode operation from the previous problem set). Use MATLAB's imdilate and find routines to recover the un-filled pixel locations along the boundary of the synthesized block in the target image.

In addition to the above function, we ask you to write a subroutine that, for a given pixel in the target image, returns a list of possible candidate matches in the source texture along with their corresponding SSD errors. We ask this function to have the following syntax:

[bestMatches, errors] = FindMatches(template, sample, G)

where bestMatches is the list of possible candidate matches with corresponding SSD errors specified by errors, template is the w x w image template associated with a pixel of the target image, sample is the source texture image, and G is a 2D Gaussian mask discussed below. This routine is called by SynthTexture, and a pixel value is randomly selected from bestMatches to synthesize a pixel of the target image. To form bestMatches, accept all pixel locations whose SSD error values are less than the minimum SSD value times (1 + ε). To avoid randomly selecting a match with unusually large error, also check that the error of the randomly selected match is below a threshold δ. Efros and Leung use threshold values of ε = 0.1 and δ = 0.3.

Note that template can have values that have not yet been filled in by the image growing routine. Mask the template image such that these values are not considered when computing the SSD. Efros and Leung suggest using the following image mask:

Mask = G .* validMask

where validMask is a square mask of width w that is 1 where template is filled and 0 otherwise, and G is a 2D zero-mean Gaussian with σ = w/6.4 sampled on a w x w grid centered about its mean. G can be pre-computed using MATLAB's fspecial routine. The purpose of the Gaussian is to down-weight pixels that are farther from the center of the template. Also, make sure to normalize the mask such that its elements sum to 1.

Test and run your implementation using the grayscale source texture image rings.jpg, with window widths of w = 5, 7, 13, s = [100 100], and an initial starting seed of (x, y) = (4, 32).
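As a language-agnostic illustration of the core of FindMatches (the assignment wants MATLAB), the following Python sketch shows the masked, Gaussian-weighted SSD and the (1 + ε) acceptance rule. The function names and the `valid` argument are hypothetical; the loop is a plain exhaustive search over all w x w source patches:

```python
import numpy as np

def find_matches(template, valid, sample, eps=0.1):
    """Efros-Leung candidate search. template is w x w; valid is 1 where
    template has been filled, 0 elsewhere. Returns the candidate center
    pixel values and their weighted SSD errors."""
    w = template.shape[0]
    # 2D Gaussian with sigma = w/6.4 on a w x w grid (what
    # fspecial('gaussian', w, w/6.4) builds in MATLAB).
    ax = np.arange(w) - (w - 1) / 2.0
    g = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * (w / 6.4) ** 2))
    mask = g * valid
    mask /= mask.sum()           # normalize so the weights sum to 1
    H, W = sample.shape
    errors = np.empty((H - w + 1, W - w + 1))
    for i in range(H - w + 1):
        for j in range(W - w + 1):
            patch = sample[i:i + w, j:j + w]
            errors[i, j] = np.sum(mask * (patch - template) ** 2)
    # Accept every location within (1 + eps) of the best SSD.
    thresh = errors.min() * (1 + eps)
    ii, jj = np.nonzero(errors <= thresh)
    centers = sample[ii + w // 2, jj + w // 2]
    return centers, errors[ii, jj]
```

SynthTexture would then draw one candidate at random and keep it only if its error is below the second threshold δ = 0.3.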
Explain the algorithm's performance with respect to window size. For a given window size, if you re-run the algorithm with the same starting seed, do you get the same result? Why or why not? Is this true for all window sizes?

Please include the synthesized textures that correspond to each window size, along with answers to the above questions and a printout of your code, in your writeup. Also, submit the files synthtexture-last name.m and findmatches-last name.m by email to 6869-submit@csail.mit.edu.

Problem 4 Color Representation and Matching

The MATLAB data file CIE.mat contains the spectra which will be needed for this problem. The vectors cx, cy, and cz give the CIE color matching functions for the X, Y, and Z color coordinates, respectively (these are also plotted in Figure 6.7 of Forsyth and Ponce). These functions are sampled at 5 nm intervals for wavelengths from 400 through 700 nm.
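Because the matching functions are sampled at 5 nm steps, the CIE integrals X = ∫ s(λ) x̄(λ) dλ (and likewise for Y and Z) reduce to dot products. A Python sketch, with the vector names taken from CIE.mat as described above and `spd` a hypothetical sampled power spectrum:

```python
import numpy as np

def xyz_from_spectrum(spd, cx, cy, cz, dlam=5.0):
    """Approximate the CIE tristimulus integrals by Riemann sums over the
    5 nm samples (dlam is the sampling interval in nm)."""
    X = dlam * np.dot(spd, cx)
    Y = dlam * np.dot(spd, cy)
    Z = dlam * np.dot(spd, cz)
    return X, Y, Z

def chromaticity(X, Y, Z):
    """Normalized CIE coordinates x = X/(X+Y+Z), y = Y/(X+Y+Z)."""
    s = X + Y + Z
    return X / s, Y / s
```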
(a) What are the CIE X, Y, and Z coordinates of the light corresponding to the power spectrum in the vector sA? What are the normalized CIE coordinates, x and y?

(b) Suppose you have a test color specified as a linear combination of primaries at 450, 550, and 650 nm, given by (a, b, c)^T, respectively. What linear combination of light sources at 460, 510, and 590 nm is needed to match the test color?

(c) We want to specify one possible spectral power distribution of the non-physical primary lights of the CIE color coordinate system, corresponding to the color matching functions cx, cy, cz. Suppose you insist that the spectral power of the CIE primaries be zero for all wavelengths except at 460, 510, and 640 nm. Specify the linear combination of lights at 460, 510, and 640 nm that corresponds to the primary lights of the CIE color coordinate system.
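Parts (b) and (c) both reduce to a 3x3 linear system: two lights match exactly when their XYZ tristimulus coordinates agree, so the unknown weights w of the replacement primaries solve C w = t, where column k of C holds the XYZ coordinates of unit power at the k-th primary wavelength and t is the target's XYZ vector. A minimal sketch of that step (function name hypothetical):

```python
import numpy as np

def match_weights(primaries_xyz, target_xyz):
    """Solve C w = t for the primary weights. primaries_xyz is 3x3 with one
    column of (X, Y, Z) per primary; target_xyz is the color to match."""
    return np.linalg.solve(primaries_xyz, target_xyz)
```

For part (b), for example, t would be a times the XYZ of the 450 nm primary plus b times that of 550 nm plus c times that of 650 nm, with the columns of C read off the matching functions at 460, 510, and 590 nm.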