In this project, you will implement a program to combine multiple
images into a panorama. The program has to automatically align the
input images by computing their relative motions and then blend
the resulting tile of images into a single seamless panorama.
Along the way, you will learn how to warp images into cylindrical
coordinates and compute translational motion between images using
a Gaussian pyramid. To start your project, you will be supplied
with some test images, skeleton code you can use as the basis
of your project, and a sample solution executable you can use to
compare with your program.
The requirements of the assignment are:
Calibrate the camera and take images. (5pts)
Warp each input image into cylindrical coordinates and
output them. (10pts)
Manually assign initial translation to each image.
Compute the alignment (translation) for each image pair using
pyramid-based Lucas-Kanade motion estimation. (20pts)
Stitch and crop the resulting aligned images (this includes
blending and re-distribution of accumulated errors). (20pts)
View the panorama with a perspective viewer. (5pts)
The Lucas-Kanade motion estimation can be unstable for
different reasons. Try to make it robust. Explain what you did,
why you did what you did, and show what difference it made.
(10pts)
The skeleton code is written in Processing and the executable should run on any Java 1.6 platform. It is strongly recommended that you develop
your program based on this skeleton code by filling in the empty
functions.
You should refer to this page for information about how to
set up required libraries on TUX or your own machine.
NOTE! The version of ini4j given to you for project 1 was out of date. Please upgrade the ini4j.jar, or fully unpack the libraries package again.
You can develop on other platforms, but please avoid any platform-dependent code. You are not
allowed to use OpenCV or any other vision library for this
project except for calibrating your camera. All the core
implementation must be done on your own.
Use these images to test your code. The camera parameters (focal
length, radial distortion parameters (k_1, k_2)) for these images
are included in demo.ini. The initial translation values
assigned manually for generating an example panorama are also written
in demo.ini. Compare your results with what the
example program generates to check your code.
Your code should produce the panorama on the top for the test
images.
The skeleton code uses mtj, a
powerful linear algebra library that will be useful for the other
assignments as well. Take a look at the full
reference manual.
The skeleton code consists of the following functions. The
sentences in bold face describe what you will have to do and
implement.
Take/Load Images (loadImage)
Use a digital camera to take images. Plan
before you take the images. To get good results, you should
take images in manual mode so that the aperture and shutter
speed do not change (also make sure these are set properly),
keep the zoom-level at one level, probably at the wide-end,
so that the focal length does not change across the images,
keep your arms tight so that there's only horizontal
rotation while you spin around and take multiple images (you can
also use a tripod for better results),
and try to overlap at least 40% between two consecutive
images.
Then, you will need to estimate the camera parameters, which should be
the same for all images if you do not change the zoom-level while
taking the images.
As explained in the second-week lecture, many free software packages exist
for non-linear camera calibration that will tell you all the
intrinsic and extrinsic camera parameters. This usually involves
taking multiple images of a known pattern, in most cases a
checkerboard pattern of known size, and establishing
correspondences for calibration. Try the calibration routines in
either Open
Computer Vision Library or Camera
Calibration Toolbox for Matlab. We recommend the latter as
we demonstrated in the lecture. Simply follow the example
described in "First
calibration example". A pdf for the calibration
checkerboard pattern can be found here.
Print out and use it (each square is 30mm by 30mm).
Note that mosaicing usually involves taking many wide-angle view
images (using the fully zoomed-out end of your lens) in which
radial distortion can be visible. For getting good results, you
will need to account for this radial distortion as explained in
the lecture.
Remember that we do have a digital camera in the lab you can check
out. Also, we have a tripod you can check out. A tripod will be
necessary to get good results. Contact the TA.
Warp Images into Cylindrical Coordinates
(warpCylindrical)
Warp each image into cylindrical coordinates. For this you
will have to,
derive the inverse cylindrical coordinate transformation
based on what you learned in the lecture (see the Cylindrical
Reprojection slide),
Make sure you understand that input image coordinates to
cylindrical coordinates transformation involves,
normalization of image coordinates such that the focal
length is 1,
normalized image coordinates to 3D cylindrical coordinates
of radius 1,
3D cylindrical coordinates to 2D cylindrical (stheta, sh) coordinates.
Note that the cylinder radius is set to the normalized focal
length which is 1.
For cylindrical projection, you want to do inverse
warping and thus have to compute the inverse process of the
above. Remember you need to account for radial distortion.
use the derived inverse transformation to do inverse
warping to fill-in the pixel values of the cylindrical coordinate
image (recall the inverse warping you did in Project 1),
when doing the inverse warping, you will need some sort of
interpolation; you can use the bilinear interpolation you used
in Project 1 or implement higher-order/sophisticated
interpolation and claim it for extra credit (explain what you
did).
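The inverse mapping described above can be sketched as follows. This is a minimal illustration, not the skeleton's code: the class and method names are hypothetical, the principal point is assumed to be the image center, the cylinder scale s is assumed equal to the focal length f in pixels, and radial distortion is assumed to follow the polynomial model 1 + k_1 r^2 + k_2 r^4 applied to normalized coordinates. Check these assumptions against the lecture slides before relying on it.

```java
/** Hypothetical helper; names and conventions are assumptions, not part of the skeleton. */
final class CylindricalInverseSketch {

    /**
     * Maps pixel (xc, yc) of the cylindrical image back to a sub-pixel
     * location {x, y} in the original image (inverse warping).
     */
    static double[] toSource(double xc, double yc, int width, int height,
                             double f, double k1, double k2) {
        double cx = 0.5 * (width - 1);   // assumed principal point: image center
        double cy = 0.5 * (height - 1);
        // 2D cylindrical coordinates -> angle and height on the unit cylinder
        double theta = (xc - cx) / f;
        double h = (yc - cy) / f;
        // point on the unit cylinder in 3D
        double px = Math.sin(theta), py = h, pz = Math.cos(theta);
        // perspective projection onto the normalized image plane (focal length 1)
        double xn = px / pz, yn = py / pz;
        // radial distortion (assumed polynomial model)
        double r2 = xn * xn + yn * yn;
        double d = 1.0 + k1 * r2 + k2 * r2 * r2;
        // back to pixel coordinates
        return new double[] { f * xn * d + cx, f * yn * d + cy };
    }

    /** Bilinear interpolation; returns 0 (treated as undefined) outside the image. */
    static float sampleBilinear(float[][] img, double x, double y) {
        int h = img.length, w = img[0].length;
        int x0 = (int) Math.floor(x), y0 = (int) Math.floor(y);
        if (x0 < 0 || y0 < 0 || x0 + 1 >= w || y0 + 1 >= h) return 0f;
        double tx = x - x0, ty = y - y0;
        double top = (1 - tx) * img[y0][x0] + tx * img[y0][x0 + 1];
        double bot = (1 - tx) * img[y0 + 1][x0] + tx * img[y0 + 1][x0 + 1];
        return (float) ((1 - ty) * top + ty * bot);
    }
}
```

For each pixel of the output cylindrical image, one would call toSource and then sampleBilinear on each color channel of the input image. Pixels that map outside the input stay black and should be ignored in later steps.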
Assign Initial Alignment to Each Image
Once you output the cylindrical projections of the input images as
images, you can view them using any image viewer, e.g. gimp, and
manually assign the initial translation between each pair of
consecutive images. This information is necessary for the next step of
automatically aligning the images. Write down the initial
translation values in a file for later use. Remember that the
initial translation in (x,y) coordinates (which is in fact in the
cylindrical coordinates because you have already warped them into
cylindrical coordinates and rotation on a cylinder is translation
in cylindrical coordinates) has to be assigned for each
image pair. A suggestion is to follow the format in demo.ini.
Compute the Alignment (Translation) for Each Image
Pair (lucasKanade)
Implement pyramid-based (coarse-to-fine) Lucas-Kanade
translational motion estimation. Note that the translation is
global for each individual image. In other words, you are
computing a single translation vector for each image to align it
to the previous image.
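As a rough illustration of the pyramid used by the coarse-to-fine scheme, the sketch below builds a Gaussian pyramid on a grayscale image stored as float[][]. The class name, the 5-tap binomial kernel (1, 4, 6, 4, 1)/16, and the clamped border handling are assumptions made for this example; the skeleton may use a different image type and filter.

```java
/** Hypothetical sketch; the skeleton's getPyramid may differ in types and filter. */
final class GaussianPyramidSketch {
    // 5-tap binomial approximation of a Gaussian; the weights sum to 1
    private static final float[] K = { 1 / 16f, 4 / 16f, 6 / 16f, 4 / 16f, 1 / 16f };

    /** Blurs with the separable kernel, then keeps every second pixel. */
    static float[][] reduce(float[][] img) {
        int h = img.length, w = img[0].length;
        float[][] tmp = new float[h][w];
        // horizontal pass at full resolution, clamping at the borders
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                float s = 0;
                for (int k = -2; k <= 2; k++) {
                    int xx = Math.min(w - 1, Math.max(0, x + k));
                    s += K[k + 2] * img[y][xx];
                }
                tmp[y][x] = s;
            }
        }
        // vertical pass evaluated only at the subsampled positions
        int h2 = (h + 1) / 2, w2 = (w + 1) / 2;
        float[][] out = new float[h2][w2];
        for (int y = 0; y < h2; y++) {
            for (int x = 0; x < w2; x++) {
                float s = 0;
                for (int k = -2; k <= 2; k++) {
                    int yy = Math.min(h - 1, Math.max(0, 2 * y + k));
                    s += K[k + 2] * tmp[yy][2 * x];
                }
                out[y][x] = s;
            }
        }
        return out;
    }

    /** Level 0 is the input image; each further level is half the size. */
    static float[][][] build(float[][] img, int levels) {
        float[][][] pyr = new float[levels][][];
        pyr[0] = img;
        for (int i = 1; i < levels; i++) pyr[i] = reduce(pyr[i - 1]);
        return pyr;
    }
}
```

Note that this simple version blurs undefined (black) pixels into their neighbors, so a validity mask may need to be carried through the pyramid as well.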
For this, you will need to fill in the functions
lucasKanade, lucasKanadeStep, and
getPyramid. getPyramid should construct a Gaussian pyramid with the
specified number of levels for the given image. lucasKanade should take in two images (two warped
cylindrical images), initial translation values, number of
iterations of Lucas-Kanade motion estimation at each pyramid
level, and number of levels for the Gaussian pyramid. Then, it must
call getPyramid to construct the Gaussian pyramid
of specified number of levels for each image,
loop through the pyramids from coarse to fine in which you
call lucasKanade_step for the specified number
of times (iterative Lucas-Kanade),
and update the translation vector accordingly (note that the
translation must be scaled from one pyramid level to another),
and finally output the translation vector for the given
image pair in the original (the finest) resolution.
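The coarse-to-fine loop above, in particular the scaling of the translation between levels, might look like the following sketch. The Step interface is a hypothetical stand-in for one Lucas-Kanade update at a given pyramid level (level 0 is assumed to be the finest), and a factor of 2 between levels is assumed.

```java
/** Hypothetical sketch of the coarse-to-fine driver; not the skeleton's API. */
final class CoarseToFineSketch {

    /** Stand-in for one Lucas-Kanade update at a pyramid level; returns the new {u, v}. */
    interface Step {
        double[] refine(int level, double u, double v);
    }

    /**
     * u0, v0: initial translation at full resolution.
     * levels: number of pyramid levels (level 0 = finest).
     * steps:  Lucas-Kanade iterations per level.
     */
    static double[] estimate(double u0, double v0, int levels, int steps, Step step) {
        // bring the full-resolution guess down to the coarsest level
        double scale = Math.pow(2, levels - 1);
        double u = u0 / scale, v = v0 / scale;
        for (int level = levels - 1; level >= 0; level--) {
            for (int i = 0; i < steps; i++) {
                double[] uv = step.refine(level, u, v);
                u = uv[0];
                v = uv[1];
            }
            // moving to the next finer level doubles the pixel displacement
            if (level > 0) {
                u *= 2;
                v *= 2;
            }
        }
        return new double[] { u, v };
    }
}
```

In an actual implementation, refine would warp the level's second image by (u, v), run one Lucas-Kanade step against the level's first image, and return (u + du, v + dv).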
You have to implement lucasKanade_step (which is called
from lucasKanade) such that it takes two images
image1, image2, and the initial translation vector (u,v) as input,
and computes an updated translation (u',v') = (u+du,v+dv) which
minimizes |image2(x+u',y+v')-image1(x,y)| over all x, y. Note that,
instead of evaluating the image derivatives (Ix, Iy, and It)
between image2(x+u,y+v) and image1(x,y), it should first create image2t, a
warp of image2 using the translation (u,v), then compute the
image derivatives between image2t(x,y) and image1(x,y). This can be
implemented by
translating the image to be aligned with the given
(current) translation vector,
computing the intensity gradients,
accumulating the 2x2 matrix and 2x1 vector,
solving the 2x2 system and updating the translation
estimate.
To get accurate results, don't forget to discard black pixels
(undefined pixels) that result from cylindrical projection.
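One way to sketch a single update step is shown below. It assumes grayscale images stored as double[][], that image2t has already been translated by the current (u, v), that a value of exactly 0 marks an undefined pixel, and that gradients are taken with central differences on image2t. These are illustrative choices, and the class name is hypothetical.

```java
/** Hypothetical sketch of one translational Lucas-Kanade update. */
final class LucasKanadeStepSketch {

    /** Returns {du, dv}; returns {0, 0} if the 2x2 system is close to singular. */
    static double[] step(double[][] image1, double[][] image2t) {
        int h = image1.length, w = image1[0].length;
        double sxx = 0, sxy = 0, syy = 0, sxt = 0, syt = 0;
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                double c = image2t[y][x];
                double l = image2t[y][x - 1], r = image2t[y][x + 1];
                double up = image2t[y - 1][x], dn = image2t[y + 1][x];
                // discard undefined (black) pixels, assumed to be stored as 0
                if (image1[y][x] == 0 || c == 0 || l == 0 || r == 0 || up == 0 || dn == 0) {
                    continue;
                }
                double ix = 0.5 * (r - l);      // spatial gradient in x
                double iy = 0.5 * (dn - up);    // spatial gradient in y
                double it = c - image1[y][x];   // temporal difference
                // accumulate the 2x2 matrix and the 2x1 vector
                sxx += ix * ix;
                sxy += ix * iy;
                syy += iy * iy;
                sxt += ix * it;
                syt += iy * it;
            }
        }
        // solve [sxx sxy; sxy syy] [du; dv] = -[sxt; syt] by Cramer's rule
        double det = sxx * syy - sxy * sxy;
        if (Math.abs(det) < 1e-12) return new double[] { 0, 0 };
        double du = (-sxt * syy + sxy * syt) / det;
        double dv = (-syt * sxx + sxy * sxt) / det;
        return new double[] { du, dv };
    }
}
```

The near-singular check is one small robustness measure. The determinant also indicates how well conditioned the estimate is, for example in image regions with little texture.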
(Within iterative Lucas-Kanade, you will frequently have to
update your image with the current translation vector. You can
use translateImage for this purpose if you find it
useful. Be sure you fully understand how it works before calling it.)
Stitch and Crop the Aligned Image (makePanorama)
In makePanorama, implement the following. From the
warped images and their relative displacements, figure out how
large the final stitched image will be and where each image sits in
absolute panorama coordinates (you'll need to successively apply
the estimated translation vectors). For this, it would be easier
if you have saved the computed translation vectors in a file in
the previous step. (The demo application outputs Lucas-Kanade-Output.ini to be read by this step.)
Then, resample each image to its final location and
blend it with its neighbors by calling
blendImages.
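The bookkeeping for absolute displacements and the panorama size could be sketched as follows. The class name is hypothetical, all warped images are assumed to have the same size, and rel[i] is assumed to hold the offset of image i+1 relative to image i. The sign convention in your own output file may be the opposite, so adapt accordingly.

```java
/** Hypothetical layout helper; the sign convention of the offsets is an assumption. */
final class PanoramaLayoutSketch {

    /**
     * rel[i] = {dx, dy}: position of image i+1 relative to image i.
     * Returns integer positions for all images, shifted so the minimum x and y are 0.
     */
    static int[][] absoluteOffsets(double[][] rel) {
        int n = rel.length + 1;
        double[] xs = new double[n], ys = new double[n];
        double minX = 0, minY = 0;
        // successively apply the relative translations
        for (int i = 1; i < n; i++) {
            xs[i] = xs[i - 1] + rel[i - 1][0];
            ys[i] = ys[i - 1] + rel[i - 1][1];
            minX = Math.min(minX, xs[i]);
            minY = Math.min(minY, ys[i]);
        }
        int[][] out = new int[n][2];
        for (int i = 0; i < n; i++) {
            out[i][0] = (int) Math.round(xs[i] - minX);
            out[i][1] = (int) Math.round(ys[i] - minY);
        }
        return out;
    }

    /** Returns {width, height} of a canvas that holds every image. */
    static int[] canvasSize(int[][] offsets, int imgW, int imgH) {
        int maxX = 0, maxY = 0;
        for (int[] o : offsets) {
            maxX = Math.max(maxX, o[0]);
            maxY = Math.max(maxY, o[1]);
        }
        return new int[] { maxX + imgW, maxY + imgH };
    }
}
```

Rounding to whole pixels is a simplification. Keeping the fractional part and resampling with interpolation gives cleaner seams.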
As in Project 1, first try simple linear blending and then, for
extra credit, try other blending functions or figure out some way
to compensate for exposure differences if they cause strong
artifacts.
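Simple linear (feathered) blending can be illustrated on a single scanline. In this sketch each image contributes with a weight that grows linearly with the distance to its nearest left or right edge, and the accumulated sum is divided by the total weight. The class and method names are hypothetical, and the skeleton's blendImages will have a different signature.

```java
/** Hypothetical one-row illustration of linear (feathered) blending. */
final class LinearBlendSketch {

    /** Weight grows linearly from 1 at the left/right edge toward the image center. */
    static float weight(int x, int width) {
        return Math.min(x + 1, width - x);
    }

    /**
     * rows[i] is one scanline of image i, placed at column offsets[i] of the output.
     * Returns the blended output row of length outWidth.
     */
    static float[] blendRows(float[][] rows, int[] offsets, int outWidth) {
        float[] acc = new float[outWidth];
        float[] wsum = new float[outWidth];
        for (int i = 0; i < rows.length; i++) {
            for (int x = 0; x < rows[i].length; x++) {
                int ox = x + offsets[i];
                if (ox < 0 || ox >= outWidth) continue;
                float w = weight(x, rows[i].length);
                acc[ox] += w * rows[i][x];
                wsum[ox] += w;
            }
        }
        // normalize by the accumulated weight wherever at least one image contributed
        for (int x = 0; x < outWidth; x++) {
            if (wsum[x] > 0) acc[x] /= wsum[x];
        }
        return acc;
    }
}
```

A full implementation would apply this per color channel over the whole image and give zero weight to undefined pixels.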
Finally, in makePanorama, crop the resulting image so that
the left and right edges join seamlessly. The horizontal extent can
be computed in the previous blending routine since the first image
occurs at both the left and right end of the stitched sequence
(draw the cut line somewhere in this image). Before cropping,
apply a linear warp to the mosaic to remove any vertical drift
between the first and last image. This warp, of the form y' = y +
ax, should transform the y coordinates of the mosaic such that the
first image has the same y-coordinate on both the left and right
end. Calculate the value of 'a' needed to perform this
transformation. Once you have 'a', do inverse warping from the
final image size (width should be the same as you had after
blending the images and the height should be the height of the
first column) to compute the final panorama. You can also try
other alternatives explained in the class. Remember to mention
what you've done in a README file as well as the html page you
make for artifacts.
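The drift correction y' = y + ax can be sketched as below. If the first image appears at (x0, y0) on the left and again at (x1, y1) on the right, requiring y0 + a x0 = y1 + a x1 gives a = (y0 - y1) / (x1 - x0). The class name is hypothetical, the mosaic is assumed to be a grayscale float[][], and only vertical linear interpolation is used since the warp leaves x unchanged.

```java
/** Hypothetical sketch of the linear drift correction y' = y + a*x. */
final class DriftCorrectionSketch {

    /** Slope that makes (x0, y0) and (x1, y1) land on the same output row. */
    static double slope(double x0, double y0, double x1, double y1) {
        return (y0 - y1) / (x1 - x0);
    }

    /** Inverse warping: each output pixel (x, y') reads the mosaic at y = y' - a*x. */
    static float[][] unshear(float[][] mosaic, double a, int outHeight) {
        int h = mosaic.length, w = mosaic[0].length;
        float[][] out = new float[outHeight][w];
        for (int yo = 0; yo < outHeight; yo++) {
            for (int x = 0; x < w; x++) {
                double ys = yo - a * x;
                int y0 = (int) Math.floor(ys);
                if (y0 < 0 || y0 + 1 >= h) continue;   // outside the mosaic: leave black
                double t = ys - y0;
                out[yo][x] = (float) ((1 - t) * mosaic[y0][x] + t * mosaic[y0 + 1][x]);
            }
        }
        return out;
    }
}
```

For example, with the first image at (0, 10) on the left and at (100, 30) on the right, slope returns -0.2, and content at row 30 in column 100 ends up on row 10.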
View the Panorama (ViewPanorama)
Download ShowPanorama, a Processing panorama viewer, and use it to view your panorama.
The minimal settings in an HTML file would be:
View your panorama in a web page (use it in your result web
page).
Don't forget to put the jar in the same directory. The code is here if you're interested.
Here's an example panorama computed from the test images.
Main (draw)
Implement draw such that the final executable reads an ini file and executes the full process of creating a panorama.
For example, the following ini file will generate the example panorama:
[Camera Parameters]
; These are the parameters for the camera that took the demo images
focal_length = 717
radial_distortion = -0.06462
radial_distortion = -0.38987
[Images]
; List the images here like
; image = frame01.png
; image = frame02.png
image = DSC00097.png
image = DSC00098.png
image = DSC00099.png
image = DSC00100.png
image = DSC00101.png
image = DSC00102.png
image = DSC00103.png
image = DSC00104.png
image = DSC00105.png
image = DSC00106.png
image = DSC00107.png
image = DSC00108.png
image = DSC00109.png
image = DSC00110.png
image = DSC00111.png
[Lucas-Kanade]
; Lucas-Kanade options
levels = 3
steps = 4
[Panorama]
; List the pairs here in the format
; pair = [image id] [image id] [horizontal offset] [vertical offset]
; for example:
; pair = 0 14 180 10
; pair = 14 13 120 -5
; ....
; pair = 01 00 200 0
;
; Remember that Lucas-Kanade estimates the movement of the 1st image
pair = 00 14 240 0
pair = 14 13 300 0
pair = 13 12 368 0
pair = 12 11 316 0
pair = 11 10 324 0
pair = 10 09 297 0
pair = 09 08 300 0
pair = 08 07 335 0
pair = 07 06 306 0
pair = 06 05 296 0
pair = 05 04 268 0
pair = 04 03 276 0
pair = 03 02 277 0
pair = 02 01 252 0
pair = 01 00 232 0
Specifically, these parts should operate in order:
Cylindrical warping: set the [Camera Parameters] section and [Images] section
Note that focal_length has to be in pixels.
Each file in the [Images] section should be warped and a prefix (for example "WARPED_") should be prepended to the filename.
Aligning an image pair: set the [Panorama] section and [Lucas-Kanade] section
In the Panorama section you specify the ordered list of image pairs and their relative movement.
The first value is the first image index (from the [Images] list).
The second value is the second image index. Remember that the Lucas-Kanade algorithm estimates the movement of the first image. This is why the image pairs are specified in reverse order.
The last two values are the horizontal and vertical movement guesses (for the warped images).
Note: to test your algorithm, you should not supply precise guesses.
In the [Lucas-Kanade] section you specify the number of levels in the Gaussian pyramid, and the number of iterations to run at each level.
Note: to test your algorithm, you should give a high value for the number of iterations. The algorithm should converge.
Run the previous step for all adjacent pairs of images and
save the output into a separate file that has the same format as the [Panorama] section of your ini file.
The sample application creates a file called "Lucas-Kanade-Output.ini".
The only difference will be that the translation values are those generated by the algorithm.
Stitch and crop to make a panorama:
The program must read in the "Lucas-Kanade-Output.ini" file and blend the images together, correct for any drifting errors, and crop the image so that the left and right edges align. It should then save the output.
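Once the ini library has returned the value of a pair entry as a string (for example "00 14 240 0"), splitting it into its four fields could look like the sketch below. The PairEntry class is hypothetical and independent of ini4j. The offsets are parsed as doubles on the assumption that the Lucas-Kanade output may contain fractional values.

```java
/** Hypothetical container for one "pair = a b dx dy" entry. */
final class PairEntry {
    final int first;    // index of the image whose movement is estimated
    final int second;   // index of the reference image
    final double dx;    // horizontal offset
    final double dy;    // vertical offset

    PairEntry(int first, int second, double dx, double dy) {
        this.first = first;
        this.second = second;
        this.dx = dx;
        this.dy = dy;
    }

    /** Parses the value part of a pair line, e.g. "00 14 240 0". */
    static PairEntry parse(String value) {
        String[] t = value.trim().split("\\s+");
        if (t.length != 4) {
            throw new IllegalArgumentException("expected 4 fields, got " + t.length + ": " + value);
        }
        // Integer.parseInt reads "09" as decimal 9, so zero-padded ids are fine
        return new PairEntry(Integer.parseInt(t[0]), Integer.parseInt(t[1]),
                Double.parseDouble(t[2]), Double.parseDouble(t[3]));
    }
}
```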
Please email a link to your final webpage before the deadline. The website must include the following:
These links:
The main pde file (Mosaicing.pde).
If you added files, explain and provide links to the additional files.
A link to the 'data' directory which should contain your images and your various ini files. (Be sure the directory will list or provide a link to a zip.)
At least three panoramas: one should be a panorama
computed from the provided test images. At least one of the
rest has to be taken with a hand-held camera. You can submit
as many panoramas as you want. Please specify which you
liked the best. For each example, show the original images
and the resulting panorama as well as the panorama within
the panorama viewer as explained above. Explain your
artifact in text beneath each example. Each image on the
web page should be shown at relatively low resolution and
linked to the full-resolution image.
A short description of what worked well and what
didn't. If you tried several variants or did something
non-standard, please describe this as well. If you did the
items in the extra credit list, clearly state which one you
did and how you did it.
Correct brightness change. Unless you have perfect
control over the exposure setting of your camera, the input
images can have different exposure resulting in brightness
fluctuation in the final mosaic. Come up with a way to get rid
of this and implement it. Demonstrate it on input images with
exposure differences and explain in detail how you got rid of
it. (5pts)
Cube or spherical panoramas. Use a cube or a sphere
instead of a cylinder. Note that you'll need to derive the right
mapping. (5pts each)
Panorama with moving object without ghosting. Take
input images of a scene with moving objects such as cars. Moving
objects will appear multiple times at different locations in the
images and will cause significant ghosting in the final mosaic.
Try to get rid of this. Explain in detail how you got rid of it.
(5pts)
Implement Laplacian Blending. Read Burt and Adelson's paper and
implement it. This shouldn't be as difficult as it sounds once
you have the Gaussian pyramid code running. (5pts)