Friday, March 30, 2012

StereoVision Discussion

Please post non-urgent questions and discussion of the StereoVision project here.
The project spec has been posted on the main course website.

36 comments:

  1. I see that the due date is listed as Friday, April 14th, but the 14th is actually a Saturday. I assume you meant Friday the 13th?

  2. In the spec, I am a little confused about the return type of the toImage() method in the DepthMap class. Does it return just a BufferedImage, or some other type associated with BufferedImage?

    Replies
    1. Yes, the deadline is lucky Friday the 13th.
      You are strongly encouraged to submit early...

      The signature for toImage is just
      BufferedImage toImage();
      The spec looks pretty clear about it to me.
      Is the confusion that I said "methods" plural?
      That should probably be changed, since I only described one method.

  3. Should the DepthMap and Eye objects be JPanels themselves, or should we just create JPanels in the MasterFrame and have them display the information from the DepthMap and Eye objects?

    Replies
    1. I was modeling the problem more like the recent lab where you created a JPanel subclass that could handle BufferedImage objects. I thought I would then override the paintComponent method in the class and use the graphics object passed to it to draw the image (again, like in the lab).

      tldr:
      - Make a class that extends JPanel that takes images and draws them.
      - Have your other classes deal with generating the images and then pass them to the JPanel.
      - Add the JPanel to your JFrame.

      This is just how I was thinking of dealing with the problem, so take it with a grain of salt.
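
      In case it helps, here's a minimal sketch of the kind of panel I mean (the class name and the setImage method are just my own choices, not anything required by the spec):

      import java.awt.Graphics;
      import java.awt.image.BufferedImage;
      import javax.swing.JPanel;

      public class ImagePanel extends JPanel {
          private BufferedImage image;

          public void setImage(BufferedImage image) {
              this.image = image;
              repaint();  // ask Swing to redraw the panel with the new image
          }

          @Override
          protected void paintComponent(Graphics g) {
              super.paintComponent(g);  // let JPanel clear/paint its background first
              if (image != null) {
                  g.drawImage(image, 0, 0, null);  // draw the image at the top-left corner
              }
          }
      }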

    2. Yeah, that's how I was setting it up; I just wasn't sure if we were supposed to do it a specific way or just do it however we wanted.

    3. There does not seem to be any requirement for the DepthMap or Eye classes to be extensions of JPanel. The only place JPanel is mentioned in the spec is:

      "MasterFrame will contain 4 JPanels displaying: (top left) the base image,
      (top right) the depthmap, rendered as a grayscale image, (bottom left) the left-eye
      view of the scene, and (bottom right) the right-eye view of the scene."

      So there are 4 JPanel objects, and they somehow display the required images. How you do that is up to you (I believe), although there might be better or worse ways to do it.
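
      For what it's worth, inside the MasterFrame constructor I'd expect something along these lines (the panel names are mine, and GridLayout is just one option):

      setLayout(new GridLayout(2, 2));   // java.awt.GridLayout fills left-to-right, top-to-bottom
      add(baseImagePanel);               // top left: the base image
      add(depthMapPanel);                // top right: the depth map rendered as grayscale
      add(leftEyePanel);                 // bottom left: the left-eye view
      add(rightEyePanel);                // bottom right: the right-eye view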

    4. As Nathan said, you can do this as you like, but I didn't intend for Eye to be a kind of JPanel.

  4. As far as z-buffering goes, is there any reason not to keep track of the hierarchy of pixels sorted by depth in the DepthMap? It was recommended that: "You will want to make a helper class for keeping track of your pixels sorted from back to front by depth." However, it does not seem like the relative depths should ever change (since "each pixel is considered to lie in one of a family of parallel planes, each facing the user, but at different 'depths'."), so I do not understand the use of this recommended class; it seems to make sense to store the pixel depth ordering in the DepthMap itself.

    Replies
    1. This is a good question. You *could* keep track of this in your DepthMap. On the other hand, this should only be done if you think of it as the defining functionality of a DepthMap. In other words, is this something you think *every* DepthMap must do? Otherwise, the functionality should be separated into a different class, which could be a subclass of DepthMap or not, depending on whether you think that every ZBuffer should also be a DepthMap.

      Remember, if B extends A, then every B is an A. The point of having multiple classes is to SEPARATE functionality that it makes sense to think of separately.

  5. Is it necessary to pass the MasterFrame constructor the eyes, or could we just pass it an image and depthMap, and then create the eyes inside of the MasterFrame constructor?

    Replies
    1. In my interpretation, there is no prohibition on adding more constructors to MasterFrame, only a requirement that you have a constructor with the signature given by the spec. It will be public, however, so you have to anticipate other people being able to call the required constructor and have your program work correctly.

      You could even use the required constructor in your theoretical one, something like:

      public MasterFrame(BufferedImage image, DepthMap depthMap) {
          this(image, depthMap, new Eye(-3, 3), new Eye(3, 3));
      }

      If you need to mess with the Eye values further, you can then give your Eye class some methods that do this and call them after the "this()" statement or later on in the code (if that is what you were wanting).

    2. Re Max's question, Nathan's comment is right on target.

  6. To what degree is it acceptable to use code from the labs in our project?

    Replies
    1. If you wrote the code, it's ok to use it. If it's recycled, e.g. from a lab, you must say so in a Javadoc comment. For instance, a line like " * This class was originally written for the Feb. 15th lab (CarCraft)"

      More generally, if you got some of your ideas from other people (students, TA's, the discussion website, etc.), you should acknowledge this help with comments at the appropriate places in your code. If in doubt, it is better to err on the side of going into more detail about your sources, rather than failing to give them the credit they deserve.

  7. In the assignment it says: "Also provide a constructor that can produce a DepthMap from a grayscale BufferedImage."

    The way that this is phrased seems to imply that this is an overloaded constructor. I don't understand how the main/other constructor is supposed to work... would it just be passed an int array of depths?

    I think I may have missed the class where some of this was explicitly explained, but based on the conversation in the "Urgent" discussion thread, I presume that we don't actually have to generate the depth data?

    Either we pass it pre-generated depth data in the form of an array or a BufferedImage, in which case it will properly do the right/left eye shifting, or we pass it an arbitrary grayscale BufferedImage, which it will then erroneously pretend has valid depth data and end up doing a very weird right/left eye shift?

    Replies
    1. My interpretation of the "Also" was that the constructor requirement was in addition to the other requirements of the DepthMap class, not that there must be multiple constructors for DepthMap.

      I am working on the assumption that we do not have to generate a depth map image / information, and that we can use some preexisting image - depth map pairs. I don't, however, have any checks on what makes such a pair legit (nor do I know how to easily check this), so if you give the program garbage depth maps, there is no requirement that a shift based on this information should make any particular spatial sense.

      This is my interpretation of the spec and class discussion.

  8. Yet another question ;-).

    Spec says: "You can use one call to setRGB to write an array of grayscale values into a BufferedImage." Fantastic. Only problem is that the setRGB method requires a width and height as parameters. If the DepthMap wasn't constructed with a BufferedImage (in which case toImage() is trivial), where do we get these numbers?

    Unless I'm mistaken, all we know is that width*height == array.length. We could cheat a bit by using the width & height of the baseImage since this *should* be the same size, but that seems like the wrong way to do it.

    Replies
    1. I believe in class it was said that the depth map image should be the same size as the base image. I don't see anything explicit about it in the spec, though.

      As far as getting the width / height info into DepthMap when it is not constructed from a BufferedImage, you could add height and width parameters to the non-BufferedImage constructor. You will probably have to put checks on these dimensions (assuming the size of the depth map image needs to be the same as the base image) outside of the DepthMap, though.

      Also, I know you didn't ask about this, but toImage may be less trivial than you think, depending on how you handle people passing in BufferedImages that are not grayscale.
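
      For what it's worth, here's a rough sketch of the kind of toImage I have in mind, assuming the DepthMap keeps its width, height, and an int array of 0-255 depth values in fields (those names are mine, not from the spec):

      public BufferedImage toImage() {
          int[] packed = new int[depths.length];
          for (int i = 0; i < depths.length; i++) {
              int gray = 255 - depths[i];                      // depth 0 shows as white, max depth as black
              packed[i] = (gray << 16) | (gray << 8) | gray;   // same value in the R, G, and B bytes
          }
          BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
          img.setRGB(0, 0, width, height, packed, 0, width);   // the single setRGB call the spec mentions
          return img;
      }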

    2. Thanks for responding, Nathan.

      With regards to toImage(), the spec says "provide a constructor that can produce a DepthMap from a grayscale BufferedImage" so I think we can safely ignore people passing in colored BufferedImages.

      Even if this weren't the case, gray-scaling an image can be done pretty easily like this (maybe this will save someone some googling):

      BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
      Graphics g = image.getGraphics();
      g.drawImage(colorImage, 0, 0, null);
      g.dispose();

      On another note, when you construct a DepthMap array from the grayscale BufferedImage, are you just bit-shifting each pixel with the ~ operator? I'm pretty sure this is the correct thing to do, but when I test it by deliberately failing to invert the pixels back and displaying the inverted image it seems excessively black/dark. Maybe it's just my display?

    3. This comment has been removed by the author.

    4. This comment has been removed by the author.

    5. Don't quite understand the question, but the ~ operator is not a shifting operator; it is the bitwise NOT / complement operator. The bitwise shifts are: <<, >> (sign extended), >>> (not sign extended).

    6. Yeah, maybe I phrased it the wrong way. What the ~ operator does do is invert the colors by effectively subtracting each gray value from 255, which is necessary because white is supposed to correspond to depth 0 and black to max depth.
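
      In code, what I'm doing per pixel is roughly this (variable names are just illustrative):

      int argb  = depthImage.getRGB(x, y);
      int gray  = argb & 0xFF;       // grayscale image, so R, G, and B all hold this value
      int depth = 255 - gray;        // invert: white (255) becomes depth 0, black becomes max depth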

    7. Figured out the problem: I was storing everything as ints instead of bytes.

    8. Re Gabe's original question: a DepthMap object should be keeping track of its width and height. Any properly designed constructor must be receiving this information through its parameters. For instance, suppose we have a constructor with signature:
      public DepthMap(double[][] depthData);
      In this case, you could extract the width and height using depthData.length and depthData[0].length
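
      For instance (the field names here are just illustrative):

      public DepthMap(double[][] depthData) {
          this.width  = depthData.length;        // assuming depthData is indexed as [x][y]
          this.height = depthData[0].length;
          this.depths = depthData;
      }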

  9. If I don't follow the Deliverables section in the spec exactly, will I be counted off (like if I don't have a class called MasterFrame, for example)? My program would still do everything required of the assignment, of course. Also, can I use the ImageDisplay and ImagePanel classes from the last lab (with my own modifications, of course) in my program?

    Replies
    1. I'm not sure what you mean by "do everything required" but "not follow the spec exactly." That seems self-contradictory.

      Expect to lose points if you deviate from the spec, unless you have received permission from me in advance.

      See my comment in the "urgent" thread regarding code reuse from the lab assignments, and proper attributions.

  10. When I use the getRGB method on a BufferedImage and assign the value to an int, the int is a large negative number; any ideas why?

    Replies
    1. Yes, the highest 8 bits of the int store the "alpha channel" value. For this project, you want to just mask those out. For an explanation of what it means, check out:
      http://docs.oracle.com/javase/1.4.2/docs/api/java/awt/Color.html
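
      For example (variable names are just illustrative):

      int argb = image.getRGB(x, y);     // bits 31-24 hold alpha, then 8 bits each of red, green, blue
      int r = (argb >> 16) & 0xFF;
      int g = (argb >>  8) & 0xFF;
      int b =  argb        & 0xFF;       // masking with 0xFF throws away the alpha/sign bits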

  11. "Shifting the Eye" is giving me a headache! I'm not handling occluding yet, I just want to get shifting of appropriate pixels (based on a normalized greyscale value of the image's corresponding pixel in the depthmap) to work. Here's my approach, which seems logical, but isn't working:

    Use getRGB() to store the image's pixel data in an int array. Use getRGB() to store the depthmap image's pixel data in an int array. I then parse through the depthmap pixel data and use bitshifting and masking to sum the rgb values, then divide by three, which gives a greyscale value from 0 to 255. I use these to generate a new int array of pixel data which represents the eye's view.

    To access pixel (x,y) in the array, I use the formula y*width + x (this is from the getRGB api). So, a for loop which iterates from 0 to width, with a nested for loop which iterates from 0 to height, allows me to access the pixel in each index of the image array and the corresponding depthmap value for that pixel. This is done from inside the child loop by plugging the loop counter variables into the above formula, and using this value as the index of the int arrays.

    Here's where I'm running into problems. I have an initialized int array which will store shifted pixels. I then use setRGB to generate a shifted-eye BufferedImage. Inside the loop, I iterate through the image array to get its pixel and the greyscale value (depth). Now I use the depth value and the x and y offsets of the eye to calculate the "display location" of the current pixel based on the formula in the spec. This is the shifted index in the eye array which will be assigned the pixel data from the image array. So, assuming x and y offsets of 2 and a depth of 100, the pixel at (0,0) from the image array gets put in the eye array at the shifted index (y + (2/100))*w + (x + (2/100)). If this were working, dealing with occlusion seems trivial: simply sort the array of greyscale values from largest to smallest, tracking which indexes are mapped where, and then sort the image array indexes to match...

    What am I doing wrong!? The xOffset causes the image to shift left or right and wrap around. The yOffset just cuts off part of the image. I'm getting nonsensical results.

    Replies
    1. The wrapping around is happening because you aren't checking whether the shifted x-values are still in the range [0 to w]. Consequently, horizontal shifts that should be moving the pixel out of the viewing area are instead making it show up a line or two lower.

      If you have offsets of 2,2 and a pixel whose depth is 100, the shift is going to be 0.02 pixels, which should round down to no shift at all. (A depth value of 100 means really far away: 100 times as far as depth 1, which you should think of as the screen, with your retina at depth 0.) For things to look interesting, your depth values should probably be a lot smaller, say in the range 1.0 to 10.0. This will require a rescaling from the [0 to 255] range you get out of the pixel values in an 8-bit grayscale image.
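
      Inside your pixel loop, the rescale plus the range check might look roughly like this (depth255 is my name for the 0-to-255 value pulled out of the depth map):

      double depth = 1.0 + 9.0 * (depth255 / 255.0);       // rescale 0..255 into the range 1.0..10.0
      int newX = x + (int) Math.round(xOffset / depth);     // nearer pixels shift more, farther ones less
      if (newX < 0 || newX >= width) {
          continue;   // shifted off-screen: skip it rather than letting it wrap onto another row
      }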

      Hope this helps!

    2. Thanks for the prompt response! I didn't mention that I had some boundary checking in my post, but it only made sure the calculated index was in bounds. This fixed it, and thanks for the depth range tip for debugging. Gotta love the modulus operator...

  12. So I've got my shifts working for the most part; however, some of the pixels are shifting in seemingly random directions, as you can see in the following image:
    http://i.imgur.com/tjxwV.png

    I have two 2D int arrays where the pixel coordinates are [x][y]. One array contains the RGB data from the base image, and one contains the depth values.
    I've created a for loop that essentially does the following:
    for (int h = 255; h >= 1; h--) {
        for (int i = 0; i < width; i++) {
            for (int j = 0; j < height; j++) {
                // compare the depth value at [i][j] to h; if they are equal, shift the pixel at [i][j]
            }
        }
    }

    Is there any reason this loop would create some of the pixel scattering seen in the image?

    Replies
    1. I've got the same problem (but a much different implementation) when I render this image. Bump...

    2. @Brandon,

      You should avoid doing the triply-nested for loop, which will take 255*w*h iterations. Instead make a list of the pixels, sorted by their depths, and loop through this list once. This will only take w*h iterations, which will be a big savings, even if each iteration takes a little longer.
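
      Something along these lines (imports from java.util; all of the names here are mine, just to illustrate the single sort):

      List<int[]> pixels = new ArrayList<int[]>();          // each entry holds {x, y, depth}
      for (int x = 0; x < width; x++) {
          for (int y = 0; y < height; y++) {
              pixels.add(new int[] { x, y, depthAt[x][y] });
          }
      }
      Collections.sort(pixels, new Comparator<int[]>() {    // sort once, largest depth first
          public int compare(int[] a, int[] b) { return b[2] - a[2]; }
      });
      for (int[] p : pixels) {
          shiftPixel(p[0], p[1], p[2]);                     // hypothetical helper that applies the eye shift
      }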

      I don't think the "pixel scattering" in your image is a bug. I think it is an artifact caused by two different phenomena:
      (1) Black pixels appear wherever one pixel happens to be shifted enough farther than its neighbor that the gap rounds up to a full pixel. This is bound to happen here and there.
      (2) Although the depthmap image LOOKS like it has sharp color boundaries between shapes, it has probably been anti-aliased, resulting in intermediate gray values along the border between two shapes. Since these correspond to intermediate shift values in our 3-D view, we see some colored pixels "floating" in the big gaps between near and far sections.
      Since these artifacts are created by (a) our model, and (b) the data in the given depth map, there is really no way to avoid them without going way beyond the assigned spec.
