02-26-2016 12:26 AM
Hello,
I have been intrigued by the video stabilization done by YouTube and wanted to implement it in MATLAB. I tracked down two papers related to it:
1.) http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/37041.pdf
2.) http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6738007&url=http%3A%2F%2Fieeexplore.ieee.org%2... (L1 L2 optimization for video stabilization)
I've made some progress, but sadly all I can get is a skewed video. 😞 In working on this problem I encountered a lot of new things which I'm also trying to learn, but as I have no background in them, I'm facing a lot of difficulty.
Is anyone here well versed in such topics (convex optimization, CVX modeling in MATLAB, homographies, etc.)? If so, I would love your help in solving this problem.
I'm unable to get a proper understanding of the paper. I feel (?) that they may have omitted some details, but I'm not sure.
Thanks.
02-26-2016 12:55 AM
02-26-2016 03:20 AM
Hello hatef,
Thanks for replying. Yes, I agree, this is a LabVIEW-centric forum. But you see, I'm (slightly) desperate for help (this problem has been on my mind for weeks). The only reason I chose to post here is that I wanted to gain insight from people working directly in the vision field. The main thing here is not the language or modeling platform but an understanding of the core algorithm, which I'm not fully able to grasp.
If you want to hear it (you possibly won't, but I'll post it anyway, sorry), or if someone else is interested, here is my interpretation of it. Feel free to power punch my answer to oblivion (that is why I post here) so I may at least gain some insight.
Features (of any kind: SIFT, SURF, FAST, etc.) are tracked across the frames to get the per-frame-pair affine transformation matrices. These matrices are multiplied together; the running product up to a given frame is called the camera path. A convex optimization problem is solved using these transforms and the camera path to obtain an update transform. The update transform modifies the camera path and is used to warp each input frame into a new frame of reference, so that the video appears smooth. (Imagine a train passing by you very fast while you stand on a platform: when you look through the windows, you make out a shaky blur instead of the passengers. Now imagine being a passenger on the train: you can easily make out everything inside it.)
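To make the bookkeeping above concrete, here is a minimal sketch in plain Python (no MATLAB or CVX needed). It assumes 3x3 homogeneous affine matrices, the accumulation order C_t = C_{t-1} * F_t, and a stabilized path P_t = C_t * B_t; that ordering is my reading of the first paper and may not match it exactly. The function names are made up for illustration, and the update transforms are chosen by hand here instead of coming from the optimization.

```python
# Sketch only: 3x3 homogeneous matrices stored as nested lists.

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(dx, dy):
    """Affine transform that only shifts by (dx, dy)."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]


def camera_path(frame_pair_transforms):
    """Running product of per-frame-pair transforms (assumed order: C_t = C_{t-1} * F_t)."""
    path = [IDENTITY]
    for f in frame_pair_transforms:
        path.append(matmul(path[-1], f))
    return path


def stabilized_path(path, updates):
    """Apply one update transform per frame (assumed form: P_t = C_t * B_t)."""
    return [matmul(c, b) for c, b in zip(path, updates)]


# Toy input: steady motion in x with vertical jitter in y.
transforms = [translation(1, 2), translation(1, -2), translation(1, 2)]
path = camera_path(transforms)

# Hand-picked updates that cancel the vertical offset of each frame.
updates = [translation(0, -c[1][2]) for c in path]
smooth = stabilized_path(path, updates)
```

In this toy case the smoothed path keeps the steady horizontal motion and has zero vertical offset in every frame. Swapping the multiplication order or inverting the wrong matrix is an easy mistake with this kind of setup, so it seems worth checking on a toy input like this before running it on real video.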
I've also tried warping directly (LK warp) without tracking any features, and still nothing. Other things I've tried include optimizing dynamically over every frame, and summing up the whole objective function and then optimizing over each frame.
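For reference, summing the objective over the whole clip could look like the sketch below for a single path parameter (say, the x translation). It is a simplification, assuming the objective is a weighted sum of the L1 norms of the first, second and third differences of the path, which is how I read the first paper. The function names and default weights are placeholders, and the code only evaluates the cost; it does not solve the optimization.

```python
def differences(values):
    """Forward differences between consecutive samples."""
    return [b - a for a, b in zip(values, values[1:])]


def l1_smoothness_cost(path, w1=1.0, w2=1.0, w3=1.0):
    """Weighted L1 cost over the whole path (weights are placeholders).

    d1 penalizes any motion, d2 penalizes changes in velocity,
    d3 penalizes changes in acceleration.
    """
    d1 = differences(path)
    d2 = differences(d1)
    d3 = differences(d2)
    return (w1 * sum(abs(v) for v in d1)
            + w2 * sum(abs(v) for v in d2)
            + w3 * sum(abs(v) for v in d3))


static_cost = l1_smoothness_cost([5, 5, 5, 5])   # camera not moving
linear_cost = l1_smoothness_cost([0, 1, 2, 3])   # constant velocity
shaky_cost = l1_smoothness_cost([0, 2, 0, 2])    # jitter
```

A static path costs nothing, a constant-velocity path is only charged by the first-difference term, and a jittery path is charged by all three terms, which is the behaviour the optimization is supposed to exploit.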
If anyone could provide any insight or their understanding of the algo it would be a great help.