A Conditional Adversarial Network for Scene Flow Estimation


Authors: Ravi Kumar Thakur, Snehasis Mukherjee

Abstract—The problem of scene flow estimation in depth videos has been attracting the attention of researchers in robot vision, due to its potential application in various areas of robotics. Conventional scene flow methods are difficult to use in real-life applications due to their long computational overhead. We propose a conditional adversarial network, SceneFlowGAN, for scene flow estimation. The proposed SceneFlowGAN uses a loss function at two ends: both the generator and the discriminator. The proposed network is the first attempt to estimate scene flow using generative adversarial networks, and is able to estimate both the optical flow and the disparity from the input stereo images simultaneously. The proposed method is evaluated on a large RGB-D benchmark scene flow dataset.

I. INTRODUCTION

Scene flow is a three-dimensional (3D) motion field representation of points moving in 3D space. Scene flow gives complete information about the motion and geometry, in a stereo pair of frames, of all the visible scene points. Thus, the estimation of the flow field is an important task in 3D computer vision and robot vision. Earlier work on motion estimation addressed rigid scenes. However, the problem of scene flow estimation started gaining attention when scene flow was first introduced for dynamic scenes [23]. This complete understanding of a dynamic scene can be used in many application areas of computer vision, including activity recognition, 3D reconstruction, autonomous navigation, free-viewpoint video, motion capture systems, augmented reality, and structure from motion. The problem of scene flow estimation can be considered the 3D counterpart of optical flow estimation. Despite several efforts, the estimation of scene flow is still an under-determined problem.
Several approaches for scene flow estimation have been proposed since its introduction, most of which rely on conventional procedures of computer vision [7]. Some methods extend popular optical flow estimation techniques to 3D by introducing a disparity map [26], while other approaches are based on the optimization of energy functions and variational methods [19]. Most scene flow estimation methods rely on the calculation of optical flow along with depth [11]. The fundamental assumptions behind the state-of-the-art algorithms are brightness and gradient constancy of the stereo images. As a result, most of the methods work very well on scenes with small displacements, but cannot perform well on samples with large displacements. In realistic scenarios, these assumptions are often violated. Figure 1 shows such an example from our prediction results.

Fig. 1. An example of a stereo pair along with the ground-truth and predicted disparities and optical flow. The first row shows a sample stereo pair, the second row shows the ground-truth scene flow, and the third row shows the reconstructed scene flow along the x-, y- and z-directions.

The problem of scene flow estimation using deep networks has recently attracted the attention of the computer vision research community with the availability of large-scale datasets [16]. Nearly all the classical methods take several minutes to process a frame; hence, their computational time does not permit real-time application. Recently, learning-based methods [18, 16] have been proposed, enabled by the availability of large-scale datasets with ground truth. These methods take more time for training, but are able to reduce the run time to less than a second, though in terms of accuracy these learning-based approaches are currently not on par with the classical methods.
Moreover, with training restricted to synthetic datasets, the estimated scene flow may not work well on naturalistic scenes.

Estimating scene flow is a challenging problem because of the dependency of the estimation algorithms on the assumptions of brightness and gradient constancy across subsequent stereo frames. Even an occlusion in the image can affect the stereo correspondence between the frames. Varying illumination and lack of texture information can also give erroneous information about the brightness pattern. Similarly, large displacements can cause errors in scene flow computation. Deep networks are good at understanding highly abstract features. The automatic feature extraction capability of deep networks can be used to develop models that are more robust in the cases where these assumptions are violated.

There are only a few learning-based methods for scene flow estimation. A generative approach can be useful in semi-supervised learning scenarios as well, since acquiring data will remain a challenge. We propose a conditional adversarial network for estimating scene flow from stereo images obtained at different time instances. To our knowledge, this is the first attempt at scene flow estimation using GANs. The presence of an additional discriminator network acting as a critic can direct the training process for scene flow estimation; thus, scene flow estimation benefits from this adversarial model. This generative modeling approach can also be used for unsupervised learning of scene flow. At present, no large-scale dataset with naturalistic scenes is available; however, the proposed SceneFlowGAN can be used to generate such datasets.

II. RELATED WORKS AND BACKGROUND

Scene flow estimation using deep networks is an active area of research [8]. We discuss recent advances in scene flow estimation, generative adversarial networks, and their applications in structure prediction problems in separate subsections.

A. Scene Flow Estimation

Classical scene flow estimation methods are generally based on data extracted from sequences of images obtained from multiple cameras, stereo images, and depth data. Scene flow was first proposed by Vedula et al. [23] using multi-view images. They obtained multi-view scene flow from optical flow and surface geometry. Usually such methods use some 3D reconstruction procedure. Scene flow based on stereo images from a binocular setting often involves joint estimation of optical flow and disparity [7, 26], though some scene flow estimation methods decouple the problem into stereo and motion estimation [25]. Basha et al. [2] formulated the structure and scene flow in a point cloud representation. Scene flow enforcing depth discontinuities using image segmentation information was introduced by Zhang and Kambhamettu [28]; it computes both the 3D motion and the 3D structure. Most of these methods use a variational framework. Schuster et al. [21], however, proposed a scene flow estimation method based on dense interpolation of sparse matches from stereo images, with variational optimization applied at a later stage for refinement.

With the advent of depth cameras, scene flow estimation methods using RGB-D data were explored [19, 6, 11]. However, depth cameras are not sufficiently accurate in outdoor environments; they pose limitations due to changes in illumination, frame rate, and limited field of view. The classical methods of estimating motion rigidly follow the brightness and gradient constancy assumptions. However, most of these assumptions do not hold true in dynamic environments.

B. Structure and Motion Estimation from Deep Networks

For motion estimation based on deep networks, the availability of large datasets was a challenge. Since acquiring motion data for naturalistic scenes was tedious, Mayer et al. [16] introduced the synthetic FlyingThings3D dataset.
Recently, motion estimation based on deep networks has shown the promise of such methods. The introduction of the FlyingThings3D dataset gave a boost to CNN-based methods for motion estimation. Mayer et al. [16] were also the first to apply a CNN to scene flow estimation by proposing SceneFlowNet, which used a combined architecture of FlowNet [4] and DispNet [16] for estimating scene flow. FlowNet was subsequently revised in FlowNet2 [9], which addressed the problem of large displacements by stacking different FlowNet architectures; small displacements were addressed using small strides in the convolution layers. SpyNet [20] used a spatial pyramid of the input data to reduce the number of training parameters. Motivated by the success in estimating optical flow through CNNs, a few deep networks for scene flow estimation were also proposed. Ilg et al. [8] introduced a stacked architecture based on FlowNet 2.0 to estimate disparity and scene flow in occluded stereo images. Behl et al. [3] combined recognition with geometry information to estimate scene flow in dynamic scenes with large displacements. SF-Net [18] introduced end-to-end training for scene flow estimation from RGB-D images. A CNN for direct estimation of scene flow was proposed by Thakur and Mukherjee [22]. Their model, SceneEDNet [22], estimates three-dimensional motion from a temporal sequence of stereo images, without being given geometry information. Vijayanarasimhan et al. [24] solved for 3D motion and 3D geometry simultaneously by using two different networks for structure and motion. SceneEDNet is a deep network for end-to-end learning of scene flow using only stereo images, which can readily be fed into a GAN. Hence, we use the SceneEDNet architecture in the generator part of the proposed SceneFlowGAN.

C. Generative Adversarial Networks

The Generative Adversarial Network was proposed by Goodfellow et al. [5].
The GAN architecture consists of two networks training in an adversarial mode against each other. The generator is tasked with generating realistic images given a latent noise sample, while the discriminator network is trained on both real and generated images so as to be able to distinguish between the two. The generator and the discriminator are involved in a min-max game, which can be represented by the following equation:

    min_G max_D  E_{x~P(x)}[log D(x)] + E_{z~P(z)}[log(1 − D(G(z)))]    (1)

The generator is denoted by G and the discriminator by D; P(z) is the distribution of the noise. Training a GAN has been difficult due to problems such as vanishing gradients and mode collapse; the training can also be highly unstable. Wasserstein GAN [1] overcame some of these challenges by using the EM (earth mover's) distance as the loss function. Also, in some cases the discriminator and generator loss values are not good indicators of the progress of GAN training. Mirza and Osindero [17] proposed conditioning both the generator and the discriminator on additional information available with the data, which allows directing the training of the GAN for data generation.

Fig. 2. The proposed architecture for SceneFlowGAN. The generator follows an encoder-decoder architecture. The networks are composed of units of the form convolution-batch norm-leakyReLU; these units are also part of the discriminator network.

These advances in GANs were followed by their application in various areas of computer vision. Kupyn et al. [13] demonstrated a conditional adversarial network for restoring a blurred image; they used residual blocks for the generator with a perceptual loss. Zhang et al. [27] synthesized images from text descriptions with a stack of two GANs: the first GAN generates rough images based on the text description, and the second GAN is conditioned on the first to perform refinement. Super-resolution from a single image was achieved by Ledig et al. [15] using a perceptual loss.
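In practice, the value function of Eq. (1) is optimized through two binary cross-entropy terms, one per network. The sketch below is illustrative only (the proposed SceneFlowGAN instead uses the Wasserstein loss discussed above), and `gan_losses` is a hypothetical helper:

```python
import numpy as np

def gan_losses(d_real, d_fake, eps=1e-8):
    """Cross-entropy form of Eq. (1).

    d_real: discriminator outputs D(x) on real samples, in (0, 1).
    d_fake: discriminator outputs D(G(z)) on generated samples.
    Returns (discriminator_loss, generator_loss). The discriminator
    maximizes the value function, so its loss is the negation.
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    d_loss = -(np.log(d_real + eps).mean() + np.log(1.0 - d_fake + eps).mean())
    # Non-saturating generator loss, the variant commonly used in practice
    # instead of minimizing log(1 - D(G(z))) directly.
    g_loss = -np.log(d_fake + eps).mean()
    return d_loss, g_loss
```

A well-performing discriminator (D(x) near 1 on real data, D(G(z)) near 0 on fakes) yields a small discriminator loss and a large generator loss; at D = 0.5 everywhere the discriminator loss equals 2 log 2.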
A model for image-to-image translation [10] was proposed by conditioning both adversarial networks on the input image. A semi-supervised optical flow estimation method using a conditional GAN was proposed in [14], using both labeled and unlabeled data.

III. PROPOSED METHOD

We propose a conditional adversarial network for estimating scene flow from pairs of stereo images. The weights of the generator and the discriminator are updated together during the training phase. The learning of optical flow and disparity is coupled in SceneFlowGAN.

A. Scene Flow Estimation

Given stereo image pairs at consecutive time instances, the scene flow can be constructed from the optical flow (u, v) and the disparities (d_t, d_{t+1}). The dense scene flow provides the 3D position and the constituent 3D motion vector for all the points. Our proposed method takes a set of stereo images defined by I = (I_L^t, I_L^{t+1}, I_R^t, I_R^{t+1}) to generate the scene flow S. Thus, the scene flow can be considered a 4D vector S = (u, v, d_t, d_{t+1}). The horizontal and vertical components of the optical flow are represented by u and v respectively, and the disparities of the stereo pairs at times t and t+1 are denoted by d_t and d_{t+1}. In a point cloud, the scene flow can be computed using the camera parameters and the pinhole projection model; when projected onto the image plane, the scene flow gives the corresponding optical flow.

B. Adversarial Training

For training SceneFlowGAN, the loss function is computed twice: once at the end of the discriminator and once at the end of the generator. The discriminator's loss makes the network learn to distinguish ground-truth from generated scene flow. The discriminator is not conditioned like the one proposed in [17]. The loss at the end of the generator G ensures that the network is optimized for the scene flow estimation task from a pair of stereo images. At the same time, the generator is also trained to pass the critic test posed by the discriminator.
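As a sketch of the pinhole back-projection mentioned in Sec. III-A, the 4D vector S = (u, v, d_t, d_{t+1}) can be lifted to a per-pixel 3D motion vector given the camera parameters. The function names and parameter values below (focal length f, baseline B, principal point (cx, cy)) are hypothetical, for illustration only:

```python
import numpy as np

def backproject(px, py, disp, f, B, cx, cy):
    """Back-project pixel (px, py) with disparity disp to a 3D point
    under the pinhole stereo model: Z = f * B / disp."""
    Z = f * B / disp
    X = (px - cx) * Z / f
    Y = (py - cy) * Z / f
    return np.stack([X, Y, Z], axis=-1)

def scene_flow_3d(x, y, u, v, d_t, d_t1, f, B, cx, cy):
    """3D motion vector of a pixel given its scene flow components
    S = (u, v, d_t, d_t1): back-project at t and t+1 and subtract."""
    p_t = backproject(x, y, d_t, f, B, cx, cy)
    p_t1 = backproject(x + u, y + v, d_t1, f, B, cx, cy)
    return p_t1 - p_t
```

A pixel at the principal point with disparity d back-projects to (0, 0, fB/d); a pixel with zero optical flow and unchanged disparity has zero 3D motion, consistent with the projection relation stated above.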
This one-to-many mapping directs the training of the generator for the scene flow estimation task:

    L = L_GAN + L_JointLoss

For training the generator we define a joint loss function, composed of the average end-point error for the optical flow and an L1 loss for the error between the two disparity values. We use the Wasserstein metric as the GAN loss function for stable training using gradient descent [1]. The joint loss function is given below:

    L_JointLoss = Σ_p sqrt((u − u')² + (v − v')²) + Σ_p |d_t − d'_t| + Σ_p |d_{t+1} − d'_{t+1}|    (2)

where the primed quantities denote the ground-truth values and the sums run over all pixels p. The GAN loss drives the decision on whether an input scene flow is real or generated. Conditioning the generator on the input stereo images makes the generator learn to estimate scene flow and also to fool the discriminator. Thus, SceneFlowGAN is trained to optimize the following objective function:

    min_G max_D  L_GAN(G, D) + L_JointLoss(G)    (3)

The discriminator tries to maximize the objective function while the generator tries to minimize it.

C. Architecture of the Proposed Model

The architecture of SceneFlowGAN is shown in Fig. 2. The model consists of a generator and a discriminator network, both convolutional. For the generator we have used SceneEDNet [22]; unlike [22], we have used batch-normalization layers for regularization. The network needs to learn correspondences between the stereo pairs; however, much information is lost while propagating from the encoding to the decoding stage. Thus, we have used skip connections between layers of the same dimension in the encoder and decoder parts. The composition units of the generator and discriminator networks are of the form convolution-batch normalization-leakyReLU. The discriminator network has three convolution layers, each followed by batch normalization and leakyReLU.
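The joint loss of Eq. (2) can be sketched in NumPy as follows. `joint_loss` is a hypothetical helper operating on (H, W, 4) arrays holding (u, v, d_t, d_{t+1}); whether the per-pixel errors are summed or averaged is an implementation choice:

```python
import numpy as np

def joint_loss(pred, gt):
    """Joint loss of Eq. (2): per-pixel end-point error of the optical
    flow plus L1 error on both disparity channels.

    pred, gt: arrays of shape (H, W, 4) holding (u, v, d_t, d_{t+1}).
    """
    du = pred[..., 0] - gt[..., 0]
    dv = pred[..., 1] - gt[..., 1]
    epe = np.sqrt(du ** 2 + dv ** 2).sum()          # end-point error term
    l1 = np.abs(pred[..., 2] - gt[..., 2]).sum() \
       + np.abs(pred[..., 3] - gt[..., 3]).sum()    # two disparity terms
    return epe + l1
```

For a pixel with flow error (3, 4) and disparity errors (1, 2), the contribution is 5 + 1 + 2 = 8, matching the three terms of Eq. (2).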
The final convolution layer is flattened and connected to a set of three dense layers, with a dropout layer of rate 0.4. The last dense layer gives the probability of the scene flow being generated or real.

IV. EXPERIMENTS AND RESULTS

We describe the datasets used for training SceneFlowGAN, followed by implementation details and results.

A. Dataset

For training SceneFlowGAN we have used the large scene flow dataset of Mayer et al. [16]. The dataset is divided into three sections: FlyingThings3D, Monkaa, and Driving. All the datasets provide 3D scene points. The 3D models were used to create frames artificially using Blender. The scenes are rendered in a way that provides variation in orientation and position for all the visible scene points. The datasets come with bi-directional optical flow and bi-directional disparity ground truths. The stereo images are available in two formats: a clean pass, with no noise or external effects, and a final pass, which comes with motion blur, illumination effects, and image degradations. For training we have used the FlyingThings3D dataset with final-pass images.

Fig. 3. Training procedure for SceneFlowGAN. The discriminator and generator are trained in an alternating manner.

B. Implementation Details

The estimated scene flow is conditioned on the input stereo pairs at consecutive time instances. For the generator G architecture we have used SceneEDNet [22] with skip connections. The discriminator D is unconditioned and is trained to distinguish between generated and ground-truth scene flow. During training, both networks are trained in an adversarial manner. For training SceneFlowGAN we follow the procedure of the original work [5], as shown in Fig. 3. We alternate between training the discriminator and the generator. The discriminator network is trained on both the ground-truth and the generated scene flow.
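The alternating schedule of Fig. 3 can be sketched framework-independently. Here `d_step` and `g_step` stand in for the actual (hypothetical here) update routines, with the discriminator weights assumed to be frozen inside `g_step`:

```python
def train_alternating(d_step, g_step, batches, d_iters=1):
    """Alternating GAN training schedule as in Fig. 3.

    For each batch, update the discriminator d_iters times on real and
    generated scene flow, then update the generator once while the
    discriminator is held fixed. d_step and g_step are caller-supplied
    callables that perform one update and return a scalar loss.
    """
    history = []
    for batch in batches:
        d_loss = None
        for _ in range(d_iters):
            d_loss = d_step(batch)   # discriminator sees real + fake flow
        g_loss = g_step(batch)       # discriminator weights frozen here
        history.append((d_loss, g_loss))
    return history
```

With d_iters = 1 this reproduces the schedule of [5]; Wasserstein-style training [1] typically uses several critic updates per generator update.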
The generator is trained via the GAN with the weights of the discriminator frozen. Both networks were trained with the Adam [12] optimizer with a learning rate of 1e-5. The loss is computed at two places: one at each of the generator's and the discriminator's ends. All training was performed on NVIDIA 1080 GPUs.

C. Results

SceneFlowGAN was trained on the FlyingThings3D [16] dataset. The learning of optical flow and disparity is coupled: for an input set of pairs I = (I_L^t, I_L^{t+1}, I_R^t, I_R^{t+1}) we obtain the corresponding optical flow and the disparities at consecutive time instances. The model was trained on final-pass stereo images with added image degradations. The results on a stereo pair are shown in Fig. 4.

D. Ablation Studies

The choice of dataset for training was based on the training performance of SceneEDNet [22] on the three sets of FlyingThings3D. The training loss curve for SceneEDNet is given in Fig. 5. The drop in the average end-point error was larger for set B and set C than for set A. Moreover, we also observed a drop in the loss value due to the additional batch-normalization layers. We trained our model on set A and set C of the FlyingThings3D scene flow data. This was done to see the effect of the data distribution on learning the generator. Table I shows the flow and disparity errors obtained by SceneFlowGAN on all the test sets of FlyingThings3D.

Fig. 4. The predicted scene flow from SceneFlowGAN trained on sets A and C of FlyingThings3D for a pair of stereo images. From left to right: left and right stereo pair overlaid, ground-truth disparity and optical flow. The predictions (from left to right) are for SceneFlowGAN trained on set A for 70 epochs and 50 epochs, and trained on set C for 70 and 50 epochs, respectively.

Fig. 5. Training loss of SceneEDNet on the three sets of FlyingThings3D scene flow. The original SceneEDNet does not have batch-normalization layers; we found a decrease in training loss with their introduction.
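A single Adam [12] parameter update with the learning rate of 1e-5 used above can be sketched in NumPy as follows (standard Adam defaults assumed; this is a generic illustration, not the authors' code):

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-5, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: bias-corrected first (m) and second (v) moment
    estimates scale the raw gradient; t is the 1-based step count.
    Returns the updated weights and moment estimates."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)          # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```

Because the update is normalized by the second moment, the effective step size per parameter is close to lr in magnitude, which is why the small learning rate of 1e-5 keeps the adversarial training comparatively stable.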
The test was done for both the models trained on A and C.

V. CONCLUSIONS

In this paper we have presented a conditional generative adversarial network to estimate scene flow from stereo images. The training of SceneFlowGAN remains a challenge given the complexity of the problem. The choice of generator was based on the training loss obtained when training the generator separately. In future, the proposed GAN-based scene flow estimation method can be extended to naturalistic images after creating a sufficiently large dataset, which may lead to a new direction of research on flow field estimation.

REFERENCES

[1] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein GAN. arXiv preprint, 2017.
[2] Tali Basha, Yael Moses, and Nahum Kiryati. Multi-view scene flow estimation: A view centered variational approach. International Journal of Computer Vision, 101(1):6–21, 2013.
[3] Aseem Behl, Omid Hosseini Jafari, Siva Karthik Mustikovela, Hassan Abu Alhaija, Carsten Rother, and Andreas Geiger. Bounding Boxes, Segmentations and Object Coordinates: How Important Is Recognition for 3D Scene Flow Estimation in Autonomous Driving Scenarios? In The IEEE International Conference on Computer Vision (ICCV), Oct 2017.
[4] Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, Philip Hausser, Caner Hazirbas, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, and Thomas Brox. FlowNet: Learning Optical Flow With Convolutional Networks. In The IEEE International Conference on Computer Vision (ICCV), December 2015.
[5] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.
[6] Evan Herbst, Xiaofeng Ren, and Dieter Fox. RGB-D flow: Dense 3-D motion estimation using color and depth.
In Robotics and Automation (ICRA), 2013 IEEE International Conference on, pages 2276–2282. IEEE, 2013.
[7] Frédéric Huguet and Frédéric Devernay. A variational method for scene flow estimation from stereo sequences. In Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on, pages 1–7. IEEE, 2007.

TABLE I
Flow and disparity error obtained for SceneFlowGAN. The error values are obtained after testing both the trained models (A, C) on the test sets (A, B, C). The value in brackets after the model name shows the number of epochs.

              SceneFlowGAN-A(70)      SceneFlowGAN-C(70)      SceneFlowGAN-A(50)      SceneFlowGAN-C(50)
Test set      Flow    d1     d2       Flow    d1     d2       Flow    d1     d2       Flow    d1     d2
A             72.33   33.68  32.82    72.11   37.37  39.12    71.50   35.61  35.33    72.27   36.29  37.89
B             33.89   31.15  29.73    28.99   34.09  34.91    31.13   32.67  32.07    29.18   33.16  34.12
C             25.18   32.72  30.89    19.06   35.40  35.66    22.08   34.28  32.68    19.86   34.54  34.86

[8] E. Ilg, T. Saikia, M. Keuper, and T. Brox. Occlusions, Motion and Depth Boundaries with a Generic Network for Disparity, Optical Flow or Scene Flow Estimation. In European Conference on Computer Vision (ECCV), 2018.
[9] Eddy Ilg, Nikolaus Mayer, Tonmoy Saikia, Margret Keuper, Alexey Dosovitskiy, and Thomas Brox. FlowNet 2.0: Evolution of optical flow estimation with deep networks. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1647–1655. IEEE, 2017.
[10] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. Image-to-image translation with conditional adversarial networks. 2017.
[11] Mariano Jaimez, Mohamed Souiai, Javier Gonzalez-Jimenez, and Daniel Cremers. A primal-dual framework for real-time dense RGB-D scene flow. In Robotics and Automation (ICRA), 2015 IEEE International Conference on, pages 98–104. IEEE, 2015.
[12] Diederik P. Kingma and Jimmy Lei Ba.
Adam: A method for stochastic optimization. 2014.
[13] Orest Kupyn, Volodymyr Budzan, Mykola Mykhailych, Dmytro Mishkin, and Jiří Matas. DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8183–8192, 2018.
[14] Wei-Sheng Lai, Jia-Bin Huang, and Ming-Hsuan Yang. Semi-supervised learning for optical flow with generative adversarial networks. In Advances in Neural Information Processing Systems, pages 354–364, 2017.
[15] Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew P. Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In CVPR, volume 2, page 4, 2017.
[16] Nikolaus Mayer, Eddy Ilg, Philip Hausser, Philipp Fischer, Daniel Cremers, Alexey Dosovitskiy, and Thomas Brox. A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4040–4048, 2016.
[17] Mehdi Mirza and Simon Osindero. Conditional Generative Adversarial Nets. arXiv preprint arXiv:1411.1784, 2014.
[18] Yi-Ling Qiao, Lin Gao, Yu-Kun Lai, Fang-Lue Zhang, Mingzhe Yuan, and Shihong Xia. SF-Net: Learning Scene Flow from RGB-D Images with CNNs. In British Machine Vision Conference 2018, BMVC 2018, Northumbria University, Newcastle, UK, September 3-6, 2018, page 281, 2018.
[19] Julian Quiroga, Thomas Brox, Frédéric Devernay, and James Crowley. Dense semi-rigid scene flow estimation from RGB-D images. In European Conference on Computer Vision, pages 567–582. Springer, 2014.
[20] Anurag Ranjan and Michael J. Black. Optical Flow Estimation using a Spatial Pyramid Network. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2720–2729.
IEEE, 2017.
[21] René Schuster, Oliver Wasenmuller, Georg Kuschk, Christian Bailer, and Didier Stricker. SceneFlowFields: Dense interpolation of sparse scene flow correspondences. In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 1056–1065. IEEE, 2018.
[22] Ravi Kumar Thakur and Snehasis Mukherjee. SceneEDNet: A Deep Learning Approach for Scene Flow Estimation. In 2018 15th International Conference on Control, Automation, Robotics and Vision (ICARCV), pages 394–399. IEEE, 2018.
[23] Sundar Vedula, Simon Baker, Peter Rander, Robert Collins, and Takeo Kanade. Three-dimensional scene flow. In Computer Vision, 1999. The Proceedings of the Seventh IEEE International Conference on, volume 2, pages 722–729. IEEE, 1999.
[24] Sudheendra Vijayanarasimhan, Susanna Ricco, Cordelia Schmid, Rahul Sukthankar, and Katerina Fragkiadaki. SfM-Net: Learning of structure and motion from video. arXiv preprint arXiv:1704.07804, 2017.
[25] Andreas Wedel, Thomas Brox, Tobi Vaudrey, Clemens Rabe, Uwe Franke, and Daniel Cremers. Stereoscopic scene flow computation for 3D motion understanding. International Journal of Computer Vision, 95(1):29–51, 2011.
[26] Koichiro Yamaguchi, David McAllester, and Raquel Urtasun. Efficient joint segmentation, occlusion labeling, stereo and flow estimation. In European Conference on Computer Vision, pages 756–771. Springer, 2014.
[27] Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, and Dimitris N. Metaxas. StackGAN: Text to Photo-Realistic Image Synthesis With Stacked Generative Adversarial Networks. In ICCV, Oct 2017.
[28] Ye Zhang and Chandra Kambhamettu. On 3D scene flow and structure estimation. In Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on, volume 2, pages II–II. IEEE, 2001.
