Sources: Google, iangoodfellow.com, 新智元
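Today, the 2018 Conference on Computer Vision and Pattern Recognition (CVPR 2018) is under way in Salt Lake City. It is the most important annual academic conference in computer vision and comprises the main conference plus a number of workshops and tutorials. As a Diamond Sponsor, Google again has a strong presence at this year's CVPR: more than 200 Google employees will present papers or give invited talks, and Google has also organized and taken part in several workshops.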
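According to Google's official blog, Google had 45 papers accepted at CVPR 2018. These papers focus on the latest machine learning techniques for next-generation intelligent systems and machine perception, including the technology behind portrait mode on the Pixel 2 and Pixel 2 XL smartphones, the V4 release of the Open Images dataset, and more.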
Google at CVPR 2018
Organizers
Finance Chair: Ramin Zabih
Area Chairs: Sameer Agarwal, Aseem Agarwala, Jon Barron, Abhinav Shrivastava, Carl Vondrick, Ming-Hsuan Yang
Paper list
Orals/Spotlights
Unsupervised Discovery of Object Landmarks as Structural Representations
Yuting Zhang, Yijie Guo, Yixin Jin, Yijun Luo, Zhiyuan He, Honglak Lee
DoubleFusion: Real-time Capture of Human Performances with Inner Body Shapes from a Single Depth Sensor
Tao Yu, Zerong Zheng, Kaiwen Guo, Jianhui Zhao, Qionghai Dai, Hao Li, Gerard Pons-Moll, Yebin Liu
Neural Kinematic Networks for Unsupervised Motion Retargetting
Ruben Villegas, Jimei Yang, Duygu Ceylan, Honglak Lee
Burst Denoising with Kernel Prediction Networks
Ben Mildenhall, Jiawen Chen, Jonathan Barron, Robert Carroll, Dillon Sharlet, Ren Ng
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
Benoit Jacob, Skirmantas Kligys, Bo Chen, Matthew Tang, Menglong Zhu, Andrew Howard, Dmitry Kalenichenko, Hartwig Adam
AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions
Chunhui Gu, Chen Sun, David Ross, Carl Vondrick, Caroline Pantofaru, Yeqing Li, Sudheendra Vijayanarasimhan, George Toderici, Susanna Ricco, Rahul Sukthankar, Cordelia Schmid, Jitendra Malik
Focal Visual-Text Attention for Visual Question Answering
Junwei Liang, Lu Jiang, Liangliang Cao, Li-Jia Li, Alexander G. Hauptmann
Inferring Light Fields from Shadows
Manel Baradad, Vickie Ye, Adam Yedidia, Fredo Durand, William Freeman, Gregory Wornell, Antonio Torralba
Modifying Non-Local Variations Across Multiple Views
Tal Tlusty, Tomer Michaeli, Tali Dekel, Lihi Zelnik-Manor
Iterative Visual Reasoning Beyond Convolutions
Xinlei Chen, Li-jia Li, Fei-Fei Li, Abhinav Gupta
Unsupervised Training for 3D Morphable Model Regression
Kyle Genova, Forrester Cole, Aaron Maschinot, Daniel Vlasic, Aaron Sarna, William Freeman
Learning Transferable Architectures for Scalable Image Recognition
Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc Le
The iNaturalist Species Classification and Detection Dataset
Grant van Horn, Oisin Mac Aodha, Yang Song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, Serge Belongie
Learning Intrinsic Image Decomposition from Watching the World
Zhengqi Li, Noah Snavely
Learning Intelligent Dialogs for Bounding Box Annotation
Ksenia Konyushkova, Jasper Uijlings, Christoph Lampert, Vittorio Ferrari
Posters
Revisiting Knowledge Transfer for Training Object Class Detectors
Jasper Uijlings, Stefan Popov, Vittorio Ferrari
Rethinking the Faster R-CNN Architecture for Temporal Action Localization
Yu-Wei Chao, Sudheendra Vijayanarasimhan, Bryan Seybold, David Ross, Jia Deng, Rahul Sukthankar
Hierarchical Novelty Detection for Visual Object Recognition
Kibok Lee, Kimin Lee, Kyle Min, Yuting Zhang, Jinwoo Shin, Honglak Lee
COCO-Stuff: Thing and Stuff Classes in Context
Holger Caesar, Jasper Uijlings, Vittorio Ferrari
Appearance-and-Relation Networks for Video Classification
Limin Wang, Wei Li, Wen Li, Luc Van Gool
MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks
Ariel Gordon, Elad Eban, Bo Chen, Ofir Nachum, Tien-Ju Yang, Edward Choi
Deformable Shape Completion with Graph Convolutional Autoencoders
Or Litany, Alex Bronstein, Michael Bronstein, Ameesh Makadia
MegaDepth: Learning Single-View Depth Prediction from Internet Photos
Zhengqi Li, Noah Snavely
Unsupervised Discovery of Object Landmarks as Structural Representations
Yuting Zhang, Yijie Guo, Yixin Jin, Yijun Luo, Zhiyuan He, Honglak Lee
Burst Denoising with Kernel Prediction Networks
Ben Mildenhall, Jiawen Chen, Jonathan Barron, Robert Carroll, Dillon Sharlet, Ren Ng
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
Benoit Jacob, Skirmantas Kligys, Bo Chen, Matthew Tang, Menglong Zhu, Andrew Howard, Dmitry Kalenichenko, Hartwig Adam
Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling
Xingyuan Sun, Jiajun Wu, Xiuming Zhang, Zhoutong Zhang, Tianfan Xue, Joshua Tenenbaum, William Freeman
Sparse, Smart Contours to Represent and Edit Images
Tali Dekel, Dilip Krishnan, Chuang Gan, Ce Liu, William Freeman
MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features
Liang-Chieh Chen, Alexander Hermans, George Papandreou, Florian Schroff, Peng Wang, Hartwig Adam
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning
Yin Cui, Yang Song, Chen Sun, Andrew Howard, Serge Belongie
Improved Lossy Image Compression with Priming and Spatially Adaptive Bit Rates for Recurrent Networks
Nick Johnston, Damien Vincent, David Minnen, Michele Covell, Saurabh Singh, Sung Jin Hwang, George Toderici, Troy Chinen, Joel Shor
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen
ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans
Angela Dai, Daniel Ritchie, Martin Bokeloh, Scott Reed, Juergen Sturm, Matthias Nießner
Sim2Real View Invariant Visual Servoing by Recurrent Control
Fereshteh Sadeghi, Alexander Toshev, Eric Jang, Sergey Levine
Alternating-Stereo VINS: Observability Analysis and Performance Evaluation
Mrinal Kanti Paul, Stergios Roumeliotis
Soccer on Your Tabletop
Konstantinos Rematas, Ira Kemelmacher, Brian Curless, Steve Seitz
Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints
Reza Mahjourian, Martin Wicke, Anelia Angelova
AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions
Chunhui Gu, Chen Sun, David Ross, Carl Vondrick, Caroline Pantofaru, Yeqing Li, Sudheendra Vijayanarasimhan, George Toderici, Susanna Ricco, Rahul Sukthankar, Cordelia Schmid, Jitendra Malik
Inferring Light Fields from Shadows
Manel Baradad, Vickie Ye, Adam Yedidia, Fredo Durand, William Freeman, Gregory Wornell, Antonio Torralba
Modifying Non-Local Variations Across Multiple Views
Tal Tlusty, Tomer Michaeli, Tali Dekel, Lihi Zelnik-Manor
Aperture Supervision for Monocular Depth Estimation
Pratul Srinivasan, Rahul Garg, Neal Wadhwa, Ren Ng, Jonathan Barron
Instance Embedding Transfer to Unsupervised Video Object Segmentation
Siyang Li, Bryan Seybold, Alexey Vorobyov, Alireza Fathi, Qin Huang, C.-C. Jay Kuo
Frame-Recurrent Video Super-Resolution
Mehdi S. M. Sajjadi, Raviteja Vemulapalli, Matthew Brown
Weakly Supervised Action Localization by Sparse Temporal Pooling Network
Phuc Nguyen, Ting Liu, Gautam Prasad, Bohyung Han
Iterative Visual Reasoning Beyond Convolutions
Xinlei Chen, Li-jia Li, Fei-Fei Li, Abhinav Gupta
Learning and Using the Arrow of Time
Donglai Wei, Andrew Zisserman, William Freeman, Joseph Lim
HydraNets: Specialized Dynamic Architectures for Efficient Inference
Ravi Teja Mullapudi, Noam Shazeer, William Mark, Kayvon Fatahalian
Thoracic Disease Identification and Localization with Limited Supervision
Zhe Li, Chong Wang, Mei Han, Yuan Xue, Wei Wei, Li-jia Li, Fei-Fei Li
Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis
Seunghoon Hong, Dingdong Yang, Jongwook Choi, Honglak Lee
Deep Semantic Face Deblurring
Ziyi Shen, Wei-Sheng Lai, Tingfa Xu, Jan Kautz, Ming-Hsuan Yang
Unsupervised Training for 3D Morphable Model Regression
Kyle Genova, Forrester Cole, Aaron Maschinot, Daniel Vlasic, Aaron Sarna, William Freeman
Learning Transferable Architectures for Scalable Image Recognition
Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc Le
Learning Intrinsic Image Decomposition from Watching the World
Zhengqi Li, Noah Snavely
PiCANet: Learning Pixel-wise Contextual Attention for Saliency Detection
Nian Liu, Junwei Han, Ming-Hsuan Yang
Tutorials
Computer Vision for Robotics and Driving
Anelia Angelova, Sanja Fidler
Unsupervised Visual Learning
Pierre Sermanet, Anelia Angelova
UltraFast 3D Sensing, Reconstruction and Understanding of People, Objects and Environments
Sean Fanello, Julien Valentin, Jonathan Taylor, Christoph Rhemann, Adarsh Kowdle, Jürgen Sturm, Christine Kaeser-Chen, Pavel Pidlypenskyi, Rohit Pandey, Andrea Tagliasacchi, Sameh Khamis, David Kim, Mingsong Dou, Kaiwen Guo, Danhang Tang, Shahram Izadi
Generative Adversarial Networks
Jun-Yan Zhu, Taesung Park, Mihaela Rosca, Phillip Isola, Ian Goodfellow
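Ian Goodfellow: Generative Adversarial Networks (35 slides)
Generative modeling: density estimation
Training data → density function
Generative modeling: sample generation
Training data (CelebA) → sample generation
The adversarial networks framework
Self-Attention GAN
State-of-the-art FID on ImageNet: 1000 classes, 128x128 pixels
Self-Play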
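What can you do with GANs?
Simulated environments and training data
Missing data
Semi-supervised learning
Multiple correct answers
Realistic generation tasks
Model-based optimization
Automated customization
Domain adaptation
Autonomous driving dataset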
GANs for simulated training data
GANs for missing data
What can you make out in this image?
A GAN model reveals that it is a face
GANs for semi-supervised learning
Supervised discriminator for semi-supervised learning
Semi-supervised classification
MNIST: 100 training labels -> 80 test mistakes
SVHN: 1,000 training labels -> 4.3% test error
CIFAR-10: 4,000 labels -> 14.4% test error
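GANs for next-frame video prediction
GANs for realistic generation tasks
iGAN
Image-to-image translation
Unsupervised image-to-image translation
CycleGAN
Text-to-image synthesis
GANs for model-based optimization
Designing DNA to optimize protein binding
GANs for automated customization
Personalized GANufacturing
GANs for domain adaptation
Domain-adversarial networks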
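Some tips and tricks for GANs
Use spectral normalization (Miyato et al 2017) in both the discriminator and the generator (Zhang et al 2018)
Use different learning rates for the generator and the discriminator (Heusel et al 2017)
There is no need to run the discriminator more often than the generator (Zhang et al 2018)
Many different loss functions all work well (Lucic et al 2017); spend more time tuning hyperparameters than trying different loss functions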
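To make the first tip more concrete: spectral normalization divides a layer's weight matrix by its largest singular value, which is usually estimated by power iteration. The following is a minimal pure-Python sketch of that idea, not the code used in the cited papers. The function names (`spectral_norm`, `spectrally_normalize`), the iteration count, and the seed are assumptions chosen for illustration.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def transpose(W):
    """Return the transpose of W as a list of rows."""
    return [list(col) for col in zip(*W)]


def l2_normalize(v, eps=1e-12):
    """Scale v to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration.

    n_iters and seed are illustrative defaults, not values from the slides.
    """
    rng = random.Random(seed)
    Wt = transpose(W)
    # Random starting guess for the left singular vector (one entry per row).
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in W])
    for _ in range(n_iters):
        v = l2_normalize(matvec(Wt, u))  # right singular vector estimate
        u = l2_normalize(matvec(W, v))   # left singular vector estimate
    # sigma is approximately u^T W v once u and v have converged.
    return sum(a * b for a, b in zip(u, matvec(W, v)))


def spectrally_normalize(W, n_iters=50):
    """Return W / sigma(W), whose largest singular value is close to 1."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]
```

For example, `spectral_norm([[3.0, 0.0], [0.0, 1.0]])` returns a value very close to 3.0, and the matrix returned by `spectrally_normalize` for the same input has a largest singular value close to 1. Training code typically runs only one power-iteration step per update and reuses `u` between updates; the sketch iterates from scratch to stay self-contained.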
Links: https://ai.googleblog.com
https://www.iangoodfellow.com/slides/2018-06-18.pdf