Reboot: Accelerating Deep Neural Nets by Reducing Image Resolution

A few months ago, I blogged about how to reduce AlexNet image resolution without giving up much ImageNet accuracy. I’m not quite sure what I did wrong, but I’m not able to reproduce those results. I’m sorry for leading you down the wrong path, and thanks to the 10 people who wrote me with questions about this!

New numbers
Here’s a revised version with numbers that I’ve been able to produce consistently with ImageNet-1k on an NVIDIA K40, where all the accuracy deltas are compared to 256×256 AlexNet:

| DNN Architecture | Input | Crop | Top-1 accuracy | Top-5 accuracy | Frame rate at test-time |
| --- | --- | --- | --- | --- | --- |
| AlexNet [1] | 256×256 | 227×227 | 57.1% | 80.2% | 624 fps |
| AlexNet | 128×128 | 99×99 | 42.7% (-14.4) | 67.3% (-12.8) | 3368 fps (5.4x speedup) |
| AlexNet | 128×128 | 111×111 | 46.2% (-10.9) | 70.1% (-10.1) | 2191 fps (3.4x speedup) |
| VGG_F [2][3] | 128×128 | 99×99 | 41.2% (-15.9) | 65.7% (-14.5) | 2876 fps (4.6x speedup) |
| VGG_F_extralayer [4] | 128×128 | 99×99 | 48.3% (-8.8) | 72.8% (-7.4) | 1600 fps (2.5x speedup) |
| VGG_F_extralayer | 128×128 | 111×111 | 50.2% (-6.9) | 75.1% (-5.1) | 1248 fps (2x speedup) |

As you can see, the drop in accuracy for 128×128 AlexNet is larger than what I listed in my previous blog post. Oops.

After trying a few other DNN architectures, I identified an architecture that I’m calling VGG_F_extralayer [4]. With VGG_F_extralayer, we claw our way back up above 50% top-1 accuracy, while maintaining some speed benefits due to 128×128 images.

There are a few differences between VGG_F and VGG_F_extralayer:
1. VGG_F_extralayer has an additional 1×1 conv layer with 256 filters after conv4. (Going deeper sometimes improves accuracy.)
2. In its final pooling layer, VGG_F_extralayer does average pooling instead of max pooling. (In general, I often find that average-pooling near the end of a DNN provides a moderate bump in accuracy.)
3. The conv1 layer has 5×5 instead of 11×11 filters. (11×11 would probably give similar accuracy.)
4. The strides in VGG_F_extralayer for conv1 and pool1 differ slightly from those in VGG_F (see the details in the VGG_F_extralayer prototxt file [4]).
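
To illustrate item 2, here is a minimal pure-Python sketch of max pooling versus average pooling over the same feature map. It is not taken from the prototxt: the `pool2d` and `mean` helpers, the 4×4 feature map, and the 2×2 window with stride 2 are all made up for illustration, and the sketch only uses windows that fit fully inside the map (Caffe's own pooling layer handles edges differently).

```python
def pool2d(feature_map, kernel, stride, reduce_fn):
    """Slide a kernel x kernel window over a 2D list and reduce each window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - kernel + 1, stride):
        pooled_row = []
        for c in range(0, cols - kernel + 1, stride):
            window = [feature_map[r + i][c + j]
                      for i in range(kernel)
                      for j in range(kernel)]
            pooled_row.append(reduce_fn(window))
        pooled.append(pooled_row)
    return pooled


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical activations, chosen so that the two pooling types disagree.
feature_map = [
    [1.0, 3.0, 0.0, 0.0],
    [5.0, 7.0, 0.0, 8.0],
    [2.0, 2.0, 4.0, 4.0],
    [2.0, 2.0, 4.0, 4.0],
]

max_pooled = pool2d(feature_map, kernel=2, stride=2, reduce_fn=max)
avg_pooled = pool2d(feature_map, kernel=2, stride=2, reduce_fn=mean)

print(max_pooled)  # [[7.0, 8.0], [2.0, 4.0]]
print(avg_pooled)  # [[4.0, 2.0], [2.0, 4.0]]
```

In the top-right window a single large activation dominates the max-pooled output, while average pooling blends it with its neighbors. Whether that is the reason for the accuracy bump is an open question.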

What’s next?
There are plenty of open questions, such as “which of these modifications has the biggest impact on accuracy?” I invite you to explore them.

[1] A. Krizhevsky, I. Sutskever, G.E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS, 2012.
[2] K. Chatfield, K. Simonyan, A. Vedaldi, A. Zisserman. Return of the Devil in the Details: Delving Deep into Convolutional Nets. BMVC, 2014.
[3] VGG_F prototxt: https://gist.github.com/ksimonyan/a32c9063ec8e1118221a
[4] VGG_F_extraLayer prototxt: http://www.forrestiandola.com/blog/wp-content/uploads/2015/07/vgg_f_extraLayer_trainval.prototxt

Accelerating AlexNet by Reducing Image Resolution

Running AlexNet in Caffe with 128×128 instead of 256×256 images, we observed a 5.4x speedup and a <2 percentage point drop in ImageNet accuracy:

| Input | Crop | Top-1 accuracy | Top-5 accuracy | Frame rate at test-time |
| --- | --- | --- | --- | --- |
| 256×256 | 227×227 | 57.1% | 80.2% | 624 fps |
| 128×128 | 99×99 | 55.9% (-1.2) | 78.7% (-1.5) | 3368 fps (5.4x speedup) |

Reducing the input data size reduces the amount of work that every convolutional layer needs to perform.
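
As a rough sketch of why, the snippet below counts multiply-accumulates in a single convolutional layer for 227×227 and 99×99 crops. It assumes the usual AlexNet conv1 settings (96 filters of size 11×11 over 3 input channels, stride 4, no padding). The function names are my own, and the count covers only conv1, not the whole network.

```python
def conv_output_size(input_size, kernel_size, stride, pad=0):
    """Spatial output size of a square convolution (floor division)."""
    return (input_size + 2 * pad - kernel_size) // stride + 1


def conv_macs(input_size, in_channels, num_filters, kernel_size, stride, pad=0):
    """Multiply-accumulates for one square conv layer applied to one image."""
    out = conv_output_size(input_size, kernel_size, stride, pad)
    return out * out * num_filters * kernel_size * kernel_size * in_channels


# Assumed AlexNet conv1 settings: 96 filters, 11x11 kernels, 3 channels, stride 4.
macs_227 = conv_macs(227, in_channels=3, num_filters=96, kernel_size=11, stride=4)
macs_99 = conv_macs(99, in_channels=3, num_filters=96, kernel_size=11, stride=4)

print(conv_output_size(227, 11, 4))  # 55
print(conv_output_size(99, 11, 4))   # 23
print(round(macs_227 / macs_99, 2))  # 5.72
```

Under these assumptions conv1 does about 5.7x less work on the smaller crop. That is in the same ballpark as the measured 5.4x end-to-end speedup, although the two numbers measure different things.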

Details:

  • In Caffe’s default AlexNet configuration, we train and test with 256×256 images, with randomized 227×227 crops for training and central 227×227 crops for testing.
  • In our 128×128 experiment, we train and test with 99×99 crops. (256-227=29, and 128-99=29.) 99×99 crops contain 5.25x fewer pixels than 227×227 crops. Other than this dimension change, our 128×128 experiments are identical to the default Caffe AlexNet configuration.
  • Speed tests were performed on an NVIDIA K40 with CUDA 6.5, and Caffe compiled with cuDNN version 1.
  • For 256×256 images, Alex Krizhevsky et al. reported slightly higher accuracy (~82% top-5) than we are achieving in Caffe. This may be related to data augmentation settings in training and/or testing.
  • Trained on ILSVRC2012-train, tested on ILSVRC2012-val.
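
The crop arithmetic in the details above can be checked with a few lines of Python. This is only a sanity check of the numbers quoted in the list; the helper names are made up for illustration.

```python
def crop_margin(image_size, crop_size):
    """Pixels left over per side length when cropping a square image."""
    return image_size - crop_size


def pixel_ratio(large_crop, small_crop):
    """How many times more pixels the larger square crop contains."""
    return (large_crop ** 2) / (small_crop ** 2)


# Both configurations leave the same 29-pixel margin for random cropping.
print(crop_margin(256, 227), crop_margin(128, 99))  # 29 29

# 227*227 = 51529 pixels versus 99*99 = 9801 pixels.
print(round(pixel_ratio(227, 99), 4))  # 5.2575
```

The ratio works out to roughly 5.2575, which is the "5.25x fewer pixels" figure quoted above.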

Scrambled Eggs

Making scrambled eggs is as easy as “scramble some eggs and cook them,” right? Well, sort of. There are a lot of subtle choices!

Martha Stewart endorses the “scramble some eggs and cook them” philosophy.

Dani Spies adds water when beating the eggs, and she cooks them in coconut oil.

Gordon Ramsay adds a few more flourishes, taking the eggs on and off the heat. This, combined with continuous stirring of the eggs, produces more of a custard than traditional scrambled eggs.

NCSA Eco-G Cluster Wins Award, Forrest Discusses it on Television

The latest Green500 rankings were released last week at Supercomputing 2010! My work on the NCSA EcoG cluster with Profs. William D. Gropp and Wen-Mei Hwu and research scientist Mike Showerman earned two Green500 awards.

In light of this, Prof. Wen-Mei Hwu and I were featured on yesterday’s Champaign-Urbana news:

Forrest Iandola - Green500 News Video

According to Green500, EcoG is the greenest self-built supercomputer in the world. When compared to self-built clusters and commercially-built machines from IBM, Cray, and others, the Green500 lists the EcoG cluster as the world’s 3rd greenest supercomputer. IBM’s NNSA/SC Blue Gene/Q Prototype tops the list with 1684.20 MFLOPS/W. Second is Tokyo Institute of Technology’s commercially-built computer, which uses NVIDIA GPUs to produce 958.35 MFLOPS/W.

Producing 933.06 MFLOPS per Watt, EcoG was designed and constructed for a fraction of the cost of these machines. EcoG demonstrates the possibilities for constructing fast, efficient, world-class supercomputers without multi-million dollar grants or large industry budgets.

Mike Showerman (NCSA) and I received the awards at SC10:
Forrest Iandola and Mike Showerman Receiving Green500 Awards

The world’s greenest self-built supercomputer is receiving a decent amount of press coverage, including an article that NCSA’s Allison Copenbarger published earlier this month. The article has been syndicated by the University of Illinois College of Engineering, the Department of Computer Science, the Department of Electrical and Computer Engineering, and the Coordinated Science Laboratory. Even the Communications of the ACM newsletter published a version of the NCSA article.

Green500 Cluster Project Featured in NCSA Newsletter

Forrest Iandola - Green500 List

My work with Profs. William D. Gropp and Wen-Mei Hwu and research scientist Mike Showerman is featured in the National Center for Supercomputing Applications (NCSA) Newsletter!

In March 2010, a team of fifteen University of Illinois and NCSA computer scientists, including professors, graduate students, and undergrads, set out to push the limits of energy efficiency in cluster computing. Toward this goal, we collaborated with NVIDIA’s Sean Treichler to design a 128-node cluster that uses commodity hardware and NVIDIA graphics processing units (GPUs). Earlier this fall, we constructed and benchmarked the cluster for efficiency and performance.

The jury is still out as to where our new 128-node cluster ranks on the Green500, which is a list of the world’s most efficient clusters and supercomputers. The Green500 rankings are slated to be announced at the upcoming Supercomputing conference.

Forrest in Video for Incoming University of Illinois Students

During Fall 2009 and Spring 2010, I led a course on Entrepreneurship and Innovation at the University of Illinois. The course was especially targeted toward first-year engineering students interested in technical startups and social entrepreneurship. In addition to Entrepreneurship and Innovation, the iFoundry program offers a plethora of courses and opportunities for engineering students.

Enrollment in iFoundry courses quadrupled between Fall 2009 and Fall 2010. Along with star students Kevin Wolz, Claire Slupski, Aman Kapur, Jenny Roderick, and Jaime Kelleher, and course staff Lisa Mazzocco and Britany Patterson, I offer a few highlights about the iFoundry program and our accomplishments.

Incoming University of Illinois engineering students watched this video on the first day of Fall 2010 iFoundry classes.

Best Computer Science Interview Question Ever

Forrest Iandola Reenacting the Best Computer Science Interview Question Ever

While perusing the blog of Chris Sells of Microsoft, I came across the best interview question I’ve ever seen:

I walked into my first technical interview at Microsoft, and before I could say anything, the woman says, “You’re in an 8×8 stone corridor.” I blink and sit down.

Interviewer: The prince of darkness appears before you.

Me: You mean, like, the devil?

Interviewer: Any prince of darkness will do.

Me: Ok.

Interviewer: What do you do?

Me: Can I run?

Interviewer: Do you want to run?

Me: Hmm... I guess not. Do I have a weapon?

Interviewer: What kind of weapon do you want?

Me: Um... something with range?

Interviewer: Like what?

Me: Uh... a crossbow?

Interviewer: What kind of ammo do you have?

Me: Ice arrows?

Interviewer: Why?

Me: Because the prince of darkness is a creature made of fire???

Interviewer: Fine... so what do you do next?

Me: I shoot him?

Interviewer: No... what do you do?

Me:

Interviewer: You WASTE him! You *WASTE* the prince of darkness!!

Me: Holy crap, what have I gotten myself into?

What’s the funniest interview question you have been asked? Reply in the comments!

Highlights From Visiting Google

Forrest Iandola with University of Illinois Computer Science Students at Google in Chicago

This week, with a group of University of Illinois computer science students, I visited Google’s Chicago office. The Chicago location was founded primarily to house Chicago-based engineering superstars Brian “Fitz” Fitzpatrick and Ben Collins-Sussman, both of whom joined us at Illinois for the ACM Reflections|Projections Conference last fall.

Today, the University of Illinois group was greeted by Jessie Chavez, an early-stage partner in the startup FeedBurner, which was acquired by Google in 2007. Prior to joining FeedBurner, Chavez worked as a software engineer with a major financial institution in Chicago. Of his transition from finance to FeedBurner, Chavez says, “My pay went down by 20%, but I was so much happier with my work.” When FeedBurner became part of Google, Chavez also joined Google, where, he says, “salary was no longer much of an issue.”

In addition to discussing his Google experience and offering advice for seeking internships, Chavez suggested reading Getting That Job at Google by Steve Yegge. Yegge does a fantastic job of outlining key concepts for Google technical interviews, with a double dose of humor thrown in as well. At the top of Yegge’s list: “Study a data-structures and algorithms book. Many interviewers are happy when you understand the broad class of question they’re asking without explanation. For instance, if they ask you about coloring U.S. states in different colors, you get major bonus points if you recognize it as a graph-coloring problem, even if you don’t actually remember exactly how graph-coloring works.”

While understanding of core computer science concepts is paramount to conquering the interview, programming syntax should not be entirely disregarded. According to Yegge, “some interviewers are really picky about syntax, and some will even silently mark you down for missing a semicolon or a curly brace, without telling you. I think of these interviewers as – well, it’s a technical term that rhymes with ‘bass soles.’”

Before our visit concluded, University of Illinois students enjoyed a video conference with Jeff Moore, who manages engineering talent identification for several offices in the Midwest and eastern United States. Moore builds on Yegge’s assessment of the interview process, saying that interviews explore how “smart people apply coding knowledge, passion, interest, and raw horsepower to real problems.”

University of Illinois Newspaper Publishes Faculty Salaries

Earlier this week, the University of Illinois Daily Illini published a database of salaries for all University of Illinois faculty and staff. Top salaries included football coach Ronald Zook at $1,052,500.10 and Engineering Dean and Donald Biggar Willett Professor of Engineering Ilesanmi Adesida at $309,466. The range of salaries also included Special Education Clerical Assistant Vikas K. Singh at $7,903.48 and Visiting Lecturer of Dance Denis Chiaramont at $6,468.84.

Although the Daily Illini article offered little text or opinion content, it proved quite inflammatory. One online commenter posted, “Understanding to the penny how much each and every person in your own office makes is a demoralizing and humiliating experience. It creates jealousy and hostility in an already trying work environment, and can sour even the best of friendships.” However, the Daily Illini did little more than neatly organize publicly available data into a searchable database. In fact, University of Illinois salaries have been publicly available online since long before the Daily Illini created the database.

The Daily Illini salary database is just one of many instances where organizing publicly shared data sparks significant public attention. Last year, Caltech computer science graduate student Virgil Griffith looked up the most popular music on Facebook profiles of students at each United States university. Then, Griffith compared the 133 most popular music artists on his list with the average incoming SAT scores of the universities where the music was popular. He presented the results in a graph called Music That Makes You Dumb.

The graph suggests that students who like Beethoven are likely to have scored above 1350 on the SAT, and students who listen to Lil’ Wayne may have scored less than 900. While Griffith did little more than organize freely available data, Music That Makes You Dumb garnered attention from the Wall Street Journal, The Washington Post, and The New Yorker.

Beyond salaries, SAT scores, and music preferences, there are countless public online resources that could be organized and published as databases or graphs for increased public attention. For instance, driving records for Champaign County, IL are available for free in an online database. While the database is a bit crude and inconvenient to use, it is still easy to find driving records for any of your friends in Champaign County.

On November 23, 2009, University of Illinois accounting professor Ira Solomon was issued a $75 ticket for driving his 2000 Volvo at 11-14 mph above the speed limit at the corner of Lincoln and Stoughton in Urbana, IL, just a block from where I live. Luckily, given his Daily Illini-reported salary of $294,230.16, I doubt Prof. Solomon was particularly distraught over his $75 ticket.