The Disconnect Between Industry Deep Learning and University Courses
As a company that helps other companies build their in-house machine learning talent, we often get asked what it takes for a candidate to land a job at one of our customers. This post discusses the current state of machine learning education, specifically in the context of helping people succeed in industry engineering jobs.
There are currently two main avenues for structured learning: university classes and Massive Open Online Courses (MOOCs). A third option is to learn directly from open source documentation and online code tutorials. This route is often better suited to people who already have a background in machine learning and are more interested in learning a particular application, such as image recognition, or a specific deep learning package, such as TensorFlow, CNTK, or PyTorch.
The value of theory versus practice in university computer science classes has been hotly debated for a while. Many schools now offer "Software Engineering: Design & Implementation" classes that focus specifically on helping students build software engineering projects in a team environment. The classes teach industry-standard engineering practices during lecture, and the majority of the grade is determined by the quality of the projects completed. Tests on theory are trivial or non-existent. These types of classes are well received by students, and many sign up to gain experience for industry jobs or to build projects they can include on their resumes when recruiting. The trend is abundantly clear: a majority of computer science students want to become engineers and work in industry, not pursue research or theoretical work in graduate school or beyond.
However, despite schools offering many practical software engineering, UI/UX design, and even data analytics courses, most university machine learning courses remain theoretical, especially courses on deep learning topics. Students are expected to have strong math foundations and are often tested on proofs or concepts rather than on their ability to code models and apply them to real-world datasets. Even coding assignments tend to lose practical value by providing curated datasets, narrowing the scope of the problem to implementing certain functions, or failing to consider model tuning and performance. Considering that data preprocessing, feature engineering, and efficient model deployment make up the majority of a machine learning engineer's time, this is a painful oversight when it comes to preparing students for industry machine learning.
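To make the contrast with curated datasets concrete, here is a minimal sketch of the kind of cleanup and feature engineering that tends to precede any modeling in practice. The records, field names, and the `preprocess` helper are hypothetical and invented purely for illustration; real pipelines will differ in both tooling and scale.

```python
import statistics
from datetime import date

# Hypothetical raw records, roughly as they might arrive from a production export.
RAW_ROWS = [
    {"user_id": "1", "signup": "2021-01-05", "age": "34", "spend": "120.50"},
    {"user_id": "2", "signup": "2021-02-17", "age": "", "spend": "80"},
    {"user_id": "3", "signup": "not available", "age": "29", "spend": "n/a"},
    {"user_id": "4", "signup": "2021-03-01", "age": "41", "spend": "310.00"},
]


def parse_float(text):
    """Return a float, or None when the field is empty or malformed."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_date(text):
    """Return a date from an ISO string, or None when it cannot be parsed."""
    try:
        return date.fromisoformat(text)
    except ValueError:
        return None


def preprocess(rows, reference_day):
    """Clean raw rows and derive two simple features (illustrative only)."""
    parsed = []
    for row in rows:
        signup = parse_date(row["signup"])
        spend = parse_float(row["spend"])
        # Drop rows whose signup date or spend cannot be recovered.
        if signup is None or spend is None:
            continue
        parsed.append({
            "user_id": int(row["user_id"]),
            "signup": signup,
            "age": parse_float(row["age"]),
            "spend": spend,
        })

    # Impute missing ages with the median of the ages that are known.
    median_age = statistics.median(r["age"] for r in parsed if r["age"] is not None)

    features = []
    for r in parsed:
        tenure_days = (reference_day - r["signup"]).days
        features.append({
            "user_id": r["user_id"],
            "age": r["age"] if r["age"] is not None else median_age,
            "tenure_days": tenure_days,
            # Guard against division by zero for same-day signups.
            "spend_per_day": r["spend"] / max(tenure_days, 1),
        })
    return features


features = preprocess(RAW_ROWS, date(2021, 4, 1))
```

None of this involves a model, yet each choice here (which rows to drop, how to impute, which features to derive) affects downstream performance, and it is the kind of decision a curated assignment dataset makes on the student's behalf.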
Finally, the most obvious flaw in using university courses to prepare for industry machine learning is inaccessibility. University is expensive and classes take several months to complete. It is unrealistic for anyone other than an already-enrolled student to rely on university courses to arm them with the practical skills they need to land and thrive in a machine learning job.
Massive Open Online Courses (MOOCs)
MOOCs solve the university courses' accessibility problem by offering full classes to anyone with access to the internet. They are reasonably priced or sometimes even free. Classes consist of videos of professors giving lectures online and downloadable assignments that can be completed locally and submitted for grading. Some even have forums and teaching assistants for real-person help. However, at the end of the day, most deep learning MOOCs are still just university classes converted to an online delivery medium. The content remains mostly theoretical and it becomes even harder to effectively administer coding assignments that test the skills needed for industry.
It is undeniable that university courses and MOOCs teach significant machine learning content. The best courses provide a strong theoretical foundation for learners to conduct research and pursue graduate school.
At the heart of the problem is that traditional courses are delivered through lectures (whether in person or on video). Lecture-based classes teach theory and then assign code to supplement the lecture, so the assignments are unstructured and often unrelated to each other, which is the exact opposite of an industry engineering project. This disconnect between how courses are taught and the skills needed to become an industry engineer leads recruiters and hiring managers to weigh experience and personal projects even more heavily when evaluating a resume.
It's a well-established notion in industry recruiting that a machine learning accreditation, or even a relevant degree, does not directly translate into the ability to handle real-life engineering tasks. AdaptiLab's technical coding assessments evaluate candidates directly on the data preprocessing, data analysis, feature engineering, and model development challenges they'd face in their day-to-day jobs. The challenges have the added benefit of being delivered on synthetic datasets that AdaptiLab generates to be relevant to the company's domain, so candidates are also assessed on domain knowledge and skillset alignment.