Since data science was declared the sexiest job of the 21st century in 2012, a lot of people from all kinds of fields have moved into data science or related machine learning roles. Solving complex problems with fancy artificial intelligence algorithms for good pay sounds attractive. A lot of companies jumped on the hype train and now offer bootcamps that promise to teach data science/AI/ML in less than a year. Recently I was asked why it is so hard to get a job in this field and whether a bootcamp is enough to land a decent one. Here are 10 things to consider before joining such a bootcamp or heading for a career transition.
1. Job titles are not yet well defined and required skills vary a lot
After startups noticed that artificial intelligence is a powerful buzzword for getting funded, they started to rename existing job offers from data analyst or statistician to data scientist. The title sounds sexier, so the postings attract more applications. But if you read the postings, you notice that the roles behind them are completely different. Some want business analysts who answer questions with SAS or SPSS. Some want data engineers who build big data systems on Hadoop. Some want deep learning researchers who work with TensorFlow and neural networks. They might call all of them data scientists, but these roles are very different and require different skills. Choose one of these roles.
2. There is no shortage of juniors
As already mentioned, a lot of people want to become data magicians. Not only computer scientists, physicists and mathematicians, but also economists, psychologists and other scientists with a quantitative background. The problem is that most companies are not looking for fresh graduates, and some do not even know what they are looking for. Some expect to hire one data scientist who will solve all their problems. And because they do not really understand the requirements, they hire fresh graduates or bootcamp grads who have all the buzzwords on their CVs. Reportedly 85% of data initiatives fail, and this might be one of the reasons. Furthermore, according to TechRepublic, the demand for data scientists is already starting to shrink.
"There might be a skills shortage, but not an applicant shortage. It's not unusual for entry-level or internship openings in data science to receive hundreds of applicants. When employers talk about shortages, they're generally talking about a lack of experienced professionals." (Daniel Zhao, senior economist at Glassdoor)
3. It is hard without academic education
The idea of getting a data job without any academic education is daring. It may be possible if you are a genius or lucky, but in general you will hardly get invited to an interview. Artificial intelligence is about statistics and math, and these two are usually the hardest parts of a degree. People tend to avoid these topics if they can. Back in my computer science studies, fellow students avoided all the machine learning courses because it was hard to get a good grade, and chose something easier instead. You might not need all of it, but you will rarely be the only applicant, and you will compete with people who have PhDs. MOOCs and bootcamps cannot teach you the fundamentals in a few months; you need more time. Read the job postings and you will notice that a master's degree or even a PhD is usually listed as a plus, depending on the role. With that in mind, it is hard, but not impossible.
"88% have at least a Master's degree and 46% have PhDs." (KDnuggets)
4. Applied machine learning is exhausting
Kaggle challenges and university courses have one thing in common that does not hold in industry: a data set is available and already prepared. For learning exploration, preprocessing and modelling this makes perfect sense, but a huge part of the real work is getting to that point. Machine learning is rewarding when it delivers value, but it takes a lot of observation and experimentation until you get good results. If you are a perfectionist and your frustration tolerance is low, don't go for applied machine learning. It will drive you mad.
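To give a feeling for the work that happens before any modelling, here is a minimal sketch in plain Python. The raw records, the field names (age, income, churned) and the cleaning rules are all made up for illustration; real data is usually messier and the right rules depend on the domain.

```python
# Hypothetical raw export: field names and values are invented for illustration.
RAW_ROWS = [
    {"age": "34", "income": "52,000", "churned": "yes"},
    {"age": "", "income": "48000", "churned": "No"},       # missing age
    {"age": "n/a", "income": "61000", "churned": "YES"},   # unusable age
    {"age": "29", "income": "-1", "churned": "no"},        # implausible income
    {"age": "34", "income": "52,000", "churned": "yes"},   # exact duplicate
    {"age": " 41 ", "income": "75,500", "churned": "No "},
]


def parse_int(text):
    """Return an int, or None if the text is not a usable number."""
    try:
        return int(text.strip().replace(",", ""))
    except ValueError:
        return None


def clean_rows(rows):
    """Drop duplicates and rows with missing or implausible values."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # skip exact duplicates
        seen.add(key)

        age = parse_int(row["age"])
        income = parse_int(row["income"])
        label = row["churned"].strip().lower()

        if age is None or income is None or income < 0:
            continue  # assumed rule: negative income is a data error
        if label not in ("yes", "no"):
            continue
        cleaned.append({"age": age, "income": income, "churned": label == "yes"})
    return cleaned


usable = clean_rows(RAW_ROWS)
print(f"{len(usable)} of {len(RAW_ROWS)} rows are usable")
```

In this toy case only two of six rows survive, and every rule in clean_rows is a decision that somebody has to make and justify.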
5. Deep Learning is not used often in the industry
Neural networks made artificial intelligence popular in recent years, but they have several drawbacks. They are hard to design and train, they take a lot of time to tune, they are prone to overfitting and they are computationally very intensive. If you want to use neural networks, don't head for a career as a data scientist in industry. Very few companies use neural nets, because they involve too much magic and in many cases traditional methods are good enough. If you want to use deep learning, focus on academia and research.
6. Perception of AI is wrong
Artificial neural networks are inspired by brains, but they are very far away from them. I don't see any AI competing with humans. The perception of AI in public and in science is quite different. The problem is that it is hard to explain why AI can play Dota 2, make deep fakes or compose music and is still not smart. What seems to be forgotten is that AI is still pattern recognition, and it fails pretty fast when a pattern changes. It does not understand, it does not think and it does not dream. You will be asked why your AI system cannot do XYZ, and you will probably not be able to fix it. Now explain to your boss why AI can defeat world champions in Go but cannot learn to predict some easy XYZ thing.
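The point that pattern recognition breaks when the pattern changes can be illustrated with a toy sketch. The threshold "model", the sensor readings and the unit change are all invented for illustration; this is not how any of the systems mentioned above work, only the smallest example of the same failure mode.

```python
def fit_threshold(values, labels):
    """Learn a cut-off halfway between the two class means."""
    pos = [v for v, y in zip(values, labels) if y]
    neg = [v for v, y in zip(values, labels) if not y]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, values, labels):
    """Share of values classified correctly by 'value > threshold'."""
    hits = sum((v > threshold) == y for v, y in zip(values, labels))
    return hits / len(values)


# Hypothetical training data: temperatures in Celsius, True = "overheating".
train_values = [60, 65, 70, 90, 95, 100]
train_labels = [False, False, False, True, True, True]
threshold = fit_threshold(train_values, train_labels)

# Same machine, same states, but the sensor now reports Fahrenheit.
shifted_values = [c * 9 / 5 + 32 for c in train_values]

acc_same = accuracy(threshold, train_values, train_labels)
acc_shifted = accuracy(threshold, shifted_values, train_labels)
print(f"threshold={threshold}, same units: {acc_same}, shifted units: {acc_shifted}")
```

The learned rule is perfect on the data it has seen and no better than a coin flip after the change, although nothing about the machine itself changed. The model has no notion of what a temperature is.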
7. Lots of AI is actually not AI
Recently there was a study about European AI startups. It basically found that 40% of AI startups are not using AI at all. Some even just hired humans to fake AI. The reason is quite simple. AI systems require data, time and people to build them, which is expensive. Sometimes it's easier and cheaper to let humans do the work. Don't be that "labeling things" guy who is just there to prove that your startup has AI expertise. Be skeptical about data science job postings.
8. Lifelong learning as philosophy
Spark, TensorFlow, Keras, scikit-learn and pandas are tools that make your life easier. These tools change, they get replaced by better tools or they stay forever, who knows. But they are just tools. You should not focus too much on them; focus on techniques and problem solving. If you love Keras but PyTorch solves some problem better, learn PyTorch. You will notice that the ideas behind these tools are often very close and that they work similarly. The same goes for programming languages. Don't be that guy who uses C++ to prototype ML models because he was too proud to learn Python, a scripting language. Be open minded.
9. Domain matters
Machine learning is about data. Data is about the domain. Understanding the domain is necessary to understand the data. The idea that a data team can solve any problem with data and without domain expertise is dangerous and will not work. There are so many hints in the data that can only be understood if you know how the domain works and, furthermore, how the processes work. Not just the business view, but also the technical view. Playing around with techniques is not enough.
10. Critical thinking personality
Critical thinking is one of the most important skills. A lot of projects are successful only because someone questioned the current approach or objective. Is the target variable really what we want to predict? Do we really need machine learning here? Do we spend one more week to get 1% more out of it? Can we really trust that data? Is it a self-fulfilling prophecy? Asking these questions is quite hard, because often we don't like the answers, but it is simply necessary!
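The question "Do we really need machine learning here?" can often be checked with a trivial baseline before any model is celebrated. The sketch below assumes a binary target and uses made-up labels and predictions; majority_baseline_accuracy is a hypothetical helper name, not a function from any library.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def accuracy(predictions, labels):
    """Share of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Invented example: 10 customers, 8 of whom did not churn (label 0).
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
model_predictions = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

baseline = majority_baseline_accuracy(labels)
model = accuracy(model_predictions, labels)
gain = model - baseline
print(f"baseline={baseline}, model={model}, gain={gain}")
```

An accuracy of 80% sounds decent until you see that always answering "no churn" reaches the same number. Accuracy is only one possible metric here, but the habit of comparing against the dumbest possible alternative carries over to any of them.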
Disclaimer: You might have noticed that I am not a data scientist, so all of this is biased towards the machine learning engineer role, but I did a lot of research to find opinions from others. If you are really interested in machine learning and data science, I am the last one who wants to stop you, but don't believe the promises of consulting companies that offer bootcamps. Don't do it because it is hyped; remember that all hypes end at some point. Following the masses is never a good idea.