Data has become increasingly crucial for businesses around the world. With databases filling up with petabytes of data every week, companies need data scientists who can analyze it all for useful insights.
Data scientists are in high demand. It’s not an easy job, however, and it takes a certain mindset to succeed in the role. We spoke with several experts to find out what it takes to become a data scientist and what you can expect in a data scientist job interview.
What are the essential qualities of data scientists?
Kate Druckman, head of data science for a large fintech company, tells Dice that successful data scientists have a few key characteristics: “Analytical rigor and statistical skills, strong SQL/Python coding skills, and storytelling/communication.” (Druckman is the pseudonym of a data scientist who preferred not to use his real name.)
Adam Sugano, executive director of data analytics at the University of California, Los Angeles (UCLA), adds: “Data science is an ever-evolving field, with new tools and technologies being introduced every year that require workers in this field to constantly learn.”
Curiosity is an essential quality in data scientists, says Sugano: “Not only do they enjoy the learning process and absorb new knowledge, but they immediately turn around and start thinking about how this new tool, method, data domain, etc. can be applied to the range of problems they have been asked to solve.”
How can you show curiosity in your application materials? Sugano often looks for voluntary participation in data competitions or the pursuit of lifelong learning through platforms such as Datacamp. Showcasing your personal data projects or blogging about data science can also help demonstrate your passion for the field.
“Furthermore, a data scientist must know how to think about a problem,” Sugano continues. “I often see people on the business side asking data science teams questions that are necessary but not sufficient. The best data scientists don’t just take orders; they partner with the person asking, striving to understand their world so they can help frame both the problem and the question in a way that leads to the best results all around. This skill is almost impossible to detect just by reading a CV, but it can be identified through insightful questions in an interview process.”
What stands out on a data scientist CV?
Knowledge of statistical methodology is fundamental when it comes to preparing your application materials. “Too many people call themselves data scientists just because they’ve completed a four-course series on Coursera or a 12-week Python bootcamp,” Sugano says. “Don’t get me wrong, these are good starting points, but just because someone lists a Kaggle project on their resume where they used their favorite machine learning algorithm doesn’t mean they actually know what this algorithm is doing behind the scenes.”
In other words, it’s more than just calling a predictive modeling function in R or Python; you have to know why you are doing something, as well as how to interpret the results. Knowing the limitations of a tool or model is also essential. According to Sugano, “People trained in statistics can not only call the functions that run the algorithms, but they also know how to properly prepare the data for the model being used, how to tune the model for even better performance, and can answer direct questions about how the predictions were generated and/or what the predicted values mean.”
Jean Fordice, Analytics Lead at Bonsai, agrees: “The candidate must be able to express their passion for data science.”
Druckman singles out “candidates with multi-industry experience, interdisciplinary background (math, statistics, computer science), strong computer science background” as being of particular interest to many organizations.
What questions can you expect in a data scientist interview?
Druckman, along with Abhinav Unnam (Senior Data Scientist at Aviso AI) and Benn Stancil (Co-Founder and Chief Analytics Officer at Mode), suggests a few questions you should expect to be asked in a data scientist job interview:
- Python coding test, which usually draws on lists, dictionaries, etc.:
  - Find all combinations of strings in a specific URL made up of strings that meet specific requirements.
  - Scheduling algorithms: compute the total time covered by a series of overlapping time intervals (take the union of the intervals).
- Machine learning case interview:
  - Solve a problem statement from start to finish.
  - Define the problem statement, find the solution.
  - Explain it in simple terms, including the metrics: why these, and how would you measure them?
- How would you help our sales leadership team decide if the sales team is the right size?
- How would you measure the impact of a billboard?
- How would you help an Airbnb host decide the right number of photos to post on their profile?
- What is a p-value in simple terms?
- Type I and Type II errors: explain in simple words.
- How do you convert a wide data frame to a long data frame, and vice versa, in SQL and Python?
- What is XGBoost (XGB) and why is it effective?
- What is a random forest? How is feature importance calculated?
- What is Logistic Regression? How is maximum likelihood used?
- Code a logistic regression model from scratch using OOP.
- Describe to me a project that you led from its inception to its commercial impact, step by step.
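As an illustration of the overlapping-intervals question in the list above, here is one possible sketch in Python. It is not taken from any of the interviewers quoted here; the function names `merge_intervals` and `total_time` and the sample `meetings` data are made up for the example. The idea is to sort the intervals by start time, merge any that overlap or touch, and then add up the lengths of the merged intervals.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[float, float]  # (start, end), with start <= end


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Return the union of the intervals as sorted, non-overlapping intervals."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def total_time(intervals: Iterable[Interval]) -> float:
    """Total time covered by the union of the intervals."""
    return sum(end - start for start, end in merge_intervals(intervals))


# Hypothetical sample data: meeting times expressed as hours of the day.
meetings = [(9, 11), (10, 12), (14, 15), (15, 16), (13, 13.5)]
print(merge_intervals(meetings))  # [(9, 12), (13, 13.5), (14, 16)]
print(total_time(meetings))       # 5.5
```

Sorting dominates the cost, so this runs in O(n log n) time. In an interview, it is worth saying out loud whether intervals that merely touch, such as (14, 15) and (15, 16), should count as one block; this sketch assumes they do.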
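For the p-value and Type I error questions in the list above, a small worked example can help anchor a plain-language answer. The sketch below uses a made-up scenario (a coin that lands heads 8 times in 10 flips) and hypothetical helper names; it computes the probability of seeing a result at least that extreme if the coin were actually fair, which is what a one-sided p-value means.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def one_sided_p_value(heads: int, flips: int, p_null: float = 0.5) -> float:
    """P(at least `heads` heads in `flips` flips) if the null hypothesis holds."""
    return sum(binomial_pmf(k, flips, p_null) for k in range(heads, flips + 1))


# Hypothetical experiment: 8 heads out of 10 flips of a supposedly fair coin.
p_value = one_sided_p_value(heads=8, flips=10)
print(round(p_value, 4))  # 0.0547
```

In simple terms: if the coin were fair, a result this lopsided would still show up about 5.5% of the time. With the common 0.05 cutoff, you would not reject the "fair coin" hypothesis here. Rejecting it when the coin really is fair would be a Type I error; failing to reject it when the coin really is biased would be a Type II error.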
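The "logistic regression from scratch using OOP" question in the list above could be approached in many ways. The following is a minimal sketch under stated assumptions, not a reference answer: the class name, the default hyperparameters, and the toy data are all invented for illustration. It fits the model by batch gradient descent on the mean negative log-likelihood, which is one way of carrying out the maximum likelihood estimation mentioned in the preceding question.

```python
import math
from typing import List, Sequence


class LogisticRegression:
    """Binary logistic regression fit by batch gradient descent (illustrative)."""

    def __init__(self, learning_rate: float = 0.1, n_iterations: int = 2000) -> None:
        # Hyperparameter defaults are arbitrary choices for this sketch.
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations
        self.weights: List[float] = []
        self.bias = 0.0

    @staticmethod
    def _sigmoid(z: float) -> float:
        # Split by sign so math.exp never overflows for large |z|.
        if z >= 0:
            return 1.0 / (1.0 + math.exp(-z))
        ez = math.exp(z)
        return ez / (1.0 + ez)

    def predict_proba(self, X: Sequence[Sequence[float]]) -> List[float]:
        """Predicted probability of class 1 for each row of X."""
        return [
            self._sigmoid(sum(w * x for w, x in zip(self.weights, row)) + self.bias)
            for row in X
        ]

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "LogisticRegression":
        n_samples, n_features = len(X), len(X[0])
        self.weights = [0.0] * n_features
        self.bias = 0.0
        for _ in range(self.n_iterations):
            probs = self.predict_proba(X)
            errors = [p - t for p, t in zip(probs, y)]
            # Gradient of the mean negative log-likelihood w.r.t. weights and bias.
            grad_w = [
                sum(e * row[j] for e, row in zip(errors, X)) / n_samples
                for j in range(n_features)
            ]
            grad_b = sum(errors) / n_samples
            self.weights = [
                w - self.learning_rate * g for w, g in zip(self.weights, grad_w)
            ]
            self.bias -= self.learning_rate * grad_b
        return self

    def predict(self, X: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
        return [int(p >= threshold) for p in self.predict_proba(X)]


# Toy, made-up data: one feature, label flips from 0 to 1 around x = 2.5.
X = [[0.5], [1.0], [1.5], [2.0], [3.0], [3.5], [4.0], [4.5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
model = LogisticRegression().fit(X, y)
print(model.predict([[1.0], [4.0]]))
```

In an interview, being able to explain why the gradient reduces to "predicted probability minus label" times the feature value is likely to matter more than the code itself.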
Interviews can be particularly difficult with some hiring managers and data scientists, especially if the job itself is ultra-specialized. “All of my questions are tailored to the individual through a combination of the specific nature and needs of the position and the specific skills and experiences a candidate lists on their resume,” Sugano says. “Additionally, I find it beneficial to give candidates take-home assignments with real data that they can manipulate and analyze.”
This type of process, he adds, “better reflects the real world where workers have Google Search, Stack Overflow, etc. at their disposal, instead of expecting them to know the answer to a limited set of programming, statistics or probability questions (if there are 100 bulbs in a row and…).”
Communicating your results is also extremely important. When you sit down with the recruiter and hiring manager, be prepared to explain the logic behind solving problems a certain way. A big part of a data scientist’s job is to present data for analysis by multiple stakeholders, including executives.
Are there any online tools data scientists can use to prepare for an interview?
“Yes and no,” Stancil said. “There are many tools for sample technical questions and many online tutorials for learning technical languages. These tools are useful, and for many interviews, I think they help.”
But for highly specialized data scientist roles, these platforms may prove less useful. “The best preparation is trying to solve a problem with data,” Stancil adds. “It doesn’t have to be a big deal, but being able to talk about those experiences, the problems you’ve had and how you’ve tried to solve them is far more helpful and impressive to me than someone who can rattle off a list of predictive models with which they are familiar.”
Druckman encourages data scientists to work on “HackerRank, Leetcode, Interview.io, AlgoExpert” and the seemingly endless YouTube channels available.
Fordice adds: “Interviewkickstart.com is a great resource with a six-week course for data scientists.”
Sugano notes that if you really want to ace the interview, researching your potential employer can yield great results: “Data scientists should do their research from the perspective of really trying to understand a company’s business model and anticipate how that business already is, or should be, leveraging data to improve business decisions. Asking questions about all of a company’s data assets and how they are leveraged today, as well as coming up with potential new applications for their use, is a way for a data scientist to stand out by showing strong business interest and business insight.”