How To Get A Job In Data Science

Data Science Prep, the best interview prep for data science jobs.

Data scientists working in analytics typically work closely with product teams to drive a product forward. The role calls for several fundamental skills, and these are tested in interviews across a wide range of companies. This post gives an overview of tips for the overall interview process.

General tips

It is essential to understand the roles and responsibilities of the specific position, as well as the product(s) you might be working on, because interviewers will likely ask questions based on projects they have done or are currently working on. To prepare, ask yourself the following types of questions:

  • What kind of business is the company, and what stage is it at? Depending on the answers, different metrics, and consequently different ways of analyzing data, will matter. The book “Lean Analytics” by Alistair Croll and Benjamin Yoskovitz is a great resource for this: http://leananalyticsbook.com/
  • How does the product fit into the broader ecosystem of the company’s suite of products? What are some things I like and dislike about the product? What new features should the product have? Questions like these also come up in product management interviews, since that job is mostly about thinking through product questions. Some helpful resources here are “Cracking the PM Interview”: http://www.crackingthepminterview.com/ and Daily Product Prep: https://dailyproductprep.com/
  • Which key performance indicators (KPIs), i.e. essential business metrics, would I measure if I were working on this product? What factors and variables influence those metrics, and how? Metrics are specific to each company, but there is a lot of overlap based on the business model and how the company generates revenue. YC has a great video covering metrics based on the business model: https://www.startupschool.org/videos/66
  • How could or does the product monetize? Here the key is thinking of various levers in products and processes that drive the core business. Keith Rabois has used the term “business equations”, which is a helpful concept. An example of his usage is: http://growth.eladgil.com/book/chapter-8-financing-and-valuation/going-public-why-do-an-ipo-an-interview-with-keith-rabois-part-2/

Doing your research before interviews signals that you are interested in the role and the product. It also helps you anticipate the types of questions you might get: at a company like Facebook, the conversation might focus on users, social graphs, and content, whereas at a company like Uber, it might revolve around riders, ride-share algorithms, and pricing. Lastly, do not overlook the basics: polish your resume, build a portfolio of projects, and so on.

Product Sense

This type of interview question assesses how you think about measuring and improving a product, and how you reason about product decisions and the data behind them. Questions can be open-ended (the more common kind, asking which metrics to track and how to optimize them) or more detailed and technical (asking what kind of data you would track and how that data might influence metrics).

Sample questions:

  • Let’s say that you are the first person working on Facebook Messenger. What KPIs would you track? How would you measure and improve those KPIs?
  • Say you’re at YouTube and have two new ideas to implement to drive revenue. How would you decide which to implement and why?

Probability and Statistics

These core topics are tested to ensure numerical literacy, which is crucial to success in a data science role. The probability questions are often limited to what you might find in a Statistics 101 course: basic distributions, conditional probabilities, and concepts like the Central Limit Theorem and the Law of Large Numbers. On the statistics side, hypothesis testing, statistical significance, and other aspects of statistical inference are commonly asked about, since A/B testing is a core part of experimentation at many companies. There are many textbooks on these topics (A First Course in Probability by Sheldon Ross, for example) as well as online courses covering this content.
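To make the link between hypothesis testing and A/B testing concrete, here is a minimal sketch of a two-proportion z-test, one common way to compare conversion rates between a control group and a variant. The function name, its parameters, and the sample counts are all made up for illustration, and the sketch assumes large, independent samples so that the normal approximation is reasonable.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_a / n_a: conversions and users in the control group (hypothetical inputs).
    conv_b / n_b: conversions and users in the variant group.
    Returns the z statistic and the two-sided p-value.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Under the null hypothesis both groups share one rate, so pool the data.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative numbers only: 10% vs. 12.5% conversion on 2,000 users each.
z, p_value = two_proportion_z_test(200, 2000, 250, 2000)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

In an interview, being able to explain each line (why the rates are pooled, what the p-value does and does not say) usually matters more than reciting the formula.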

The level of technical detail in these questions varies with the type of role. For example, if the role is on an experimentation team dealing with A/B testing and large-scale inference problems, chances are that the questions in this category will involve much more math than those for someone working on the pure product side. Such math would cover the technical details behind distributions, statistical tests, optimization, etc. On the flip side, many companies will also ask you to explain these technical concepts in layman's terms. This tests both your understanding and your ability to communicate complex topics, which matters because as a data scientist you will often interact with non-technical colleagues in other divisions.

Sample questions:

  • Roll two dice. Given that the max of the two was 4, what is the probability that the sum exceeds 6?
  • Explain what p-values and confidence intervals are to a layman.
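One way to check an answer to the dice question above is to enumerate all 36 equally likely outcomes and count. The sketch below assumes two fair six-sided dice; the helper name conditional_probability is made up for illustration.

```python
from fractions import Fraction
from itertools import product


def conditional_probability(event, given):
    """P(event | given) over all rolls of two fair six-sided dice."""
    rolls = list(product(range(1, 7), repeat=2))
    # Restrict the sample space to rolls that satisfy the condition.
    conditioned = [roll for roll in rolls if given(roll)]
    favorable = [roll for roll in conditioned if event(roll)]
    return Fraction(len(favorable), len(conditioned))


answer = conditional_probability(
    event=lambda roll: sum(roll) > 6,
    given=lambda roll: max(roll) == 4,
)
print(answer)  # 3/7
```

Seven rolls have a maximum of exactly 4, and three of them, (3, 4), (4, 3), and (4, 4), sum to more than 6, which gives 3/7.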

Coding and SQL

Although the main role of a data scientist differs from that of a software engineer, data scientists are expected to have decent proficiency in coding and computer science. The most popular languages are Python and R. SQL questions are also common in interviews, given that they are more straightforward and most data science roles involve querying large datasets in some capacity.

Common algorithms and data structures covered will involve arrays, streams, graphs, and matrices, since they are often used in real-life product applications. The same principles that apply to software engineering interviews apply here: start with a brute-force solution and optimize it step by step, with a focus on worst-case runtime and memory usage. LeetCode (https://leetcode.com/), Cracking the Coding Interview (http://www.crackingthecodinginterview.com/) and Daily Coding Problem (https://www.dailycodingproblem.com/) are all great resources for practicing coding problems.

As with coding problems, SQL questions often mirror the data a company actually collects on users, transactions, etc. For example, Uber and Lyft may ask when a user took their last ride, what the busiest times are, or other questions about the ride-share experience, whereas Facebook may ask about users and the friendship graph.
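As an illustration of the "last ride" style of question, here is a minimal sketch that runs a query against an in-memory SQLite database through Python's built-in sqlite3 module. The rides table, its columns, and the sample rows are hypothetical and do not reflect any real company's schema.

```python
import sqlite3

# Hypothetical schema: one row per ride, with ISO-8601 timestamps stored as
# text so that string comparison matches chronological order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rides (user_id INTEGER, ride_time TEXT)")
conn.executemany(
    "INSERT INTO rides VALUES (?, ?)",
    [
        (1, "2024-01-03 08:15:00"),
        (1, "2024-01-09 18:40:00"),
        (2, "2024-01-05 12:00:00"),
    ],
)

# Most recent ride per user.
last_rides = conn.execute(
    """
    SELECT user_id, MAX(ride_time) AS last_ride_time
    FROM rides
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()
conn.close()

print(last_rides)
# [(1, '2024-01-09 18:40:00'), (2, '2024-01-05 12:00:00')]
```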

Sample questions:

  • Given a table with products sold over time: date, product_id, sales_amount, calculate the cumulative sum for each product over time in chronological order.
  • Given an array of integers, find the first missing positive integer. For example, if given [0, 2, 1] return 3, if given [-1, 4, 1, 3] return 2.
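Possible approaches to the two sample questions above are sketched below. These are illustrations under stated assumptions, not the only accepted answers: the sales table layout follows the question, the function names are made up, the SQL uses a window function (which needs SQLite 3.25 or newer), and the array solution trades extra memory for simplicity.

```python
import sqlite3


def cumulative_sales(rows):
    """Running total of sales_amount per product, in chronological order.

    rows: iterable of (date, product_id, sales_amount) tuples, with dates
    as ISO-8601 strings (an assumption made for this sketch).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (date TEXT, product_id INTEGER, sales_amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT product_id,
               date,
               SUM(sales_amount) OVER (
                   PARTITION BY product_id
                   ORDER BY date
                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
               ) AS cumulative_amount
        FROM sales
        ORDER BY product_id, date
        """
    ).fetchall()
    conn.close()
    return result


def first_missing_positive(nums):
    """Smallest positive integer not present in nums.

    Uses a set for O(n) time and O(n) extra space.
    """
    seen = set(nums)
    candidate = 1
    while candidate in seen:
        candidate += 1
    return candidate


sample_rows = [
    ("2024-01-01", 1, 10.0),
    ("2024-01-02", 1, 15.0),
    ("2024-01-01", 2, 7.0),
    ("2024-01-03", 1, 5.0),
    ("2024-01-02", 2, 3.0),
]
for row in cumulative_sales(sample_rows):
    print(row)

print(first_missing_positive([0, 2, 1]))      # 3
print(first_missing_positive([-1, 4, 1, 3]))  # 2
```

A natural follow-up for the array question is to ask for constant extra space, which can be done by rearranging the array in place; the set-based version is a reasonable first step before optimizing.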

Conclusion

Hopefully, this article gave you a useful glimpse into the interview process for tech analytics roles in data science. If you are interested in more questions (and answers), make sure to subscribe!

Are you interviewing for data science jobs or are you trying to hone your data science skills? Check out our newsletter, Data Science Prep, to get data science interview questions straight to your inbox.
