Kislay Keshari
Kurt is a Big Data and Data Science Expert, working as a Research Analyst at Edureka. He is keen to work with Machine Learning,...
Getting hired by a globally renowned company like Google is a dream job for a lot of people. Google has some of the most talented AI Research Scientists, Data Engineers, and Data Scientists in the world. There are not many sources for Google Data Science Interview Questions online, and it is not easy to get a job there. Enroll for the Data Science with Python course by Edureka to elevate your career.
So, I’ll be covering the following topics in this article:
A Google Data Scientist earns an average salary of $169,067, including bonus, and salaries range from $120,000 to $280,000. With a salary this high, you need to know the requirements for the job you are applying for. Although the requirements vary from position to position, below are some of the common ones:
Master’s Degree in Quantitative Discipline (Statistics, Operations Research, Computer Science)
2 years of work experience in Data Analysis related field
Experience with statistical software (e.g., R, Python, MATLAB, Pandas) and
Work with large, complex data sets. Solve difficult, non-routine analysis problems, applying advanced analytical methods as needed
Conduct analysis that includes data gathering and requirements specification, processing, analysis, ongoing deliverables, and presentations
Build and prototype analysis pipelines iteratively to provide insights at scale
Develop comprehensive knowledge of Google data structures and metrics, advocating for changes where needed for product development
Interact cross-functionally, making business recommendations (e.g., cost-benefit, forecasting, experiment analysis)
Research and develop analysis, forecasting, and optimization methods to improve the quality of Google’s user-facing products
Google Data Science Interview Process
Clearing the shortlist is itself a tough task and depends entirely on your CV, cover letter, and experience. Google Data Science Interview Questions are a mixture of brainteasers and technical queries. Usually, the first round is a telephonic interview.
It consists of Questions mostly based on Probability (concrete and theoretical) and heavily based on Machine Learning. The questions also vary based on the projects you have worked on.
Case 1: The interviewers asked about feature extraction techniques, PCA (used in the candidate's projects), correlation analysis, and some of the classification techniques that were used (SVM, GBM, neural net). Why not logistic regression? Why GBM? Basically, the questions revolved around class separability.
Case 2: Why use feature selection? If two predictors are highly correlated, what is the effect on the coefficients in the logistic regression? What are the confidence intervals of the coefficients?
Case 3: A disc is spinning on a spindle and you don’t know which way it is spinning. You are provided with a set of pins. How will you use the pins to describe which way the disc is spinning?
After the telephonic interviews come the face-to-face and coding rounds. So, let’s discuss some of the most common Google Data Science Interview Questions. Although these questions may not be asked exactly as given below, I’ve tried to cover a lot of them.
Google Data Science Interview Questions
These questions are not puzzlers, as Google has stopped asking those. Instead, they ask similar questions which they call Problem-Solving Questions. A lot of Machine Learning questions are asked, all the way from generic to practical ones. Google Data Science Interview Questions basically cover breadth of topics rather than depth.
Q1. You are at a casino and have two dice to play with. You win $10 every time you roll a 5. If you play till you win and then stop, what is the expected payout?
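The question leaves two things open: whether "a 5" means the two dice sum to 5 or that either die shows a 5, and whether a roll costs anything. Here is a minimal sketch assuming a sum of 5 and free rolls. Under those assumptions you stop after exactly one win, so the payout is always $10 and the interesting quantity is how many rolls it takes.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two dice.
rolls = list(product(range(1, 7), repeat=2))

# Assumption: "roll a 5" means the two dice sum to 5.
p_win = Fraction(sum(1 for a, b in rolls if a + b == 5), len(rolls))

# The number of rolls until the first win is geometric, with mean 1 / p.
expected_rolls = 1 / p_win

# Assumption: rolls are free, so stopping at the first win pays exactly $10.
expected_payout = 10

print(p_win, expected_rolls, expected_payout)  # 1/9 9 10
```

If each roll had a cost (not stated in the question), the expected net payout would be 10 minus 9 times that cost.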
Q2. You are about to get on a plane to London and want to know whether you have to bring an umbrella or not. You call three of your random friends and ask each one of them if it’s raining. The probability that your friend is telling the truth is 2/3 and the probability that they are playing a prank on you by lying is 1/3. If all 3 of them tell you that it is raining, then what is the probability that it is actually raining in London?
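This is a Bayes' theorem question, and the answer depends on a prior probability of rain that the question does not give. The sketch below assumes the friends answer independently and takes the prior as a parameter; the function name and the example priors are mine, for illustration only.

```python
from fractions import Fraction

def posterior_rain(prior, n_friends=3, p_truth=Fraction(2, 3)):
    """P(rain | all friends say 'rain'), assuming independent friends."""
    p_lie = 1 - p_truth
    # If it is raining, every friend must be telling the truth.
    all_say_rain_and_rain = prior * p_truth ** n_friends
    # If it is dry, every friend must be lying.
    all_say_rain_and_dry = (1 - prior) * p_lie ** n_friends
    return all_say_rain_and_rain / (all_say_rain_and_rain + all_say_rain_and_dry)

print(posterior_rain(Fraction(1, 2)))  # 8/9
print(posterior_rain(Fraction(1, 4)))  # 8/11
```

With a 50% prior the three matching answers push the probability of rain to 8/9. A good interview answer states the prior explicitly instead of assuming one silently.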
Q3. How would you add new Facebook members to the database of members, and code their relationships to others in the database?
Q4. How will you test whether a user who has more friends now has an increased probability of staying active after 6 months?
Q5. You are given 40 cards in four different colors: 10 green cards, 10 red cards, 10 blue cards, and 10 yellow cards. The cards of each color are numbered from one to ten. Two cards are picked at random. Find the probability that the cards picked are not of the same number and same color.
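One way to check an answer here is to enumerate every pair of cards. The sketch below assumes the two cards are drawn without replacement and reads the question as asking for pairs that differ in both color and number.

```python
from fractions import Fraction
from itertools import combinations

colors = ["green", "red", "blue", "yellow"]
deck = [(color, number) for color in colors for number in range(1, 11)]

# Every unordered pair of distinct cards is equally likely.
pairs = list(combinations(deck, 2))

# Keep only the pairs whose two cards differ in color AND in number.
different = [(a, b) for a, b in pairs if a[0] != b[0] and a[1] != b[1]]

probability = Fraction(len(different), len(pairs))
print(len(different), len(pairs), probability)  # 540 780 9/13
```

The same result follows by hand: after the first card, 27 of the remaining 39 cards have a different color and a different number, and 27/39 = 9/13.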
Q6. Create a program in a language of your choice to read a text file with various tweets. The output should be 2 text files: one that contains the list of all unique words among all tweets along with the count for repeated words, and a second file that contains the median number of unique words for all tweets.
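A minimal Python sketch of this program follows. It assumes the input file holds one tweet per line, that a word is a lowercase run of letters, digits, or apostrophes, and that the second file should hold the median of the per-tweet unique-word counts. The function names and file paths are hypothetical.

```python
import re
from collections import Counter
from statistics import median

def analyze_tweets(tweets):
    """Return (word counts across all tweets, median unique words per tweet)."""
    word_counts = Counter()
    unique_per_tweet = []
    for tweet in tweets:
        # Assumption: a word is a run of letters, digits, or apostrophes.
        words = re.findall(r"[a-z0-9']+", tweet.lower())
        word_counts.update(words)
        unique_per_tweet.append(len(set(words)))
    # median() raises StatisticsError if there are no tweets at all.
    return word_counts, median(unique_per_tweet)

def process_file(in_path, counts_path, median_path):
    # Assumption: one tweet per line; blank lines are skipped.
    with open(in_path, encoding="utf-8") as f:
        tweets = [line.strip() for line in f if line.strip()]
    counts, med = analyze_tweets(tweets)
    with open(counts_path, "w", encoding="utf-8") as f:
        for word, count in sorted(counts.items()):
            f.write(f"{word}\t{count}\n")
    with open(median_path, "w", encoding="utf-8") as f:
        f.write(f"{med}\n")
```

Calling `process_file("tweets.txt", "word_counts.txt", "median.txt")` would write both output files. For very large inputs, a running median over a stream would be worth discussing with the interviewer.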
Q7. What will you do if removing missing values from a dataset causes bias?
Q8. A disc is spinning on a spindle and you don’t know which way it is spinning. You are provided with a set of pins. How will you use the pins to describe which way the disc is spinning?
Q9. How will you design a recommendation engine for jobs?
Q10. What kind of product do you want to build at Google?
Q11. Cars are fitted with speed trackers so that insurance companies can track our driving behavior. Based on this new scheme, what kind of business questions can be answered?
Q12. How can you decide if one algorithm is better than the other?
Q13. A box has 12 red cards and 12 black cards. Another box has 24 red cards and 24 black cards. You want to draw two cards at random from one of the two boxes, which box has a higher probability of getting cards of the same color and why?
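The two probabilities can be computed directly with combinations. This sketch assumes the two cards are drawn without replacement from a single box; the function name is mine.

```python
from fractions import Fraction
from math import comb

def same_color_probability(red, black):
    """P(two cards drawn without replacement share a color)."""
    same = comb(red, 2) + comb(black, 2)
    total = comb(red + black, 2)
    return Fraction(same, total)

small_box = same_color_probability(12, 12)
large_box = same_color_probability(24, 24)
print(small_box, large_box)   # 11/23 23/47
print(large_box > small_box)  # True
```

The larger box wins, at roughly 0.489 against 0.478. With n cards of each color the probability is (n - 1) / (2n - 1), which climbs toward 1/2 as n grows, because removing the first card matters less in a bigger box.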
Q14. What is the difference between a bagged model and a boosted model?
Q15. You are creating a report for user content uploads every month and observe a sudden increase in the number of uploads for the month of January. The increase is particularly in image uploads. What do you think is the cause of this, and how will you test this sudden spike?
Q16. You own a clothing enterprise and want to improve your place in the market. How will you do it from the ground level?
Q17. How will you decide which of the two Surge Pricing Algorithms is working better for an Aviation Company?
Q18. What are the degrees of freedom for lasso?
Q19. What is the difference between an iterator, generator and list comprehension in Python?
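A short side-by-side sketch can help when answering this one. The class and function names below are illustrative only.

```python
# List comprehension: builds the whole list in memory immediately.
squares_list = [n * n for n in range(5)]

# Generator function: produces values lazily, one per next() call.
def squares_gen(limit):
    for n in range(limit):
        yield n * n

# Iterator: any object that implements __iter__ and __next__.
class Squares:
    def __init__(self, limit):
        self.limit = limit
        self.n = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.n >= self.limit:
            raise StopIteration
        value = self.n * self.n
        self.n += 1
        return value

print(squares_list)          # [0, 1, 4, 9, 16]
print(list(squares_gen(5)))  # [0, 1, 4, 9, 16]
print(list(Squares(5)))      # [0, 1, 4, 9, 16]
```

All three yield the same values. The list holds every element at once and can be traversed repeatedly, while the generator and the hand-written iterator compute one value at a time and are exhausted after a single pass. A generator is itself an iterator that Python builds for you from a function containing `yield`.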
Q20. Given a set of webpages and changes on the website, how will you test the new website feature to determine if the change works positively?
Q21. Given an MxN dimension matrix with each cell containing an alphabet, find if a string is contained in it or not.
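The question does not say how letters may be connected. The sketch below assumes the string must be traced through horizontally or vertically adjacent cells, using each cell at most once, and solves it with a depth-first search plus backtracking. Clarify the adjacency rule with the interviewer before coding.

```python
def contains_word(grid, word):
    """Return True if `word` can be traced through adjacent cells of `grid`."""
    if not word:
        return True
    if not grid or not grid[0]:
        return False
    rows, cols = len(grid), len(grid[0])
    visited = set()  # cells on the path currently being explored

    def search(r, c, i):
        if grid[r][c] != word[i]:
            return False
        if i == len(word) - 1:
            return True
        visited.add((r, c))
        # Assumption: only up, down, left, and right moves are allowed.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in visited:
                if search(nr, nc, i + 1):
                    visited.discard((r, c))
                    return True
        visited.discard((r, c))  # backtrack
        return False

    return any(search(r, c, 0) for r in range(rows) for c in range(cols))

grid = ["ABCE", "SFCS", "ADEE"]
print(contains_word(grid, "ABCCED"))  # True
print(contains_word(grid, "ABCB"))    # False
```

The worst case is exponential in the length of the string, since each step can branch to up to three unvisited neighbours, which is usually acceptable for interview-sized grids.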
Q22. How will you build a caching system using an advanced data structure like hashmap?
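One common answer is a least-recently-used (LRU) cache. The question does not name an eviction policy, so LRU is an assumption here. The sketch uses Python's `collections.OrderedDict`, a hash map that also remembers insertion order, so lookups, updates, and evictions all take constant time.

```python
from collections import OrderedDict

class LRUCache:
    """A small LRU cache: evicts the least recently used key when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now the most recently used key
cache.put("c", 3)      # evicts "b"
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```

In an interview you may be asked to build the same thing from a plain hash map plus a doubly linked list, which is what `OrderedDict` gives you out of the box.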
Q23. If you could get a dataset on any topic of interest, irrespective of the collection methods or resources, what would the dataset look like and what would you do with it?
Q24. What are anomaly detection methods?
Q25. How does caching work and how do you use it in Data science?
So guys, with this we come to the end of this article. Google Data Science Interview Questions are mostly scenario based and require you to have problem-solving abilities; moreover, you need to know how to apply Data Science to these situations. I hope this will give you a perspective to be prepared for any Data Science interview in the future, be it at Google, Microsoft, Apple, or Uber. All the tech giants ask similar types of questions when it comes to Data Science, as it is a vast and at the same time a new field.
Data Science Masters Program makes you proficient in the tools and systems used by Data Science Professionals. It includes training on Statistics, Data Science, Python, Apache Spark & Scala, Tensorflow and Tableau. The curriculum has been determined by extensive research on 5000+ job descriptions across the globe. If you have any queries, feel free to mention in the comment section below.
Also, If you are looking for online structured training in Data Science, edureka! has a specially curated Data Science Course that helps you gain expertise in Statistics, Data Wrangling, Exploratory Data Analysis, and Machine Learning Algorithms like K-Means Clustering, Decision Trees, Random Forest, and Naive Bayes. You’ll also learn the concepts of Time Series, Text Mining, and an introduction to Deep Learning. New batches for this course are starting soon!!