During interviews, data scientists often struggle to answer a basic question about their past projects: why did they choose the algorithm they used?
Often the answer is "I tried a whole bunch and this one worked best." :/
A good data scientist should be able to reason about how the structure of the data and the business problem fit different machine learning approaches! For example:
- Does the data size allow for a complex algorithm?
- Which technique is best suited to the intended mode of deployment?
- What can be extended, replaced, or maintained easily as the project matures?
- Is the method supposed to be interpretable?
- What can be tested more quickly in a POC?
Of course, trial and error is always part of developing models, but data scientists should strive to build the expertise to answer questions like these and use that knowledge to narrow down the list of ML algorithms to try.
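To make the "narrow down the list" idea concrete, here is a minimal Python sketch of filtering candidate algorithms by a few project constraints before any experiments run. Everything in it is a hypothetical illustration: the `Candidate` class, the `shortlist` function, and especially the `min_rows` thresholds are placeholder assumptions, not recommendations. Real judgment about data size, interpretability, and prototyping speed depends on the problem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical description of one algorithm family."""
    name: str
    min_rows: int             # illustrative placeholder, not a real guideline
    interpretable: bool       # can the model's decisions be explained easily?
    quick_to_prototype: bool  # can it be tested quickly in a POC?


# Assumed, simplified ratings for illustration only.
CANDIDATES = [
    Candidate("logistic regression", 100, True, True),
    Candidate("gradient-boosted trees", 1_000, False, True),
    Candidate("deep neural network", 100_000, False, False),
]


def shortlist(n_rows: int, need_interpretable: bool, is_poc: bool) -> list[str]:
    """Return the names of candidates that satisfy every stated constraint."""
    return [
        c.name
        for c in CANDIDATES
        if n_rows >= c.min_rows
        and (c.interpretable or not need_interpretable)
        and (c.quick_to_prototype or not is_poc)
    ]


# A small dataset, an interpretability requirement, and a POC deadline
# leave a single candidate to try first.
print(shortlist(5_000, need_interpretable=True, is_poc=True))
# → ['logistic regression']
```

The point is not the table itself but the habit: writing the constraints down first gives a reasoned answer to "why this algorithm?" that trial and error alone does not.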