This document describes the technical interview process and the competencies that Pluralis requires of its technical staff, which the interviews aim to assess.
Because the first hires set the standard for future hiring, we will conduct four technical interviews, each one hour long.
Without exception, every interview should go low level and make the candidate “do” something (i.e. solve, derive or code) rather than only talk their way through the interview. For coding this is obviously easier to achieve; examples are given in the sections where it is less clear.
There is no section in any interview where we discuss the candidate's research. I personally always find this a waste of time because it is rare that we have the same level of knowledge the candidate has on the topic and it is hard to properly judge.
Note on resources during the interview: language models may not be used as helpers. For the coding portions the candidate should share their screen and may use Google and Stack Overflow. For the theoretical portions the candidate may not use Google or any other resource.
The competencies we aim to assess are split into practical and theoretical.
Practical:
- Model parallelism and hands-on deep learning (Gil)
- Coding, data structures and algorithms (Yan)
- For coding, we will give the interviewee an SSH key before the interview so that they can connect to an instance. They can choose whatever IDE they are comfortable with and attach it to the instance to solve the problems we provide. The purpose is to see how they work in their comfort zone, how they debug, and whether they get the code actually working and solving the problem.
Theory:
- Math - Linear algebra, Probability, ML/DL Theory (Sameera)
- Optimisation, ML/DL Theory (AJ)
- The theoretical portion should be 50% conversational and 50% hands-on solving. The expectation is that the candidate whiteboards and solves a problem, or discusses and derives equations. Because we interview remotely, we need to choose a shared notepad that both we and the candidate can type in, such as https://ankurm.com/notepad.app/. Overlapping topics are OK in this portion: ML/DL theory appears in both theory interviews, and it is fine to get extra validation signals on it.
-
Coding, data structures and algorithms. (Yan)
- The level of coding we’ve encountered so far requires a good understanding of object-oriented programming in Python (inheritance, abstractions etc.) and of parallel programming paradigms such as multiprocessing, multithreading and asynchronous procedures.
- Questions should be framed so that the fundamentals of data structures and algorithms can also be assessed. For example, structure questions so that time and space complexity can be discussed.
- Examples of questions: 1) Ask the candidate to code a distributed dataloader, where a dataloader sits on a “server” and serves clients that invoke a get_item function. Different versions of this question can test different parallel techniques. Try to pick a known design pattern and build a problem around it. 2) Give the candidate a complex existing codebase (hivemind, for example), introduce some bugs and ask the candidate to get the code working.
- Scoring of this competency should take into account: 1) Knowledge of the Python language and proper use of OOP. 2) Knowledge of data structures and algorithms. 3) How basic Python components (i.e. dicts, queues, lists) relate to formal data structures and whether they are used correctly in the code. 4) Knowledge of parallel programming paradigms. 5) Working code. 6) How well the candidate controls and uses the IDE to debug and work. 7) Cleanliness of code.
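To make the distributed dataloader question above concrete, here is a minimal sketch of one possible reference answer. It is an assumption for illustration, not an agreed solution: the "server" lives in the same process as the clients, the clients are threads, and the names DataLoaderServer, client and run_clients are hypothetical. A real version of the question could move the server behind a socket or swap threads for processes or asyncio.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DataLoaderServer:
    """Hypothetical in-process 'server' that hands out each sample exactly once."""

    def __init__(self, dataset):
        self._dataset = list(dataset)
        self._next_index = 0
        self._lock = threading.Lock()  # guards _next_index across client threads

    def get_item(self):
        """Return the next (index, sample) pair, or None once the data is exhausted."""
        with self._lock:
            if self._next_index >= len(self._dataset):
                return None
            index = self._next_index
            self._next_index += 1
        # Reading the sample happens outside the lock to keep the critical section short.
        return index, self._dataset[index]


def client(server):
    """Pull items until the server runs dry; return everything this client received."""
    received = []
    while (item := server.get_item()) is not None:
        received.append(item)
    return received


def run_clients(dataset, num_clients=4):
    """Start num_clients client threads against one server and collect their results."""
    server = DataLoaderServer(dataset)
    with ThreadPoolExecutor(max_workers=num_clients) as pool:
        futures = [pool.submit(client, server) for _ in range(num_clients)]
        return [future.result() for future in futures]


if __name__ == "__main__":
    per_client = run_clients(range(100))
    print([len(items) for items in per_client])  # how the samples were split
```

Each get_item call is O(1) in time, which gives a natural opening to discuss complexity, what the lock protects, and how the design would change with multiple processes.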
-
Model parallelism and hands-on deep learning (Gil)
- This portion opens with questions on the different model parallel techniques. The conversation will be short, just enough to see whether the candidate knows the basics.
- The main portion of this interview is coding that tests the candidate's ability to implement, from scratch, a simple model that trains on given data. In the next stage the candidate takes their code and turns it into a model parallel version, for example pipeline parallelism (PP).
- Scoring will take into account: 1) Knowledge of model parallel methods. 2) Ability to implement all the basic components for training a model. 3) Ability to implement a model parallel technique. 4) Overlap with the coding scores, such as knowledge of Python OOP.
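As a rough illustration of the pipeline parallel stage of this interview, the sketch below shows only the core idea: the model is split into stages, and micro-batches flow between stages through queues so that different stages can work on different micro-batches at the same time. Everything here is an assumption for illustration: it uses plain Python threads instead of separate devices, covers the forward pass only (no backward pass or weight updates), and the names stage_worker, pipeline_forward, stage0 and stage1 are hypothetical.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the micro-batch stream


def stage_worker(fn, inbox, outbox):
    """Run one pipeline stage: apply fn to each micro-batch as it arrives."""
    while (batch := inbox.get()) is not SENTINEL:
        outbox.put(fn(batch))
    outbox.put(SENTINEL)  # propagate shutdown to the next stage


def pipeline_forward(stages, micro_batches):
    """Push micro-batches through the stages, one thread per stage."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=stage_worker, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for worker in workers:
        worker.start()

    # Feed the first stage; later stages start as soon as their inbox fills.
    for micro_batch in micro_batches:
        queues[0].put(micro_batch)
    queues[0].put(SENTINEL)

    outputs = []
    while (out := queues[-1].get()) is not SENTINEL:
        outputs.append(out)
    for worker in workers:
        worker.join()
    return outputs


def stage0(xs):
    """First half of a toy model: an affine map applied element-wise."""
    return [2.0 * x + 1.0 for x in xs]


def stage1(hs):
    """Second half of the toy model: a shifted ReLU applied element-wise."""
    return [max(0.0, h - 4.0) for h in hs]


if __name__ == "__main__":
    batches = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
    print(pipeline_forward([stage0, stage1], batches))
```

Because each stage is a single thread reading from a FIFO queue, micro-batches leave the pipeline in the order they entered. A useful follow-up in the interview is to ask what changes once a backward pass is added and how idle time between stages could be reduced.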
-
Math - Linear algebra, Probability, DL/ML Theory (Sameera)
- Linear algebra: cover topics like matrix factorisation (SVD, PCA), eigenvectors and eigenvalues, the relation to signal processing and basis functions, the condition number and its relation to optimisation, and the relationship of matrix factorisation to neural network compression and pruning.
- Probability theory: discuss estimation theory and what it means for an estimator to be optimal in the least-squares, maximum likelihood or MAP sense. The candidate should understand Bayes' theorem and how these different estimators affect estimation bias.
- ML theory, anything that you feel comfortable assessing.
- Examples of notepad questions: 1) Derive the least-squares solution for a given (y,x) sample set and a linear relationship (essentially, derive the pseudo-inverse matrix). 2) How is PCA related to the covariance matrix, and how is it used to decorrelate an N-dimensional Gaussian vector? 3) Derive an LS, ML or MAP estimator for a simple case, such as position and time samples of a ball moving at constant speed, taken with a noisy measurement tool: y=vt+n, n~N(0,1). 4) Ask the candidate to derive the probability expressions for the Monty Hall problem. This will test their working knowledge of Bayes' theorem.
- Scoring will take into account 1) Knowledge of the given competency 2) Ability to effectively express their ideas and problem solving on the whiteboard clearly and correctly.
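For interviewers, the expected shape of the answers to notepad questions 1) and 3) above can be sketched as follows. The notation is an assumption: the samples are stacked into a vector y and a matrix X with full column rank, the noise samples are independent, and for the MAP case a Gaussian prior v ~ N(mu_0, sigma_0^2) is assumed purely for illustration, since the question itself does not specify a prior.

```latex
% Question 1: least squares for y = X\theta + n
\begin{align}
J(\theta) &= \lVert y - X\theta \rVert^2 \\
\nabla_\theta J &= -2 X^\top (y - X\theta) = 0
  \;\Rightarrow\; X^\top X \,\hat\theta = X^\top y \\
\hat\theta_{LS} &= (X^\top X)^{-1} X^\top y = X^{+} y
\end{align}

% Question 3: y_i = v t_i + n_i, \; n_i \sim \mathcal{N}(0, 1) i.i.d.
\begin{align}
\log p(y \mid v) &= -\tfrac{1}{2} \sum_i (y_i - v t_i)^2 + \text{const} \\
\hat v_{ML} &= \frac{\sum_i t_i y_i}{\sum_i t_i^2} = \hat v_{LS} \\
\hat v_{MAP} &= \frac{\sum_i t_i y_i + \mu_0 / \sigma_0^2}
                     {\sum_i t_i^2 + 1 / \sigma_0^2}
\end{align}
```

With unit-variance Gaussian noise the ML and LS estimators coincide, and the MAP estimator approaches them as the prior variance grows. A candidate who can explain why is showing the link between the estimators that this interview is meant to probe.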
-
Optimisation, DL/ML Theory (AJ)
- Optimisation: SGD, what momentum is, what Adam does differently, Nesterov, RMSprop, AdaGrad. Relation to adaptive learning rates and bias correction. Second-order methods like Hessian-free or conjugate gradient.
- Deep learning theory: anything you think is relevant and feel comfortable asking that will help you assess the candidate. This can include different loss functions, variational Bayes, the transformer architecture, how positional encoding works etc.
- Examples of notepad questions: 1) Ask the candidate to write out optimisation equations: SGD, momentum, Adam etc. If they don’t remember them you can provide them; the point is for the candidate to show they can analyse and explain the components and how they affect the parameter update. 2) Ask the candidate to derive the chain rule and partial gradients for a network you define.
- Scoring will take into account 1) Knowledge of the given competency 2) Ability to effectively express their ideas and problem solving on whiteboard clearly and correctly.
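If the candidate does not remember the update rules for notepad question 1) above, the interviewer can provide them in a form like the one below. The notation is an assumption chosen for this sketch: g_t is the (mini-batch) gradient of the loss at the previous parameters, eta is the learning rate, mu is the momentum coefficient, and the square and square root in Adam are element-wise. Momentum in particular has several equivalent conventions, and this is only one of them.

```latex
\begin{align}
g_t &= \nabla_\theta L(\theta_{t-1}) \\[4pt]
\text{SGD:}\quad \theta_t &= \theta_{t-1} - \eta\, g_t \\[4pt]
\text{Momentum:}\quad u_t &= \mu\, u_{t-1} + g_t, \qquad
  \theta_t = \theta_{t-1} - \eta\, u_t \\[4pt]
\text{Adam:}\quad m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t \\
v_t &= \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2 \\
\hat m_t &= \frac{m_t}{1 - \beta_1^t}, \qquad
\hat v_t = \frac{v_t}{1 - \beta_2^t} \\
\theta_t &= \theta_{t-1} - \eta\, \frac{\hat m_t}{\sqrt{\hat v_t} + \epsilon}
\end{align}
```

Good discussion points are why the bias-corrected terms matter in the first few steps, when m and v are initialised at zero, and how dividing by the square root of the second-moment estimate gives each parameter its own effective learning rate.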