Research Advice
When I was getting started as a researcher, I had a habit of cornering the most proficient researchers in various fields (ML, quant, physics, etc.) and asking them what they do differently from other people. This is what I found works well, ranked roughly by usefulness; it is a mix of their advice and my own.
- Run as many experiments as you can
- This is consistently the #1 piece of advice that everyone gives
- "If you can run 2x as many experiments, you will be more than 2x as successful." In other words, there are increasing marginal returns
- You want to build an intuition around what works, and why
- Focus on iteration speed
- Reduce the time it takes to run an experiment
- The faster your turnaround time, the more experiments you can run (advice #1) and, more importantly, the more sequential (as opposed to parallel) experiments you can run, each informed by the result of the previous one
- Longer iteration cycles make each experiment more costly, so you need better intuition about what might work; paradoxically, they also make that intuition harder to build
- Notice patterns and generalize
- Example: if you notice you need to apply special casing, try to find a more general rule that explains it
- Work really hard
- "You get increasing marginal returns, so the 70th hour matters more than the 10th hour you worked this week"
- Don't work too hard
- "Overworking yourself makes you sloppy and you get decreasing marginal returns"
- This directly contradicts #4; I think it just depends on the person
- Keep an experiments journal
- Write down your hypothesis, procedure, final results, analysis, and next steps
- Write down every single idea you have, and regularly rank them
- Summarize what you did at the end of the week
- Always ask yourself "Is this the most important thing I could be working on?"
- Your success is a function of magnitude (how hard you work) and direction (what you decide to work on). If you pick a bad direction, you will have to work much harder; it is worth spending some time thinking about what you are doing.
- Start at the data and work your way backwards
- Where does your model go wrong, and why? How can you fix that?
- Keep diffs (PRs or commits) modular
- Each experiment should contain only one change, so you know what actually caused an improvement
- You want to prevent a bug in one change from blocking other changes
- This makes PRs easy to glance over and understand (good for reviewers!)
- This is better for collaboration, and means that others can build on your smaller changes instead of waiting for all of them to happen at once
- Bias towards simplicity; only allow complexity where it is warranted
- This prevents bugs and makes future iteration much easier
- Cache partial work
- This allows for faster iteration cycles (instead of running an experiment from scratch, you start off at your change)
- Note: you need to be really careful when caching; otherwise you might end up throwing away a lot of work
- Don't go in circles; know when to quit
- If you find yourself running the same experiment over and over, you are probably doing something wrong
- Build a rigorous benchmarking tool
- Have a uniform report that tells you "should I accept this change?"
- Watch other people's experiments
- This is a way of artificially increasing #1 (run as many experiments as possible): watch both successful and failed experiments, and use them to build intuition around what might work
- How can you improve their experiments? (Especially the failed ones!)
- This only works in certain environments, where people share experiment write-ups; the person who told me this comes from a team that religiously documents experiments
- Be skeptical of your own results
- Be brutal in evaluating your own results, and think about why they might be wrong or misleading
- Be obsessed with increasing your own productivity
- If you are doing the same thing over and over, can you automate it?
- Are there any tools that would make you faster?
- Don't ignore code warnings
- Warning messages are often a sign that you have a bug
- If you know the warning is wrong and there is not a bug, suppress it so you don't start ignoring warnings
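
The last point, about warnings, can be made mechanical. Here is a minimal sketch in Python using the standard `warnings` module: every warning is turned into an error, and only the specific warnings you have already judged harmless are suppressed. The names `noisy_step` and `run_strict` are hypothetical, made up for illustration, and the deprecation message is invented.

```python
import warnings


def noisy_step(x):
    """Hypothetical experiment step that emits a warning we have judged benign."""
    warnings.warn("legacy_option is deprecated", DeprecationWarning)
    return x * 2


def run_strict(fn, *args, benign=()):
    """Run fn with every warning raised as an error, except the benign ones.

    benign is a sequence of (message_regex, category) pairs to suppress.
    The regex is matched against the start of the warning message.
    """
    with warnings.catch_warnings():
        # Any warning that is not explicitly allowed below now fails loudly.
        warnings.simplefilter("error")
        for message, category in benign:
            # Narrow filter: only this message prefix and this category.
            # Filters added later take precedence over the "error" rule.
            warnings.filterwarnings("ignore", message=message, category=category)
        return fn(*args)


# The known-benign warning is suppressed; anything else would raise.
result = run_strict(noisy_step, 21, benign=[("legacy_option", DeprecationWarning)])
```

Keeping the suppression narrow (one message prefix, one category) is the point: a new, unexpected warning still stops the run instead of scrolling past. For a whole script, running it as `python -W error` has a similar effect without any code changes.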