Research Advice

When I was getting started as a researcher, I had a habit of cornering the most proficient researchers in various fields (ML, quant, physics, etc.) and asking them what they do differently from other people. Here is what I found works well, ranked roughly by usefulness; it is a mix of their advice and my own.

  1. Run as many experiments as you can
    1. This is consistently the #1 piece of advice that everyone gives
    2. "If you can run 2x as many experiments, you will be more than 2x as successful." In other words, there are increasing marginal returns
    3. You want to build an intuition around what works, and why
  2. Focus on iteration speed
    1. Reduce the time it takes to run an experiment
    2. The faster your turnaround time, the more experiments you can run (advice #1) and, more importantly, the more sequential (as opposed to parallel) experiments you can run
    3. Longer iteration cycles mean experiments are more costly, so you need better intuition about what might work; paradoxically, they also make that intuition harder to build
  3. Notice patterns and generalize
    1. Example: if you notice you need to apply special casing, try to find a more general rule that explains it
  4. Work really hard
    1. "You get increasing marginal returns, so the 70th hour matters more than the 10th hour you worked this week"
  5. Don't work too hard
    1. "Overworking yourself makes you sloppy and you get decreasing marginal returns"
    2. This directly contradicts #4; I think it just depends on the person
  6. Keep an experiments journal
    1. Write down your hypothesis, procedure, final results, analysis, and next steps
    2. Write down every single idea you have, and regularly rank them
    3. Summarize what you did at the end of the week
  7. Always ask yourself "Is this the most important thing I could be working on?"
    1. Your success is a function of magnitude (how hard you work) and direction (what you decide to work on). If you pick a bad direction, you will have to work much harder; it is worth spending some time thinking about what you are doing.
  8. Start at the data and work your way backwards
    1. Where does your model go wrong, and why? How can you fix that?
  9. Keep diffs (PRs or commits) modular
    1. Every experiment should contain only one change, so you know what actually caused the improvement
    2. You want to prevent a bug in one change from blocking other changes
    3. This makes PRs easy to glance over and understand (good for reviewers!)
    4. This is better for collaboration, and means that others can build on your smaller changes instead of waiting for all of them to happen at once
  10. Bias towards simplicity; only allow complexity where it is warranted
    1. This prevents bugs and makes future iteration much easier
  11. Cache partial work
    1. This allows for faster iteration cycles (instead of running an experiment from scratch, you start off at your change)
    2. Note: you need to be really careful when caching, else you might end up throwing away lots of work
  12. Don't go in circles, know when to quit
    1. If you find yourself running the same experiment over and over, you are probably doing something wrong
  13. Build a rigorous benchmarking tool
    1. Have a uniform report that tells you "should I accept this change?"
  14. Watch other people's experiments
    1. This is a way of artificially increasing #1 (run as many experiments as possible) — watch successful and failed experiments, and use them to build intuition around what might work
    2. How can you improve their experiments? (Especially the failed ones!)
    3. This only works in certain environments, where people share experiment write-ups; the person who told me this comes from a team that religiously documents experiments
  15. Be skeptical of your own results
    1. Be brutal in evaluating your own results, and think about why they might be wrong or misleading
  16. Be obsessed with increasing your own productivity
    1. If you are doing the same thing over and over, can you automate it?
    2. Are there any tools that would make you faster?
  17. Don't ignore code warnings
    1. Warning messages are often a sign that you have a bug
    2. If you know the warning is wrong and there is not a bug, suppress it so you don't start ignoring warnings
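
To make #17 concrete, here is a minimal sketch assuming the code is Python and the warnings come through the standard `warnings` module (the advice itself is language-agnostic). `noisy_mean`, `run_strict`, and `run_with_known_benign` are hypothetical names made up for illustration. The idea is to promote every warning to an error so none can be ignored, then silence only the one warning that has been checked and found harmless.

```python
import warnings


def noisy_mean(values):
    """Hypothetical helper that warns on suspicious input."""
    if not values:
        warnings.warn("mean of empty list, returning 0.0", RuntimeWarning)
        return 0.0
    return sum(values) / len(values)


def run_strict(fn, *args):
    """Run fn with every warning promoted to an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        return fn(*args)


def run_with_known_benign(fn, *args):
    """Same as run_strict, but suppress the one warning verified as harmless."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        # Added after the "error" filter, so it is checked first and wins
        # for this specific message and category only.
        warnings.filterwarnings(
            "ignore",
            message="mean of empty list",
            category=RuntimeWarning,
        )
        return fn(*args)
```

The suppression is deliberately narrow (one message prefix, one category), so any new warning still fails loudly instead of getting lost in noise.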
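
For #11 (cache partial work), a rough sketch of one way to cache an expensive stage on disk. Everything here is an assumption for illustration: the `cached` and `cache_key` helpers, the pickle file format, and the `version` string you bump by hand whenever the stage's code changes. It reflects one reading of the caution in that item: a cache that does not notice changed inputs or code will silently feed stale results into later experiments.

```python
import hashlib
import json
import pickle
from pathlib import Path


def cache_key(stage, params, version):
    """Hash everything that should invalidate the cached result."""
    payload = json.dumps(
        {"stage": stage, "params": params, "version": version},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def cached(cache_dir, stage, params, version, compute):
    """Return the stored result for this stage, computing it on a miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{stage}-{cache_key(stage, params, version)}.pkl"

    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)

    result = compute(**params)

    # Write to a temporary file first, then rename, so an interrupted run
    # does not leave a half-written cache entry behind.
    tmp = path.with_suffix(".tmp")
    with tmp.open("wb") as f:
        pickle.dump(result, f)
    tmp.replace(path)
    return result
```

A call might look like `cached("cache", "features", {"n": 1000}, "v1", build_features)`, where `build_features` is your own function. Changing either the parameters or the version string produces a new key, so the old entry is simply not reused.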
