Successful PhD students, and academics at large, have several qualities. They’re naturally curious, independently motivated, detail oriented, obsessed with getting it right and excited to tell the world what they’ve found. Although we’re all acutely aware of exceptions - of disorganized introverts, of brilliant thinkers but lazy doers, etc. - those people are rarely successful.

If you don’t naturally come by them, you must develop habits that facilitate measurable progress every day despite the roller coaster of successes and failures in the discovery, implementation, and marketing aspects of your work. Setbacks are inevitable, and recovering from those quickly and gracefully is at least as important as whatever else you might otherwise do when things are going well. That’s difficult. You need a plan.

The keys to developing good habits, and in particular to being successful as a member of my lab, are outlined below. Many of them are lessons I learned from others along the way, but inevitably some are reactions to pitfalls colleagues and students have fallen into and struggled to climb out of. You’ll find statements similar to this one all over academia. Here is another good one, for example.

General

Here are some lofty thoughts. More specific, everyday things are coming next.

  • Emulate. Identify people you respect, their productive ways of working, and copy them. This isn’t about searching for confirmation bias: identifying people you think are like you and that have what you want and assuming that means you’ll get it. Instead, look for mentors (your advisor, senior PhD students, postdocs, young PhD-level researchers, senior researchers), identify skills that they have and you don’t, and acquire them. Don’t observe them using the Unix command line or editing in vi and dismiss them/that tool as old-fashioned. (See more about computing below.) Instead, endeavor to acquire this new skill. Make a plan to become proficient, like them, and act on that plan.

  • Be poised. Don’t be jealous and don’t take others’ opinions of your ideas too seriously. At the same time, be aware that your work will be criticized. Respond respectfully, and take every opportunity to learn and improve. If your advisor or a referee says to do something, do it and do it well or go do something else.

  • Find your comparative advantage and exploit it. Your combination of skills is different than others’, but you might not know what they are yet. Find out what you’re good at, what you love and find rewarding, and do everything possible to be even better at those things. Hone your skills, but remain open to acquiring new ones. You don’t have to be the very best at each thing, or even best at anything in particular no matter how narrow. However, once you figure out your special mix, you’ll know that very few people are as good as you are at those things and you will find others who respect your ability to exploit their synergy.

  • Learning is just the beginning. Getting a PhD is about transitioning from the absorption to the creation and dissemination of knowledge. There will be aspects of that transition that are uncomfortable - things you will be asked to do that you aren’t good at yet, and that cannot be avoided. Examples include coding, writing, giving public presentations, and being aware of the wider literature. Make a plan to conquer those things, and act on that plan. That doesn’t mean be aware that you’ll need to do them at some point. You just read that sentence, so you’re already aware. It means begin taking steps now to get on top of the situation. More specific pointers are provided below.

  • Get in or get out of the way. Don’t give up on something you love doing, but also don’t be committed to a fault. If you can’t stand the process of writing, say, and although you’ve made great strides to improve it makes you feel miserable, maybe this isn’t for you. There is a huge opportunity cost to doing a PhD. Your education might seem “free” because someone else is paying the bill, but you are forgoing a decent salary doing Masters-level data science in industry. That’s real money not in your bank account because you’re toiling around here. Also, don’t succumb to the sunk cost fallacy. Just because you’ve come this far doesn’t mean you have to keep going. Don’t throw good money after bad. (Ok, enough econ metaphor.) Getting a PhD earns you the opportunity to conduct self-directed research in a professional setting. That sounds great. But if you don’t like many of the things you’re doing toward that degree now, they aren’t going away and you aren’t going to like them any more later, nor any of the other things you’re largely protected from as a student (committee meetings, panels, referee reports, grant applications). PhD level study is a multi-faceted investment and your goal should be a balanced portfolio. (Finance metaphor now.) Either be all in, absorb and endeavor to improve in every way, or go do something else.

  • Get carried away. It’s important to protect holidays and weekends, time with the family and so on. Appreciate the finer things in life. But if you don’t go through bouts of obsession, where you feel compelled to figure something out, something that keeps you up at night or brings you to the office on the weekend, then this isn’t your passion. I’m not talking about pulling an all-nighter because there’s a deadline. Those things happen in every profession, but they’re to be avoided if at all possible. Many of the specific suggestions below address that goal head on. Be disciplined with your time. It’s much better, and healthier, to have a slow burn of progress than one of intermittent lull and panic. I’m talking about something you can’t avoid doing because it’s eating at you. Because you know that if you don’t do it, then nobody else will and that’ll be a shame. Because this is not just your profession, it is also your hobby. If science (i.e., statistics) doesn’t do that for you, then you’re in the wrong field.

  • Take pride. Don’t turn something in to your advisor, or submit something to a committee or journal that’s sloppy. You don’t have to be perfect all the time, but also don’t call time on something if you know you can do better. Fix it. Pull an all-nighter if you need to and next time start earlier so that doesn’t happen again.

Be present and organized

I shouldn’t have to say this, but I do.

  • Come to the office every work day during normal business hours. There is an important group/cohort dynamic to research. Benefit from proximity to your colleagues, and allow them to benefit from you. You don’t have to treat it like a 9-5 job, although there’s nothing wrong with that. Your colleagues and advisor should expect to run into you in the halls, or in your office, if they have a question at some point on a typical day. (I.e., every day.) They shouldn’t need an appointment. Of course there are exceptions if you’re on holiday, sick, or traveling for work.

  • Keep a calendar, make lists, be punctual. You can’t remember everything all the time. Use modern tools to keep track of where you need to be and when, to whom you owe what and when, and be on time to events and don’t miss deadlines. Be early - even better.

  • Respond to email. Make an attempt to respond to every personal email message (that means everything except mass/automated messages) within one business day. That doesn’t mean you have to address everything immediately. But acknowledge the message; say thanks and say you’re on it! Come up with a system - a to do list - for addressing unfinished action items from emails. I like to use my Gmail inbox for that: not archiving messages until all action items are satisfied, and I don’t let the inbox grow longer than the (laptop) screen. That may not work for you. Come up with something else/better for you if you must. But don’t make excuses. Treat this as a priority. Ignoring messages is rude. Stonewalling isn’t mature. Letting action items slip through the cracks is irresponsible. The inbox system (as I implement in Gmail) is hundreds of years old and it works. It’s where the word inbox for email comes from.

  • Under promise, over deliver. It’s much better than the opposite. If you can’t do it, say no. But know that if you say no too often, eventually people will stop asking and opportunities will become scarce. If you can do it, do a good job and be on time. Don’t procrastinate. Don’t get into a downward spiral of apologies and expectations management. Put your energy into projects, not excuses.

  • Respect others, especially their time. Don’t email your advisor with results at 11pm or on the weekend because you want them to know you’re working late. Wait until business hours, and look over your work in the morning first with fresh eyes. There are of course exceptions for deadlines and when you’re really excited. But be careful to use this sparingly. You want to respect your advisor’s time because you want him/her to respect yours. Similarly, don’t ask someone else to do something you could’ve done already before you finished asking. By the time you emailed to ask “what is AOAS?” or “what is stonewalling?” you could have Googled and gotten your answer. (Also if you wonder about something, find out!)

  • Keep your workspace clean. Don’t stack a mess of papers and food wrappers around your desk and think that your office mates won’t mind because the mess is contained to a few square feet around your area. They mind!

  • Attend seminar and other department events. You’re not too cool for school. You will learn from seminar even when it’s not on a topic of interest to you. Pay attention to style if nothing else. What works, what doesn’t? You will benefit from meeting the speaker at the reception afterwards, if only to challenge yourself to hold a 5-minute conversation about what you do. Who knows, maybe they’ll remember meeting you if you apply for a job in their group some day. You will benefit yourself and the department by being present, and representing us well. If nothing else, that will increase the value of your degree through the value of our reputation. There’s no excuse for missing a seminar or other department event if you’re healthy and in town. Having homework due, or some other deadline, is not a valid excuse. Manage your time better.

  • Weekly and group meetings. Weekly individual meetings and group meetings are mandatory. During group meetings we will read and discuss papers. You might be asked to lead one of those discussions. Each semester be prepared to give a 30 minute update, with slides, to the group on the progress of your research. Protect these times.

Reading and writing

This is not as exciting as discovery, but you need to know what comes before and you need to market what you find.

  • One paper a week. Plan to read one paper a week. Sometimes papers will be assigned, other times you’ll need to follow your nose. Track down popular references. Understand pedigree in your area. Internalize the structure of papers. What makes an AOAS paper different from JASA T&M, different from JASA A&CS, AOS, etc.? You will write one or more of these eventually. Don’t explain to your advisor that you don’t know how to outline a statistics manuscript. Approach papers critically. Learn from them, and imagine how you’d improve upon them. Keep a list of good ideas.

  • Write every day. Write anything - get practice! Keep an annotated bibliography of what you’re reading. Typeset your mathematical derivations. Explain your code and outputs in reproducible formats (like Rmarkdown) and bring those write-ups to meetings with your colleagues/advisor. Write to learn. If you can’t explain it in writing, then you don’t know it well enough. When drafting manuscripts, emulate the writing style of your favorite papers. Make them look professional by learning to be proficient in the tools that make that possible (LaTeX, BibTeX, markdown, etc.). Set reasonable goals and don’t compromise. Write one good paragraph a day, complete with citations and mathematical rigor and proper punctuation, and you’ll be fine. You’ll be done with your manuscript in under a month, or your dissertation in less than three months. Use spell-check.

  • Use version control. And I don’t mean “track changes” in Word or Dropbox versions. See the code/computing section below. Yes: for your writing too, not just code!

  • Grammar is secondary. Many of you are not native English speakers. That’s fine. I admire you. There’s no expectation that you write beautifully. But that’s no excuse to avoid practice. Exactly the opposite in fact. Start by writing in your native tongue if necessary. However, if you are monolingual-English like me then you really have no excuse. Figure out how to get your scientific ideas across on paper. Start by mimicking others.

  • Remember what you’ve done. Don’t freeze up if your advisor says it’s time to start writing. You’ve been practicing, so it’s no biggie. You must have accomplished something along your research trajectory to get this far. Start by making a list of what that accomplishment entails. What are the ingredients that must be reviewed before the reader can appreciate your contribution? What makes it challenging? What makes your solution innovative? How can you showcase what you’ve done in the most favorable light? How can you convey something you’re excited about in a dispassionate, matter-of-fact manner that doesn’t put off your reader or annoy the other people whose hard work you are improving upon? How can you celebrate their work at the same time? Help your readers appreciate the details that bring it all together - that make what you’re doing unique and special. It’s much easier to sculpt from a big block of clay, cutting/reducing things later that get in the way of a smooth narrative, than it is to dab lots of little bits on after-the-fact.

  • Think of your audience. Who is going to read this paper? Referees and editors, yes. Beyond that, mostly graduate students. That is, mostly people a year or two younger than you. Don’t forget what it was like to be that person. Remember that they know less than you do now, because you knew less two years ago. Explain to them in terms that they’ll understand. Guide them toward the relevant literature while summarizing the main elements required in order to appreciate your contribution. Don’t be macho and gloss over details.

  • Use citations as punctuation. Don’t claim something without evidence. If it’s part of the canon, cite a textbook chapter. When citing papers, be inclusive. Don’t leave anyone out. Software packages should be cited just like papers. The authors of software are not always the same as the authors of the methodology they implement. Start with the citation() function in R. Don’t cite something you don’t know about, but at the same time don’t feel obliged to read every paper you cite. Find a happy medium. Remember, you should be reading one paper/week anyways.

  • Nothing is obvious to everyone. Don’t assume your reader knows something unless you knew it before you came to graduate school. Explain why. Give subtle reminders. Be transparent in your thinking. Take the opportunity to give the reader insight into your perspective. The impression that you think it’s obvious can easily be confused with the impression of ignorance behind the veil of machismo.

  • Keep it simple. Don’t introduce notation, or give formulas, or remark on concepts that aren’t crucial to your exposition. Don’t speculate, or suggest that your method could be applied in greater generality without direct evidence. Save that for future ideas in your discussion section at the end, or not at all. (That doesn’t mean skip writing those things entirely. Put your thoughts down, even though you may cut them later.)

  • Editing and writing are similar but different. Writing first drafts is hard, but you’ve got to start somewhere. Editing, or re-writing, is a huge part of the process and can be time (and emotionally) consuming. It’s easier, in a sense, because it starts with criticism which to most people comes more naturally than creating from scratch. But you have to be comfortably self-critical and willing to act to make improvements. Get to the editing stage as soon as you can, but don’t forget to fix the problems you find with more writing. Even better: let others criticize your work. It’ll help you become a better self-critic, which will make you a better writer. Don’t be afraid to start from scratch. Sometimes starting over is easier, but it takes bravery to brush aside all that earlier hard work.

Code and computing

Someday, perhaps, this won’t be its own section. Computers aren’t a tool. They are your instrument.

  • Keep your workspace clean. Keep your files organized in folders. Don’t litter your virtual desktop and rely on search to find what you’re looking for. Use shortcuts, aliases and links to help with navigation. Make your virtual workspace as tidy as your physical one. Keep your computer up to date with the latest software and security patches. This applies equally to university workstations and personal computers/laptops. Fix hardware problems. (Don’t try to cope with a broken keyboard/screen.)

  • Learn how to use your computer. I don’t mind what kind of computer you have at home/in your backpack. (Windows, Linux, Mac, Raspberry Pi.) But I do expect you to be proficient with it. We have Linux machines in the lab, and I expect you to learn how to use those. And I expect you to learn how to get your own machine to do everything those Linux workstations can do. That might mean learning some DOS, installing WSL, running an X11 client, installing compilers and other libraries, etc. You are responsible for your personal computing environment(s). Not knowing how to do something on your machine is not an excuse. It’s the 21st century. Your computer can do it. Figure out how.

  • Become fluent with a full-featured text editor. Not Notepad, not the RStudio editor, not TextMate. Those are not full-featured editors. They’re not bad for their niche (RStudio’s Rmarkdown, profiling, and debugging features are excellent), but you need more. Examples include vim, gvim, Emacs, Sublime Text, and so on. You should also be proficient with at least one terminal editor, such as vi or emacs -nw, for when you log in remotely to edit files.

  • Remote access. Learn how to access the lab Linux machines from home, run jobs remotely and in the background without hangup on logout, transfer files, export graphics, etc. There’s no reason to walk down the hall with your laptop open so that your simulation can continue to run. Learn to use the state-of-the-art and always-on computing facilities available to you.

  • Don’t use your computer like an office secretary. Use short meaningful filenames (including for directories) without spaces, punctuation, or special characters except underscores. Use extensions appropriate to file types. Organize your file space hierarchically. Use version control, see below. Don’t paste your plots and code output into a Word document.

  • Use version control and commit every day. All of your research work (including coursework) should be in version control with Git through GitHub or Bitbucket. All lab repos are on the latter. Make a private repository to use for your homework, and other personal work-related things. Use one of the lab repositories for your other research, or discuss the possibility of making a new one. Commit your work every day, possibly more than once a day. Repos are not just for finished work, and are most useful as a tool that syncs your workspace across machines (home, work, laptop(s)). This is not just about code, but also about notes and writing. Don’t commit binary (e.g., .RData) files to the repo. They don’t benefit from Git’s line-based diffs, are not human readable, and risk overflowing quotas. Image files which are used in documents are an important exception.

  • Writing is code. It should also be in version control.

  • Dropbox and Google Drive are not a substitute for backup and version control. Use these sparingly, except in the case of Duplicati backups, or similar.

  • Back up in multitude. I don’t want to hear that your dog ate your thesis or your cat jumped on your keyboard and you lost everything. In addition to Git/Bitbucket cloud versioning, consider using programs like Duplicati to automatically back up your entire machine to the cloud. You can use Google Drive (with Duplicati) for this if you wish. (This is a great way to use the nearly infinite cloud space VT provides.) Use tools like Time Machine (OSX) or similar to back up to a local hard drive as well. Emailing files to yourself is not a plan.

  • Stand on the shoulders of giants. Use the code and tex produced by members of the lab in the past. Find these in the repos, but don’t plagiarize. Give back by making that code better if you can.

  • Comment and adopt style. Comments are huge. It’s not just about helping someone else understand your code. Be selfish. It’s about you understanding the code you wrote in year two when you’re writing up your dissertation in year five. Don’t let a chunk of code go by without an explanation. Adopt a coding style that’s consistent with best industrial practice. (Google that for your language, e.g., R or C.) Choose short, descriptive variable names. No silly words. Good style makes code easier to read, eases debugging and prevents other mistakes.

  • Don’t clean it up later. Clean it up now. There’s not some magical time in the future when you’re going to be commenting and making your code look nice and run more efficiently. You’ll have moved on to something else by then. Do it now while it’s on your mind. You might need to show this code to someone on short notice. You’re regularly committing your code to the repo, so your advisor or office-mate might be looking at it right now. (That’s not an excuse for not committing it in the first place.)

  • Debugging is 90% of coding. This is like writing and editing above. It’s hard to write the first draft, and it is most certainly wrong to start out. Even the fifteenth iteration is probably wrong. There’s always a bug or a better way. The analogy to writing ends there. The trouble is, it’s hard to criticize and fix code without a proper toolkit. Use debuggers (like gdb, valgrind, and RStudio’s tools) and profilers to help. Good style helps with debugging. View a deceptive bug as an opportunity to make your whole code better. A lazy approach to debugging, by staring at code until you see the problem, is a great way to waste time and end up exhausted. Tinkering is almost as bad. Take a principled, systematic approach to debugging. Form a diagnosis by benchmarking against known truths. If you’re not sure what behavior to expect, play devil’s advocate: try to argue why what you’re getting is correct until it becomes absurd. When you come to a gap in your knowledge of what’s right/wrong, do some research to fill that gap. (Don’t just bring it to your advisor and say “it seems wrong”, or even worse: “it’s right but it’s not working”.) Check the values of variables, in an automated fashion if possible. Build the scaffolding required to visualize the state of your code/method with plots.

  • Comments are not if statements. Comments are convenient ways to turn off lines of code, but they’re not a long-term substitute. Don’t have a code file with half of it live, and the other half commented out. Make two separate files, or use actual if statements and/or other more appropriate devices. Be deliberate in your implementation.

  • Reproducibility. Code in a repo should be runnable by new users with minimal setup. Anything special that’s needed to get going, like a symbolic link to a data directory or some other code that should be run first, must be clearly noted at the top. Never reference absolute file paths in your code because that’ll make it hard for others to use out-of-the-box. Code supporting results and figures in papers should be readily accessible by readers through a repo or other outlet on the web, such as a rendered Rmarkdown document.
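
As a minimal sketch of several points above (benchmarking against known truths, using a real switch instead of commented-out code, and avoiding absolute file paths), here is a toy example in Python. The same ideas carry over to R or C. The function names, the data directory, and the file name are hypothetical and are not taken from any lab repo.

```python
import math
import statistics
from pathlib import Path


def data_path(filename, root="."):
    """Build a path relative to the repo root (hypothetical data/ layout).

    Nothing here hard-codes a location such as a home directory, so the
    code runs unchanged on another machine.
    """
    return Path(root) / "data" / filename


def running_variance(xs, verbose=False):
    """Sample variance computed in one pass with Welford's update."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        # A real switch, not a print statement that gets commented in and out.
        if verbose:
            print(f"n={n} mean={mean:.4f} m2={m2:.4f}")
    if n < 2:
        raise ValueError("need at least two observations")
    return m2 / (n - 1)


def check_against_known_truth():
    """Benchmark the implementation on cases whose answers are known."""
    # Constant data must have zero variance.
    assert running_variance([3.0, 3.0, 3.0]) == 0.0
    # By hand: 1, 2, 3, 4 has mean 2.5 and sample variance 5/3.
    assert math.isclose(running_variance([1, 2, 3, 4]), 5 / 3)
    # Agreement with an independent, trusted implementation.
    xs = [0.5, -1.2, 3.3, 2.1, 0.0, 7.4]
    assert math.isclose(running_variance(xs), statistics.variance(xs))


if __name__ == "__main__":
    check_against_known_truth()
    print(data_path("example.csv"))
```

Running the checks after every change is one way to automate "check the values of variables". When a check fails, passing verbose=True shows the intermediate state without editing the function body.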

Sprint to the finish

Keep your eye on the end goal from the very start, but break things into manageable chunks and take it one day at a time.

  • What I will do for you. My deal with students primarily involves writing. (I’ll meet with you regularly, and help you navigate research. We’ll have group meetings too. I’ll look at math with you, and at code, and try to help as much as possible. But ultimately you are responsible for day-to-day progress. See individual and group meetings above.) You need a proposal document and a dissertation document, with the former being a precursor to the latter, in order to graduate. I will not write those for you. Those are your responsibility too, although I will look over drafts once and provide comments, which you may choose to take or leave. What I will do is help you write papers, and you may use that text in your dissertation. You will write the first draft, but after that I will work with you to see the document through the publication process. I will ensure that it meets a standard where I am happy to have my name on it. Sometimes, particularly in early stages, my help will come in the form of feedback or requested changes. Towards the end I shall make edits myself. Attempting to short circuit this process, going straight to the dissertation without my help writing papers first, is a bad idea.

  • Three papers. Students’ dissertations in my group usually comprise two to three papers. Typically one is accepted for publication by the proposal stage. (Getting a paper accepted is sometimes years’ worth of work and waiting.) Often another is near submission, with a third comprising work proposed for the final stages. If you have three papers’ worth of material, you are ready to defend your thesis. If you don’t then you’re on shaky ground. That doesn’t mean you won’t get the degree, but it does mean there are big challenges ahead.

  • Pipelining. I try to get younger students involved in research, helping with a more senior student’s project. Maybe working on graphics, or performing sub-analysis. This helps noobs get on a paper early, see how the process goes, and make a friend in the group. This can count toward one of the three papers above. Older students can expect to serve as mentor for a new member of the lab.

  • Read theses and borrow templates. This is the thesis version of several “emulate” suggestions above. Download award winning theses in your area and read them. What makes them good? Can you emulate that? Don’t be surprised at the substantial review component in dissertations. It’s way bigger than an ordinary paper. Get templates from your senior colleagues and copy their style.

  • Getting a job. It’s not just about writing a dissertation. A PhD is about building up a CV, through publications and other experiences, that can land you a job or a postdoc afterwards. Your committee will be concerned about “getting you out” without a place to go. You’re much better off with another year as a student, looking for a job and doing research, than you would be applying for jobs while otherwise idle. Potential employers will read the tea leaves, if you’re jobless and no longer a student, and conclude that you’re tainted. We will work together to get your research to a place where you can attend academic meetings, make presentations and make contacts, and ultimately make strong personal impressions and competitive formal applications to research positions in academia, industry, and beyond. Applying for jobs is like a second job and it happens during the last nine months of your degree. There’s lots to do at the end. It is a sprint at the end of a marathon.