PRE Workshop 2023
How to be a Successful Predoc
Rashmi: Hi everyone. Today, we're going to be talking about how to be successful as a pre-doc RA, and so we've invited a handful of panelists to talk to you a little bit about that. I just wanted to get started. Can everybody go around and introduce themselves? Tell me what pre-doc program you're affiliated with. I can start. I'm Rashmi, an RA at the Chicago Federal Reserve.
Brianna: I can go next. I'm Brianna. I was an RA at the Chicago Fed for about a year and a half, and then I just stepped into an analyst position this February.
Kira: I'm Kira. I am a pre-doc at MIT Sloan.
Jacob: I'm Jacob, and I'm also an RA at the Chicago Fed.
Pascal: I can go. I'm Pascal. I was an RA at Brookings back in the day, and I'm now a professor at Chicago Booth. I have had several rounds of fantastic RAs go through our program.
Rashmi: Great. So I know one thing that people think about a lot when they're going into a pre-doc program, in terms of being successful, is having all the skills that they need. To start, what are some technologies that you've seen used in your pre-doc program? And is this something that's changing? That can include programming languages, software, anything.
Jacob: Is that just for anyone?
Rashmi: Yeah.
Jacob: I guess you probably all know that Stata and R are very common. I assume you'll see some variant of those wherever you go. At the Fed, we use a good bit of MATLAB and Python on some projects, but very few people are using all four at once. And here at least, you don't really know in advance what you're going to be assigned. So it was really more about willingness to learn than actually knowing every single language possible, because there's always another language that you don't know that someone's using. Someone's going to have you use SAS or something, and you can't cover all your bases, and no one expects you to beforehand. You just have to be willing to learn and put in a bit of time if it is something you're unfamiliar with.
Jacob: I'd also say LaTeX is pretty common. Sometimes you're doing a bunch of analysis and you get to write some stuff up, or your economist has an R&R somewhere and needs help formatting the paper, and you've got to help them with the LaTeX. I think a good way to practice is to make your resume or CV in LaTeX. That's what I did. It was a good way to get familiar with it. It's a little challenging at first, but it gets much easier. I think that's something that's also relevant.
Pascal: I can add a couple of things. One, on the programming languages, I totally agree with the ones that Jacob mentioned. One that I've seen commonly used is Python, and we're using that more and more in my lab, and others that we're working with are also using it more. Another thing that I'll emphasize, which is similar to what Jacob said, is that what matters most, I think, is familiarity with a programming language and being able to think about how to use programming to answer a question. Familiarity with the specific programming language is oftentimes less crucial because, as Jacob said, you'll learn a lot on the job. If you need a new programming language, once you've learned one, you can pick up another one pretty quickly. And so, at least in our group, we don't expect people to necessarily know the particular language that we are going to use. They'll learn it along the way.
Pascal: Part of the whole point of doing a pre-doc is to gain skills and to learn as you go, and learning new languages and programming skills is exactly part of that. The other thing I'll mention, which is not programming but sits in the same bucket as the LaTeX tool that Jacob mentioned, is GitHub. We use Git a lot, and it's especially important as teams get larger and version control becomes very crucial. So some familiarity with Git is useful, or at least a willingness to learn how to use Git. It can also be pretty useful as project management software, where you can create issues and communicate progress on an issue that answers a specific question in a large team.
Brianna: I think it's been touched upon, but what you're going to use just depends on what department you're in. I've never used MATLAB, I've never used Python. I'm in the community development department, so we're pretty different, but we use mapping software like ArcGIS, and Tableau. I think places are moving more toward data visualization. You don't have to know that going in, but that's something that I've learned on the job.
Kira: I can just add, I agree with everything everyone said, but anecdotally, I was a psychology major in undergrad, and I knew a little bit of R and Python. When I came into this pre-doc, I worked with one professor who's more psychology based, so we use more R, and the other one's an economist, so she uses a lot of Stata. I didn't know anything about Stata, but again, having that little bit of background in coding, at least having some basis, is really helpful, and you'll be able to pick it up really quickly. I know I did. If anybody's worried about that, you'll be able to pick it up very quickly.
Rashmi: Yeah. So in kind of a similar vein, what economic and econometric concepts do you see most commonly used in your pre-doc program?
Brianna: Oh man. Okay. I feel like I'm the odd one out because I'm in the community development department, so we do more qualitative work. It's a lot of talking to people and using economic theory to understand what's going on on the ground. So I guess I'm not really using quantitative methods so much right now, but it is a good basis for understanding how the economy is working and affecting real people, I think.
Pascal: I'll say that in the projects I've worked on, and in a lot of the groups that are hiring pre-docs out of Chicago Booth, especially in applied microeconomics, I would bucket it into two areas. One is causal identification strategies. Some of them are very simple, like difference-in-differences or instrumental variables. The kinds of things that often get covered in an undergraduate econometrics class are very useful, and we use them all the time. Once you've done those, sometimes we start doing more advanced things with other kinds of strategies, something like a regression discontinuity strategy, a matched control strategy, or a synthetic design. So bucket one is descriptive data work and then also some sort of causal identification strategy.
Pascal: And then another might be that, for some projects, you'll have a structural model where you use the computer to solve some household optimization problem or some macroeconomic optimization problem. There you're not necessarily trying to look at raw data and measure a causal effect, but you're building a mathematical model and using the computer to solve that model. Oftentimes that is also helpful and is a different kind of skill, but again, it's something that we will often just teach people along the way, and it's more having an open mind to doing that kind of thing that we're looking for.
Jacob: If you're thinking about what to prioritize as an undergrad, structural stuff is going to be hard to find in any undergraduate course. I would say if you have a good math background, I'm sure you'll be able to pick that stuff up. The stuff that you're most likely to be able to get in an undergrad course, and that's going to be more useful to you immediately, is panel and time series stuff and those causal identification methods. And I think it's really helpful to keep it simple and actually really know what's going on, especially with all the new diff-in-diff literature. I'd say don't just be like, I saw diff-in-diff once, now I want to learn a structural model.
Jacob: Read some stuff, work with data. It takes some practice to really get that stuff under your belt and have a more fundamental understanding of it. But it can be really helpful to your economists when you know the caveats that come with the regressions you're running, instead of them having to tell you everything. Obviously you still learn that stuff on the job, but you can get some of it, like IV and diff-in-diff, from econometrics classes in undergrad.
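As a rough illustration of the difference-in-differences idea that Pascal and Jacob mention, here is a minimal two-group, two-period sketch. The numbers and the function name `did_estimate` are invented for illustration and do not come from any panelist's project. One caveat of the kind Jacob alludes to: the estimate only has a causal reading if, absent treatment, both groups would have changed by the same amount (the parallel trends assumption).

```python
from statistics import mean

# Hypothetical outcomes (say, average earnings), invented for illustration.
outcomes = {
    ("treated", "before"): [10.0, 12.0, 11.0],
    ("treated", "after"): [15.0, 17.0, 16.0],
    ("control", "before"): [9.0, 10.0, 11.0],
    ("control", "after"): [11.0, 12.0, 13.0],
}


def did_estimate(data):
    """Change in the treated group's mean minus change in the control group's mean."""
    change_treated = mean(data[("treated", "after")]) - mean(data[("treated", "before")])
    change_control = mean(data[("control", "after")]) - mean(data[("control", "before")])
    return change_treated - change_control


estimate = did_estimate(outcomes)
print(estimate)  # 3.0: treated rose by 5, control rose by 2
```

In practice this is usually estimated as a regression with group, period, and interaction terms so that standard errors and controls can be added; the arithmetic above is only the core comparison.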
Kira: I'll say again, as someone who came from a non-economics background, with the economist professor that I work with, I ended up doing a lot of basic regression models and some Oaxaca-Blinder decompositions. Those were things that were sort of familiar to me coming from quantitative psych, so they were pretty easy to pick up. I work with somebody who's more macro and labor, so maybe it's not as specific as some of the other econometric stuff that other panelists have talked about. When you join a program, it's good to see what kind of research a PI is doing and the kind of econometrics they may be using, and maybe get yourself familiarized. Or again, it's something that you can definitely learn on the job.
Rashmi: Yeah. Thank you all. I also want to mention really quickly for the participants that we're going to save about 10 or 15 minutes at the end for questions. If you're curious about anything you want to follow up on, keep note of that, and at the end you'll have a chance to ask those questions. I now want to pivot a little bit more to general RA characteristics. Do you have any thoughts on what distinguishes good or decent RAs from great, very successful RAs?
Jacob: I'm going to jump in there and say attention to detail, which is something I struggled with when I started here. I've been trying to figure out how to articulate this for a while as I've learned it. I'm only about a year into the job, so I'm still actively learning, but I'd say when you're an undergrad and it's your first coding language and your first econometrics class, you're really worried about avoiding the red code, you know, an error. But what's much more important comes once you get to the point where you can write code that won't break. You think everything's working, but what you don't realize is that data is very tricky, and if you change the order of two lines, you can completely change your results, depending on what you're doing.
Jacob: It's really about knowing, when your code runs all the way through, to still go check and be like, wait, is everything actually right? The best thing you can do as an RA, for yourself and for your economists or your PI, whoever it is, is that when you send them an email with some plots and regressions, they know it's right. Maybe they want to do something new, that's fine, but they know that there's no error in that code and that everything is as they expect it to be, because they're not going to be able to go through and check every line of your code. And if you don't check it either, then six months down the road, when you're trying to write something up, is when you're going to find the error.
Jacob: The RAs that I know at the Fed that are truly outstanding are the ones that are just really, really careful. That is a skill on its own, and it takes time, and I'm getting better at it. When you send stuff, really describe what you did, and keep track of the choices you made when cleaning data, because there are infinite permutations of how you can clean data. You might not realize how important it is until you find a scenario where everything changed because you switched one line or took something out. I'd say just paying attention to detail, being careful, and being articulate about exactly what you did in your code is really the best thing you can do to be a good RA.
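Jacob's point that swapping two lines can change results without raising any error can be seen in a small sketch. The data, the cap of 100, and the helper names `drop_outliers` and `demean` are all hypothetical, chosen only to make the effect visible. Both orderings run cleanly and both return four observations, yet the numbers are completely different.

```python
from statistics import mean

# Hypothetical toy data: hourly wages with one data-entry error (9999).
wages = [10.0, 12.0, 14.0, 16.0, 9999.0]


def drop_outliers(values, cap=100.0):
    """Keep only observations at or below the cap."""
    return [v for v in values if v <= cap]


def demean(values):
    """Subtract the sample mean from every observation."""
    m = mean(values)
    return [v - m for v in values]


# Order A: drop the bad observation first, then demean.
order_a = demean(drop_outliers(wages))

# Order B: demean first, so the bad observation contaminates the mean,
# then drop anything still above the cap.
order_b = drop_outliers(demean(wages))

print(order_a)  # [-3.0, -1.0, 1.0, 3.0]
print(order_b)  # four values near -2000, with no error or warning
```

A quick check such as comparing summary statistics before and after each cleaning step is one way to catch this kind of silent change.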
Brianna: I think something that I was kind of surprised by is how self-motivated you need to be, because in college I feel like every week they were like, all right, this is what you need to do for next week. But here they'll be like, all right, this is the entire project, so you run with that. If you don't have the ability to set deadlines for yourself and to just do it without external motivation, I think that would be hard.
Kira: I have kind of two things, and one is really related to what Jacob said, which is balancing efficiency and thoroughness. I think especially at the beginning, I was sacrificing a little bit of thoroughness for efficiency. You want to get things done, you want to show that you are a good worker, but part of being a good worker is having that thoroughness, that attention to detail. It can be a coding thing, it could be a lit review thing, it could be any part of the research process. You want to make sure that it's not only done in a good amount of time, but also that you've paid a lot of attention to what you're doing. And I guess the second thing I would say is that among all the RAs that I've been able to work with or work next to, everyone's good at their job, everyone can come in and do the work.
Kira: There's often a lot of opportunity to go above and beyond that and say, this is a topic that's really interesting to me, I'm going to ask my PI or other PIs, is this a topic that I can look into when I have a little bit of free time? Is there something I can do with this? I have an idea about this project. I think when you go above and beyond, you show that you're really interested. It shows your PIs that you have the ability and desire to do research, and it shows skills that you can bring into PhD programs as well.
Pascal: I can add that I very much agree with the thrust of what Jacob and Kira were saying. And is it Kira or Kyra? Am I getting that right?
Kira: Kira.
Pascal: Kira, okay. Got it right. I would say number one is attention to detail. Especially from a PI's perspective, the pre-doc is doing a lot of work, and a lot of fantastic work, and we would much rather have a small amount of work that we are confident in than a large amount of work that we have uncertain confidence in. Even if the error rate is 3%, we would rather have half as much work with a 0% or 1% error rate, because if you don't know where the bodies are buried, then you're worried about where the issues are in the whole thing, and you end up spending a lot of time trying to figure it out.
Pascal: Talking to your PI, getting a sense of exactly what they asked for, doing exactly what was asked, and understanding the nitty gritty of what you're doing is super, super important and is priority number one. And one thing for people who are applying for pre-docs: oftentimes there'll be a data task, and one of the things that the data task is actually assessing is ability to follow directions and attention to detail. There might be parts of the data task that are like, remember to do this kind of filtering or remember to do this. Actually do that. That'll be a way to signal that you have attention to detail. Now, that's the most important thing and the baseline thing.
Pascal: That's one thing: just do what has been asked and produce the regression or the figure that has been asked for. But then the thing that puts a fantastic RA over the top is when they're able to interpret what's been done and say, here's a figure, I made the figure that you asked for, here is what I think this means. Here is what I would've expected to find. Here's why it's different. And because it was different than I expected, I then did these other three tests that you didn't ask for, but now I understand this better, and I think maybe this will help us understand as a project what to do next. And so the best RAs or pre-docs are what I would call analysts, not coders.
Pascal: A coder is attention to detail, just do the thing, and then the analyst is putting their thinking hat on and saying, okay, I've done this now. Let me interpret it, let me understand it. Let me tell you as a PI how you should think about it, and let me run a few more things to better understand it. And so the best pre-docs are able to do both of those things. Then also I think you learn something about the world along the way.
Rashmi: Those responses all definitely align with my experience in the past few years as an RA as well. We touched on this a little bit, and I think a few people talked about it. This is more geared toward the pre-docs, but I think everyone has a good perspective on it: are there any skills that are required as an RA on the job, but that a successful university student might not necessarily have needed in college?
Pascal: I think that Brianna touched on this, which is that in college you will have very detailed instructions: okay, you have a class next week, you have a problem set due, you have a paper due, you have an exam. Every task is doable because they wouldn't have assigned it if it wasn't doable. And you probably have all the tools to actually answer it, because they've taught you what you need to know along the way. Whereas in a pre-doc position, sometimes we'll give tasks that the pre-doc doesn't yet know how to do and has to go and figure out, learning the skills necessary to do it. And so you need more self-starting. You need to know where to look to find the answers to your questions.
Pascal: You have to find the answers to how you even do something, or your code doesn't work and you need to figure out where to go online to figure out how to fix broken code. So there's being a self-starter and figuring out how to get things done when you aren't necessarily told along the way, here is the nitty gritty of every single thing that I'm expecting to be done. And then a second thing is time management, which I feel is very important. Obviously it's also important as an undergraduate, but because deadlines are so clear in college and you know exactly what you have to do by when, it's sort of enforced on you. It's not as clearly enforced in a pre-doc position, where we're not going to have deadlines for every single sub-part of a project. But it is important to keep a list of here's everything I need to get done, here's the order in which I'm going to do it, and work through it.
Pascal: Then the third thing I'll say is managing up. When you're in college, you're essentially a team of one, and you're trying to work on a class or work on a project. Whereas when you're in a team, you're also contributing to a larger research project, and the ability to tell your PIs, here's what you should be asking me, or to work with your teammates to figure out how to move the project forward, rather than just thinking about executing your part of the task, is also very helpful.
Jacob: Yeah, on that note, what's cool about research, and I think what a lot of us like about it, is that you don't know the answer. There's no right answer, there's no answer key for the project you're working on. It's not a class where the professor usually knows the answer and there's a correct answer. It's that self-starting thing: you have to be okay with not always having a sense of direction the second you start something. You've got to be okay with doing a lot of exploratory work, and also with learning that a lot of the first graphs and regressions you run are not going to be in any sort of final product. It takes a long time to understand the problem you're working on, and then actually build some structure around it, and then go forward.
Jacob: I just think any chance you have to work with research or write some sort of thesis, anything that's outside that purely structured classroom experience, is really, really important, because it's a weird thing and it's not natural in a way. The process of research can be learned, but it takes time. That's what's so special about it, and that's why it's different from school in a way. So, yeah, you've got to be comfortable being uncomfortable with not knowing that there's a right answer, or that you know everything needed to do the task, as Pascal said.
Kira: I'll add to that, and it's very similar to what's already been said, but there's an incredible amount of autonomy that you have as a pre-doc versus when you're in school. It's a job. You're not going to classes, you're doing work, and anything you want to learn, you can teach yourself, or you can find resources where things are being taught. So it's a little bit of a difference from undergrad, where you do have more autonomy than maybe you had in high school, but it's still different from when you go out and have a job, and especially a research job, where there is even more autonomy in, like Jacob was saying, figuring things out and teaching yourself.
Rashmi: Since I started as an RA, I've found that the relationship-building and teamwork aspect of the job can be really important, especially at the Fed. I've gotten the chance to work with a lot of different people on different projects. So could you give prospective pre-docs any advice on that relationship building? Especially as a lot of people are looking to apply to graduate school and get solid reference letters, that kind of thing. How have you done that, and what advice would you have?
Brianna: I mean, they're just people and they're obviously interested in the same thing that you are, so I think it's pretty easy to just talk about the work and then maybe get to know them a little bit beyond that. I don't know, it's been pretty fun for me to get to know my colleagues, I think.
Kira: I agree, especially because I'm in a university setting. There are not only other pre-docs, and we have our own area where we get to sit together, but there are also a lot of PhD students. There are seminars that we can go to, and we can meet a lot of other professors, students, and pre-docs, not only in our department but in other departments as well. For me, I've done projects where my PI has worked with other professors or other students, and so I've been able to make relationships that way. So, like Brianna said, it's very easy. It's kind of part of the job to make relationships with people. And if you put yourself out there, then you're going to be able to make them pretty well.
Jacob: When it comes to relationship building with your actual PIs, your economists, the people you work for, I was kind of under the impression when I started that once I made a mistake, it was over. They were going to be like, oh, I'll move on to the next RA, you're not up to the task. And no, they expect you to make mistakes, probably more than you realize. They expect you to mess up. So the relationship does not end once you fail once. They're okay with that. You've got to be honest and transparent about it, like, oh, here's what I did, was it wrong? Be careful, but at the same time, they don't expect you to get everything right on the first try.
Jacob: Especially if they give you a task where you have to go learn how to do it, it's almost inevitable that you'll mess it up at first. As long as you're transparent and clear about what you're doing, which goes back to my first point about thoughtfulness and thoroughness versus efficiency, and as long as you're showing that you're interested in and passionate about the work and willing to learn new things, it's okay to make mistakes. That's honestly part of the job. Your economist is not actually mad at you for making a mistake. It's just how it goes. As long as you're clear about what you did so you can find the mistakes and improve, that's really all there is to it. And as long as you care and are interested, you're going to get better, they're going to see that, and you'll be able to build a relationship.
Pascal: I don't have anything more.
Rashmi: Sounds good. So Kira mentioned seminars, and I just wanted to touch on that really quickly. In that same vein, are there any resources that your pre-doc program has that you wish RAs would use or that you would advise RAs to use?
Kira: Yeah, I'll jump back in on the seminar point. At least at MIT there's a lot of… within the department sort of seminars, and pre-docs here can go to any of them. I know sometimes people don't take advantage of that, but I think it's really helpful because a lot of times there are visiting scholars presenting their work, which can be really interesting and inspiring, but also people from the department or PhD students as well. So it gives you a look into what professors do and what students do, and gives you some things to think about for your work. That, at the very least, is one thing that I would encourage people to get involved in if it's available at your pre-doc.
Brianna: I don't think this is at all pre-doc programs, but at least at the Fed, they let the RAs take two classes. Any classes. You can take them at UChicago or Northwestern, and I took one at the University of Illinois at Chicago. As I talked about a little earlier, we're using ArcGIS and I didn't know how to use it, so they let me go take a graduate course, and I learned how to use it and I made a big project. I think it's just nice that you can take classes to prep if you want to do a PhD, or if you just have a skill you want to learn. So I think that was something really nice.
Pascal: Yeah, I'll just echo what Brianna said. I think classes are very commonly offered as part of a pre-doc program. And again, going back to what I said about training, part of this is a job, but part of this is investing in you and training you. One way that can happen is to allow you to take classes, often in consultation with the PI, to say, okay, what do you want to do next? Do you want to go to a PhD program? If you do, here are the kinds of classes that will be particularly useful given the rest of your academic record. This is something that I did when I was a pre-doc. I was working in DC, I took real analysis at GW, and it was a great experience.
Pascal: I took it with some other pre-docs at the same time. That kind of thing is very common. If either your undergraduate institution didn't have some of these classes, or you majored in something else and then decided later on, oh, actually I want to go and do a PhD, then maybe it's useful to take some more of these classes. Math classes are common ones, but econometrics and other kinds of classes are offered too and are very common parts of pre-doc programs.
Jacob: Yeah, I'd just say that I'm not sure how it works in a university setting, but at the Fed there are usually three to four seminars a week at lunch, and the visiting economists have slots to meet with people. There's usually a slot for the RAs. I often meet with the visitors by myself, probably one to two a week when there are people visiting. I really enjoy doing that. I think you just learn a lot about the field. You can ask them about their specific paper, and that's great, but you can also just get their perspective on a variety of things. I've met plenty of people and I still learn something every time. I just really enjoy those conversations. I think it was said in a different context, but they are just people. They do happen to be smart and interesting people, but they're still just people. So talk to them. If you're passionate about this, they'll see it come through, and you can learn something.
Rashmi: I think one more thing I wanted to mention is that with a lot of pre-doc programs, you have a cohort of other RAs you can work with. One unexpected resource for me was my fellow RAs: connecting with other people who are in a very similar place in life with similar interests, and working together. I know this past cycle, when this cohort of RAs was applying to PhD programs, they got together and met and would talk about their applications and that kind of thing. So I think that's also a really cool part of pre-doc programs that might not get discussed as much. In your time so far as a pre-doc, what has been the most challenging or difficult part of your job, and what has been the most enjoyable, your favorite part of your job?
Brianna: I guess this is something I talked about earlier, but being a self-starter was not something that I was very good at in the beginning. When I first started, I was like, oh, y'all aren't going to tell me the 10 steps I need to do to get this done? That was kind of a transition for me. That was a hard thing, but it gets easier every day, and now I'm feeling pretty good about it. And then my favorite thing, I don't know, I really love doing the outreach side of things, which is not what most pre-docs do, I would say. But just connecting with people who are doing real things on the ground. I think I really liked getting to do the hard econ and then also the soft, people-based econ.
Kira: I'll say that a challenge for me, going back to the efficiency and thoroughness aspect, which was really difficult, was an added thing on top of that: knowing when to ask for help. Knowing when to say, okay, I'm trying to do this all by myself, I thought I could do it in this amount of time, and I'm trying to be thorough, but it's not quite working as I planned, so I need to ask for help. You need to know when to do that so you're not wasting a lot of time just trying to swim through it. Not really getting anywhere is not helpful for anybody. So it's learning how to really ask for help when I need it and not be afraid of my PIs. Not that I was ever afraid of them, but I was afraid of asking a dumb question or anything like that.
Kira: Not doing that is something that I'm still learning. And then the best thing, I think in general, is coming to terms with my research identity. That has been a really fun thing to do throughout this, along with talking about my ideas and being really heavily involved in projects. I'm currently co-authoring my first paper at the pre-doc, so that's been super, super exciting. There are so many opportunities to do so many cool things and to learn about yourself through research.
Jacob: Yeah, for me, kind of with Kira, it was really that thoroughness thing kind of realizing that they would actually, you know, much rather it take more time and it be right, you know, and that I just really did not get that through my head for a good six months. And so that took a while for me to learn, frankly. Um, but, you know, I'm getting there, so, uh, that was really hard and I just, you know, I kept going, oh, why are they so annoyed? There was a bug, you know, I did it so fast. And I was like, well, that's the whole point. I don't know for some reason that took a while for me, but that was definitely the hardest part. For me, my favorite part is, um, you know, I want to do what my bosses do and what the visitors do.
Jacob: That's the career I want. So I just love the access that, you know, working at a pre-doc institution like the Fed, or a university like Kira's, gives you to people who do exactly what you want to do. It's awesome to pick their brains. It's awesome to talk to them, hear them give their papers, their talks. They're usually at the cutting edge, so you're really seeing where the field is right now. And you know, if you just read journal publications, they're like six years old. Seeing what's actually being worked on right now versus, you know, reading stuff that's kind of outdated is really, really interesting, and getting to meet those people and getting basically the same access to them as the economists is really cool. That's definitely my favorite part.
Brianna: Just say one more thing. I think something really special here is that you come in with like 10 to 15 other people who are exactly your age and who have just moved to this same city for the very first time in their life. And I think if I had joined a corporate job where everybody was like 35 and they've lived here for years and they had kids, it would be a lot different of an experience. So it's like you kind of have a built-in group of buddies at work, which is something really special.
Rashmi: Yeah. So during the process of being an RA, a big part of that for a lot of people is deciding whether a PhD, specifically an economics PhD is the right choice for them. For the pre-docs, could you all talk about what your journey has been like, how you decided and sort of where you might be going next?
Brianna: Sure. I have my master's in economics, and I think that's about as far as I'm going to go in the world of econ. But I think this job was really good for figuring out if I did want to get a PhD. I mean, you really see what people do day to day when they have a PhD, clearly, like people like Jacob think it's awesome and that's what they want to do, which is also… and people like me, I'm like, I just don't want to work on a project for five years. It's just not like the way my brain works. I think it's good to really figure out whether that is or is not something you want to do before you jump right in.
Jacob: Yeah, I'm just going to complement that, because, you know, you've really got to see if you are passionate enough about the questions you're asking, because it's not very fulfilling in the short term, in the sense that projects take a long time, way longer than they do in industry. And, you know, there's a good reason for that, but still, you've got to be willing to be patient and be really rigorous and do way more stuff than you think is necessary in order to ensure your results are, you know, correct and interesting. Research is a long and kind of hard process, and so it's about seeing what kind of research is being worked on. Is that something you care about?
Jacob: And then even if it is what you care about, do you want to do research in general? You know, even once you like that topic, research is not for everyone, even if you think it's really interesting. And so I think it's really good to get an idea of that. And again, yeah, you see your bosses come in every day, you kind of see what their day-to-day is like. So, you know, if the amount of time it takes them to do projects and stuff is not for you, if it's frustrating how slow it can move, or how many little setbacks you usually have in your average research project, then, you know, yeah, that's good stuff to consider. So it could be a little frustrating, and it's definitely not for everyone. I think that's probably okay if it's not for you; it's really nice to get to see what it's really about. And I think that's kind of one of the main things: how long it takes and kind of the process, and you get to experience that.
Kira: Yeah. I'll say again, my background is in psychology. When I came into this job, I was looking more at the psychology aspect, the psychology PI that I have. I was thinking that I was going to like that work way more. But the economist I work with, she's a labor economist. She has a lot of stuff about unions and inequality, and the stuff I love in psychology is about diversity and inequality. These kinds of things have a lot of overlap. And so, you know, I came in with the goal of doing an organizational behavior PhD. And that's still the goal, but I think I'm going in with more of an interdisciplinary approach and interests in both psychology and economics research for sure. But yeah, having this experience has made me really appreciate the nuances of different research topics and not to brush aside a topic. You know, again, I took maybe macro and micro in college and I was like, I don't really care about this. But actually seeing what the research is doing, and that there are all sorts of these little niches that can be super exciting to you, that's really helped me to figure out exactly the things I want to research, and it is still driving me to get a PhD.
Pascal: Let me chime in here a little bit just to say that there's like…I totally agree that one of the best arguments in favor of doing a pre-doc is that it can help you learn about your own interests and your own goals about what you want to do next. In particular, if you're thinking about grad school, that's a huge investment. In most places, in most disciplines, you know, it's five plus years of a big investment. And at least for me as an undergrad, I had no idea really what research was or really what a PhD would be. So it was really helpful to work as a pre-doc to know, okay, this is what research is, this is what it would be like in this situation. And then you can decide if, yes, it is something that you want to pursue or something that you don't want to pursue. So I think I'm equally happy when we have pre-docs who use this opportunity to learn that they don't want to do a PhD in economics, because that's very useful. It's very useful to learn that early on, before you're doing a five year PhD program. If there's uncertainty, this is a great way to be able to learn about your interests and learn about this research field without committing to something huge like a five year PhD program.
Rashmi: Yeah, definitely. Definitely. In talking a little bit about research interests, so for each of us, could you talk about what your research interests are and how you develop them and how pre-docs might go about developing their research interests?
Jacob: Yeah, I can go. Some of my research interests are econometrics and machine learning based. I kind of have a smattering of other vague interests, urban, health, labor, but you know, who knows? But yeah, how I developed those: I mean, I really loved my econometrics courses, which, you know, is not for everyone, but I thoroughly enjoyed those classes. You know, they were hard, I wasn't that good at them or anything, but I enjoyed it. And I enjoyed doing econometrics in practice, which, you know, has been affirmed by this pre-doc. I think, again, yeah, getting to write a thesis like I was able to do at my undergrad institution was really cool, because I got to put into practice some machine learning stuff I had just learned in my stats course. And I got to see that I actually do kind of like doing this, even when it's in the research context and it's much harder than just doing a little project or homework assignment for a class.
Jacob: And then again, yeah, I kind of do a lot of macro research at the Fed, but I did seek out one of our econometricians and work with them on a project, and I think it's really cool. I'm kind of continuing to affirm that, okay, I do like metrics, that's something I'm intrigued by at least. And you know, I've also done a bit of labor work with someone on the cutting edge here, to see that that's still something I'm interested in. And so I think, you know, at your pre-doc and before, if that's possible, just try the field and you'll kind of see. If you're repelled by it right away, maybe it's not for you, but if you're interested in learning more, you know, if the research is not a drag and you feel like you're actually interested in what you're working on, then that's a good sign. And so, you know, looking for the opportunities to find out if it still interests you after you do a little bit of it is a really good way.
Kira: For me, I sort of alluded to this in the last question, but my interest is in diversity and equality in organizations. I've sort of fallen into also looking specifically into academia, into diversity or the lack thereof in academia, which is very interesting as somebody trying to go into academia as well. And for me, part of it is, similar to what Jacob said, if you find something that sort of tugs at your interests, you know, keep pulling on that thread, keep trying to get yourself on projects that are related to that. When I was an undergrad, with a lot of undergrad RA stuff, you're kind of thrown into a project and you don't really know much about it, but the ones that were really interesting to me were, a lot of times, about gender diversity. And I was like, I want to do more with this.
Kira: And then I ended up doing a research internship a couple summers ago, and that was the first time I got to own a project, pretty much. And again, those kinds of projects were about gender diversity in academia or gender diversity in organizations. I was like, okay, I want to keep doing this kind of thing. And so when I was looking for pre-docs, I was looking for those kinds of topics that PIs were studying. And so it's further sort of impressed on me that this is what I want to do. So yeah, if you find something that you like, grab onto it and look for things that are related to that.
Brianna: I feel like I just kind of learned by doing. I was on a project that I was like, I'm really enjoying this, then I was like, oh, I guess this is my interest. I guess I'm interested in inequality broadly. I've kind of looked into some housing things. I just wrote a piece about when the interest rates were really low in Chicago. Like, who refinanced? And it turns out white people and it’s like, well, how can we make it more equitable? Stuff like that I think is really interesting to me, but I just learned from doing projects.
Pascal: One piece that I can add on this is a little bit of a note that it is totally okay if you don't have well-defined interests or you take a meandering path to developing those well-defined interests. I can give a little bit of my path. When I was probably in the middle of undergrad, I was probably most interested in development economics, and then leaving undergrad, I started pivoting more towards public economics. Then when I was entering grad school, I wrote my statement mostly about labor and environmental economics. Then I got really inspired by a public economics class in grad school and decided to do more on those topics, related to some of the stuff that I'd worked on as an RA. And since then I have been more on public and household finance. You're going to try a lot of different things.
Pascal: You're going to have exposure to a lot of different things, and hopefully one or more of those things will inspire you. And when you get to grad school, you can choose what you want to work on. You know, if you apply to grad school, you'll write a statement of purpose and you'll have to write something about the kind of thing that you want to research, but nobody will ever hold you to that statement of purpose. And it would be silly if we did. You're going to arrive in graduate school and then, you know, professors are going to teach you. And so hopefully when we teach you, maybe we'll inspire you to do something that wasn't exactly what you thought you were going to do on the way in. And so you need to decide that you're interested in doing research, but you don't necessarily need to have a very specific topic, like, oh, I know what my dissertation is going to be about. That is very unlikely, and not having one is totally okay.
Rashmi: Yeah. Thank you all for that. Then one last, very broad, general question: is there any other advice you have for prospective RAs? Anything that you think they should know?
Brianna: Have fun with it. It's not that serious and it should be fun and you should enjoy it.
Kira: Yeah, it looks like that. Also just be open to try new things, and meet new people and just have a great time.
Jacob: Yeah. Pre-docs are a learning experience and I'd say yeah, you know, be open to your path changing or maybe you really thought you wanted a PhD and let's say you find out you don't and it's a little upsetting. You know, if it's really not for you, let it kind of guide you. As everyone said, enjoy it if you're not enjoying it at all, you know, think about why, because like you know the reason I think I want to apply is because I really like doing the research I come in and do every day. And if I didn't, I would seriously reevaluate if I wanted to go down this path. And so, you know, enjoy it. Yeah. Talk to new people, be open to your path changing or your interest changing. I think that's what it's all about.
Pascal: One thing I'll add is that if you decide to do a pre-doc, it is usually a very close working relationship with a small set of one or more PIs and a group of RAs. And it matters a lot who you're working with. What I mean by that is that, you know, the quality of the people, how much they care about you, how much they will support you, how much they will teach you and invest in you is really, really important. Much, much, much, much, much, much, much more important than how famous they are or what particular institution they're at. And so you should treat this a lot like, you know, you're going to invest in a group of people and be in that group for a couple of years, and being comfortable in that space is very, very important.
Rashmi: I just want to say thank you to all of the panelists. I think we can end a little bit early, but yeah, thank you for sharing your insights and your experiences. I think it's really valuable for pre-docs to, you know, have as much information as possible before heading into this process. So yeah. Thank you.
How to Apply and Prepare for a Predoc
Moderator: I just wanted to start out by asking everyone, um, what are all of your names, your current titles? Kind of introduce yourselves, uh, where do you work and what predoc program are you all affiliated with? Rebecca, we could start with you, if that's all right?
Rebecca: Absolutely. Hi, everyone. I'm Rebecca Toland. I'm at Yale University. I'm a senior lecturer here in the Department of Economics, and I'm also the director of Research Support at the Tobin Center for Economic Policy. Um, and I'm here representing, uh, the Tobin Center slash Economics Pre-doctoral Fellows Program. I'll put a link to our program in the chat.
Moderator: Maybe Price next. What program are you with?
Price: Yeah, I'm Price. I'm currently a pre-doc at MIT Sloan. I guess my official position is technical associate, but it's basically a pre-doctoral program. Nice, nice to meet you all. And excited to be on this panel.
Moderator: Thank you, Price. And Elena?
Elena: Hi, everyone. My name's Elena Gupta, and I'm a research assistant at the Federal Reserve Bank of Chicago.
Moderator: Once again, I'm Declan, I'm one of the coordinators, and I'm a current research assistant at Columbia Business School. So just to start out with some questions about preparing and applying for pre-docs. Where could interested students start looking for available postings? When should they start applying? What was the process like for some of you? This question is kind of geared towards all of you. So in whatever order you'd like to answer, feel free to speak up.
Price: I guess I can go first. So there are a lot of great resources to find pre-doc opportunities. And I think it's especially true the last few years. I know I relied a lot on the pre-doc website. There's a section called opportunities, they have a bunch of positions from econ, uh, kind of other econ related fields. So that's a great opportunity. I know the NBER has a page where there's positions for both research assistants at NBER and at other places. Then lastly, I think there's a few Twitter pages, um, that post regular updates. And then if you have specific schools or professors you're interested in, you can also look at their pages. So I really relied on those a lot. And in terms of timing, I think I started applying in September, October, but I'd say it really starts to pick up towards maybe like December and, and the beginning of the year.
Elena: Yeah, I definitely agree with everything Price just said. I specifically was also really interested in working at a Federal Reserve bank, so you could go to any of the Federal Reserve Bank websites to see what research positions they have open. Um, and for, I know for the Fed banks, that's usually a little earlier, so September, October is a really good time to start looking into that and applying.
Rebecca: Yeah, maybe I'll just add on to say that at Yale, we usually have our recruitment webpage live by around the end of October, and we have a rolling admissions process both for applicants and for faculty. So I am constantly adding positions as faculty decide that they wanna hire pre-docs for the next year. So any time is a great time to apply, basically between, I would say, September all the way through the spring. Because I think some programs have very regimented schedules, like, this is when we open the application and review applicants, but really you'll see when you apply that new positions for various programs are going to keep getting posted from the fall through the spring. So I think getting your application materials together in the late summer, to be prepared to apply starting in the early fall, is a great idea. But then keep your eyes open as new positions get posted, so you're ready to apply as positions you're interested in pop up.
Moderator: And kind of building off of that, what materials were generally required for your applications, um, are some more important than others? What did you have to get together to be ready to apply for these?
Price: Yeah, I guess we'll go in the same order. It depends, but I think at a minimum, you're gonna need your CV or resume, and try to format it to be kind of specific to the econ or research field. And there are probably resources at your school, and other people can help you with that. A lot of positions also require a cover letter, where you kind of lay out your specific interest in that position and maybe go a bit more into your experience. And some positions also require a coding sample or possibly even a writing sample. For a coding sample, you can probably use something like a project you worked on, maybe a senior thesis if you did one, maybe class work you did. And writing samples are a little more rare, but I think I submitted a few of those. And I think I used some papers I had written for my economics classes. So those were kind of the materials that I encountered.
Elena: I also encountered the same materials. And I would say for me, the most important ones were the resume and the cover letter. I think I spent most of my time tailoring cover letters specifically to the research institution you wanna be working at. So it's really important to invest in the cover letter and spend time seeing what you value out of a job and what the institution or the bank or anything is looking for. So that's what I would recommend.
Rebecca: Yeah, I agree with everything that Price and Elena just said. I would just add that at Yale, and I think increasingly at other programs, there is either a required or an optional diversity, equity, inclusion and belonging statement. Ours is optional. And I would encourage applicants to think really broadly about what one could put in that statement. Of course, if you have personal characteristics that could contribute to diversity in a predoc program, you can of course call those out. But you can also think about ways in which your research interests might be able to contribute to diversity, equity, inclusion and belonging in your program. So really think broadly about how you could contribute to diversity. And I think a lot of programs are really looking for candidates that can contribute to DEI in their programs.
Moderator: Definitely. And also Thomas has joined us now as a panelist as well. I'm curious also, what courses, Rebecca and Thomas, do you look for on pre-doc applications? And more for Elena and Price, what courses did you find to be the most helpful during your experience so far?
Rebecca: I can start. Um, we have a decentralized review process. As the director of the program I actually don't review applications myself. In fact, the faculty supervisors review applications. And I would say that there's actually some variability in what faculty members are looking for. Because what they're, you know, going to be looking for is, are you going to be able to bring some of the skills, some level of baseline skills that's going to be able to contribute to their projects. Um, also with the knowledge that you, they're going to train you when you arrive and you're going to get additional education and training in the program itself. But I would say, math courses are helpful, both for the RA or the pre-doc work. But the other thing that faculty are going to look for is if you do in fact decide to apply to PhD programs in a year or two years are you going to have the level of quantitative training that is going to be necessary to have a successful application?
Rebecca: They're going to want to see that you're kind of coming in with some sort of baseline mathematics level, and that's going to be different for different faculty members. But I would encourage you to take as much math as you are able to take at your institution. You know, at least through multivariate calculus, and if you have the opportunity to take linear algebra, especially a proof-based linear algebra class, that would be great. And the same if you have access to and are able to take real analysis or other proof-based math classes; that's wonderful. But certainly not required. Many of our pre-docs here at Yale are taking proof-based linear algebra, and are taking real analysis while they're here as pre-docs, so don't let that deter you. And then I think there are other kinds of courses. Many, although not all, pre-doc positions are empirical in nature. And so another set of skills that faculty are frequently looking for are coding skills. And any classes that you can take in statistics and data science or computer science where you're going to be trained in coding as part of those courses, I think, are helpful for faculty to see when they are looking over your transcript as well.
Thomas: This is Thomas Clear. Sorry, I joined late. I had a technology bug that was biting hard today. So, in terms of classes and skills, I'd like to second what Rebecca just said and, in just maybe a couple sentences, cast it in a slightly different light. Some applicants are coming late to the field of economics in the undergraduate program. That's why the question about specific classes is a little tricky, because in our expectations, there's a relatively straightforward answer when it applies to an applicant who's been an econ major and sort of designed their class schedule that way. And what Rebecca said is right on.
Thomas: But there are other ways you can connect to a pre-doc or a research assistantship position. It doesn't have to be econ, it could be maybe finance, it could be physics, you know, anything that sort of gets you into quantitative and sort of coding related mindsets. And then, when you talk about classes, you also wanna talk about skills. That's the way Rebecca answered the question. I would do it exactly the same way. You wanna be interested, at the end of the day, in economic questions, right? Even if you come late to that field. But the kind of skill-set that you need, and that you can demonstrate either through classes that you've taken on the math and coding side or through things that you've done practically, is about handling data sets and the code that goes with that. Thank you.
Moderator: Price, and Elena, were there any classes you took beforehand that you found especially useful during your times as an RA?
Price: I of course agree with everything the previous panelists said. Some classes in particular that were really helpful for me were, of course, intro to econometrics. And then I also took a few econometrics electives where we worked with datasets in R. And I'd say any class where I was using R, which is a popular programming language in econ, or similarly Stata, is super helpful, because a lot of the time the coding exercises that are gonna be sent to you are very similar to what those homeworks or projects look like. So that really helped me. And I also agree, I took some other coding classes, and even though I don't necessarily use those languages in my work, that mindset of problem solving using a coding language is really helpful. So those are my 2 cents, I guess.
Elena: I definitely agree. I would say coding is really important. I use data every single day. So using or getting to know programs like Stata or R is really, I would say, critical to your application. But I also want to make sure, like, you guys aren't scared. I know with math classes, everyone emphasizes getting to real analysis. I did take a lot of math classes. I was a math double major, and I think it helped on my resume a lot, but it does not necessarily impact my day-to-day work. So I think if you can show on your resume or in an interview somehow that you do have critical thinking skills and quantitative skills from somewhere else, I think it will help you maybe make up for that lack. I don't want anyone to be discouraged if they don't have all the math classes, and don't be afraid to apply.
Rebecca: I just want to build on what Elena said there about not being worried if you don't have all the math classes. And again, different faculty are going to evaluate transcripts in very different ways. Some are really going to be looking for candidates that have had pretty advanced math training. But if you are not someone that had the opportunity to take those classes, there are likely faculty out there who will still be enthusiastically considering your application. I can give a concrete example of a faculty member here at Yale who hired a pre-doc a few years ago who had only taken, I think, if I remember correctly, Calc II in college. And she came here to Yale and she took math through analysis, and she is now a PhD student at the Harvard Kennedy School in Public Policy. So that can definitely be you. Just be persistent and apply widely and you will find a good fit for yourself.
Moderator: Building off of that, for Rebecca and Thomas, what are some of the skills, coding and otherwise, you look for in a pre-doc? And for Price and Elena, what are some of the skills you found most useful, both coding and outside of hard skills?
Thomas: All right, thanks. Ella, let me start with that one. There's the skillset that is directly related to working with data. Sometimes you get that from a class, like when you learn how to code, or you can get that from a project either inside or outside the classroom. So these are things that we're looking for. Those are also good things to highlight in an application. And then there's sort of related skill sets, because you're not doing research support in isolation, right? You need to communicate and you need to document. The way I usually phrase this to our incoming students: you hardly ever have a project that either begins or ends, or both begins and ends, within your two years as a research assistant or as a pre-doc.
Thomas: You stand on the shoulders of others, and then you also leave things behind for other people to pick up. So it needs to be documented well. So that's an element of communication. And oftentimes, research is an activity where you're not the only research assistant or pre-doc supporting it. There may be others that you need to communicate with in terms of who does the work, but even if you're the only one supporting a specific project, you certainly need to communicate with people on the other side. And that could be multiple individuals. Maybe it's one principal investigator, maybe it's a whole group of investigators, and they don't necessarily have to be in the same place. They could be at the University of Chicago and co-authors could be anywhere around the world. They could be at the Federal Reserve Bank of Chicago, and again, co-authors could be anywhere else.
Thomas: So communication is important. That includes asking questions, because that's how you become effective in your role, right? I mean, let's not forget, as a research assistant or as a pre-doc, you're being kept busy with meaningful work, but you're also learning along the way, not just about how to do things, but also about sort of the bigger picture. And that's the added bonus to communicating well with your peers and with your PIs about why things are happening. You're learning in the process how to fashion research projects, and that's a skill you're gonna build on later, for example, if you decide to go get a graduate degree. So I'm gonna stop right there.
Rebecca: I might turn it over to Price and Elena, 'cause I think Thomas gave an amazing answer there and I'm not sure that I have more to add.
Moderator: Have there been any coding or non-coding skills you guys have found especially important, Price and Elena, during your RA experiences?
Price: Elena, you can go.
Elena: I wasn't sure of the order, but you can go first. We can stick to our order.
Price: I touched on it briefly. R and Stata are the two biggest languages, and it's okay if you know one and not the other. I only knew R and then picked up Stata along the way. This is a pretty niche one: LaTeX, however you pronounce it. It's not a must and it's actually pretty easy to learn, but it's useful because, at least in my experience, it's used a lot when formatting tables and stuff, and it looks nice and professional. And to reiterate, communication skills and organization skills are really important. Some of the time you're gonna be working in a more independent, or at least somewhat self-guided, fashion. And so it's really helpful if you can stay organized. I think keeping notes is really important to give yourself some structure and set yourself up for…
Rebecca: Success.
Elena: I definitely agree with everything Price just said, and I know Thomas and Price both emphasized soft skills, so I also just wanna say soft skills are so important. That is the one thing, I guess, you use more than coding. I feel like every day is talking to people. And even though research is very independent, I think building relationships with your PI or your economist matters, and even in the RA position, there are a lot of other research assistants my age. Being able to talk to someone my age and say, hey, can I sit down with you and see what you're doing? Learning from each other is super important.
Moderator: Thank you all. I'm also wondering now… this is really more directed at Rebecca and Thomas. What qualities in an applicant stand out for pre-doc applications, especially coming more to the interview stage kind of post data task?
Rebecca: In my current position I don't conduct pre-doc interviews, but I did in a previous position. I would say that one thing that will really stand out is intellectual curiosity about the prospective project or projects you would work on with the faculty, or non-faculty supervisors if you're applying to a Fed job, for example. But also intellectual curiosity about any coursework or especially research-oriented work that you've done previously. A typical thing you might get asked in an interview: if you turned in your senior thesis or another term paper as a writing sample, if one was required for your application, the interviewer might ask you to talk about it. It's always helpful to show the interviewer that you, one, understand what you did and can communicate about it in an intelligent way. But also just show that you're excited about it and why you're excited about it. Potential supervisors are going to want to understand something about your research potential, because when you ultimately apply to PhD programs, that's something that admissions committees are going to care about. So I think intellectual curiosity goes a long way.
Thomas: I'm going to totally support that. In fact, I've asked that question many times, you know for an applicant doing the first round interview, to give me an elevator pitch on a specific research project that they've been involved in. It's about curiosity and about being able to reflect on what you've done. So, yes. Absolutely.
Moderator: And is there anything, in your experience with RAs, that you wish your RAs, or RAs for faculty members you've worked with or whose applications you've looked at, knew before they began a pre-doc program?
Thomas: Let me start with that one. There are two things that come to mind. Number one is that a position as a pre-doc or a research assistant does not tie you down to a specific path that you're gonna take afterwards. You do not have to enroll in a PhD program in economics. You're driving the bus. We're providing you an opportunity to learn something about the profession and, in the process, build out your skillset, but also learn something about yourself. And it is possible, and it happens every year, that somebody learns that a PhD program is not the right path for them, and that's okay. That's part of the learning process. You're an apprentice in either role, pre-doc or research assistant. You're an apprentice, so you're supposed to be learning, and the answer is not predetermined.
Thomas: And the other element that is often underappreciated, it's maybe the answer to a slightly different question, but I'm gonna bring it up here anyway: what are RAs most surprised by when they start their job? It's about the learning environment. When you interview, you think about your individual qualifications, and then you also think ahead: is this the right thing for me? Do I wanna spend two years, to take that time away? Some of my friends are going to graduate school directly or doing something else, starting a job, starting a career, possibly. Do I take that two-year detour? It's all about your individual perspective. And then the most positive surprise, in addition to all the things that come through, is that you're gonna be part of something bigger.
Thomas: From the Federal Reserve Bank perspective, I would describe it as doing your job as a research assistant in an office setting, and that is so incredibly powerful because it opens up an avenue of learning opportunities that you never thought about when you started the job, because you're surrounded by others. You're surrounded by economists that you may not work for directly. And you're surrounded by a number of research assistants, most of whom do not work on the project that you're supporting, but you can all learn from one another. And that is one of the most powerful things that happens during the two-year pre-doc or research assistantship. Thank you.
Rebecca: Yeah, I agree with everything that Thomas just said. I would say something that will help you going into the position is to be enthusiastic about the research assistant's work, no matter what it is, or try to be. Because the reality is that a lot of data work can be kind of repetitive and tedious. Oftentimes the lowest-ranking person in the research group ends up doing those sorts of tasks because someone has to do them, right? Hopefully your faculty mentors will also allow you to contribute to many aspects of the research process and give you tasks that are intellectually challenging. But when you're doing some tasks that are a little bit more tedious, like repeating the production of tables and figures until you get them exactly how your PI wants them to look, try to keep the bigger picture of what you're doing in mind. That will help you navigate through some of the days where you're doing more tedious work. Then also, when you're looking for positions, look for positions where you might have access to other things outside of your RA work. For example, the ability to take or audit courses or to go to research seminars that your school or organization puts on, so that you have some other outlet at work that is beyond the RA work.
Moderator: This question was intended primarily for Rebecca and Thomas, but I think Price and Elena might also be able to speak to it. Are there any skills you would encourage prospective RAs to focus on developing before beginning a pre-doc? That's either coding or soft skills, things they could work on before starting.
Thomas: I mean, between the time when you interview and the time that's left until your graduation and the start of the pre-doc or the internship spell, you're not going to materially change your set of experiences. That question sometimes comes up during the interview, or from somebody after they get the job offer. They ask, what should I be doing to help me start in this job? And basically the recommendation typically is do more of the same, right? Or take the opportunity in your last semester as an undergrad to do something that you really want to do. Take that class that you haven't been able to take. You've fulfilled our requirements, you're going to be busy enough, and you've shown us in your application that you have plenty of data skills.
Thomas: If you wanna do more of that, then do more of that. See if you can land another project or engage in a class that gives you another chance to practice your coding skills. Don't learn a new software for the last quarter or last semester. You don't have to. You're deep enough in something that you can learn whatever comes your way. And I don't know exactly what kind of projects you're gonna get assigned to when you start. So either do more of the same, or do something that you really enjoy and haven't had a chance to do because you've been so busy building out your resume and your transcript to land this job. Take advantage of your last quarter to do something more fun, maybe. Thank you.
Moderator: I think that's great advice. Price and Elena and Rebecca, if you guys don't have anything to add, I was also curious more, especially towards Price and Elena on, what did you do to prepare for pre-doc applications? What did your application process look like for both of you?
Price: I was really lucky that my school had a program where you could be mentored by PhD students. In my case I was at the business school, but they were in econ fields. That got me started on the path to considering a pre-doc. One thing that was really helpful is we read a lot of interesting applied papers. Having that understanding of what econ research is like, and getting a basic familiarity with what papers are like and what research methods they use, was important to me. But more concretely, when I was starting, I reviewed past classes where I had done coding or data exercises, and I remember I actually redid some of those homeworks just to sharpen my skills and get a feel for it.
Price: I applied to a fair number of positions and I ended up having a fair number of data tasks to do. So I got a lot of experience just by spending time on the data tasks. In some instances I probably spent more time than was recommended. The process of doing those data tasks, seeing what type of questions are asked, and, if you're not super familiar with that kind of data task, trying to learn while you do it, was really helpful for me and helped me build skills.
Moderator: Yeah.
Elena: I guess for me, I would say obviously take all the classes you can about econ and coding, but for the actual application process, I definitely spent a lot of time on my cover letters and resume, like I said earlier. Practicing interview questions was also really helpful. The career center at my school would go over the most common interview questions, and just practicing talking about research was very important: talking about my research ideas, what paper I was working on for my senior thesis, being able to communicate your skills. Even if you have coding skills, if you can't communicate that to other people, how are they going to know how good you are at Stata or R or Python or whatever it is? So I think that's really important to practice.
Rebecca: This is kind of before you apply, but one thing to keep in mind, especially for those of you who are first years or sophomores or even juniors, is to get to know your faculty members, especially the faculty members that will be able to serve as strong references for pre-doc applications and ultimately for PhD program applications. There are the obvious ones: if you write a senior thesis, your senior thesis advisor. But it can also be instructors from classes. I'm assuming that some of you on the line might be at large schools where it's harder to get to know faculty members. I would say that even in large classes, you can get to know your faculty members. What I would recommend for those of you in that position is to go to their office hours.
Rebecca: For example, if you're taking undergraduate econometrics and there are 200 or 300 or even more students in the class, go to the faculty member's office hours and tell them that you're considering applying to PhDs in economics and you're thinking about applying to pre-docs. If it's an economist, they'll probably be super excited to hear that. Another good technique, even when you're out of that faculty member's class, is to keep following up with them. Check in every few months or so to let them know what you're doing: that you're doing a summer internship, what classes you're taking. Then when it comes time to ask that faculty member to serve as a reference for you, they'll be able to serve in that capacity in a strong way. When you do ask them to serve as a reference, make their job as easy as possible. Send them copies of the materials you're submitting with your pre-doc applications: one of the cover letters you submitted, your resume, your transcript. Tell them why you're applying to pre-docs. Give them as much information as possible so that they can be as strong of a reference for you as possible.
Moderator: And during your application processes, was there anything you felt that was overemphasized or that you were overly concerned about now having started a pre-doc or kind of having gone through the process you realize just wasn't as important for the job?
Price: One piece of advice that I got, and something I think is really true, is that it's not all about what institution you end up working for. It's not all about prestige. It's really important to have a connection with the faculty you're working for. So if you get to an interview stage, you should not only think, yes, they should learn about me. It's also an opportunity for you to learn about them. If you actually feel like you have a good connection with a faculty member or whoever you're gonna be working for, that's really important because they're gonna end up mentoring you. You wanna work somewhere where you're gonna get something out of it and where you're gonna be mentored well. I think that's a piece of advice that maybe gets left out, because there are so many things you feel like you have to do to get in somewhere, but you also need to think about: where do I want to end up, and what do I want to get out of this?
Elena: I would say for me, I was really concerned during the application process about knowing a lot of different coding languages. So on my resume I put R and Stata and SAS, even though I feel like I didn't know all of those languages really well. Now that I'm actually working as an RA, I'd say it's more important to know one language really well than to know three not super well, which is what I was doing before. Now that I'm really focused on Stata, I have a way better handle on it. But I don't need to know all three super, super well.
Moderator: Before taking some questions from the audience, any last tips or any other application- or preparation-related advice for future RAs from any of you?
Thomas: My apologies for coming late, so I hope this is not redundant. It's about the application schedule and when one should think about getting applications out. The pre-doc website is a wonderful one-stop shopping place to find out about job openings that are live. The Federal Reserve System is there as well. The placeholder there is a website called fedeconjobs.org. Steven Lamb kindly shared the link. The Federal Reserve System is a little funky in the sense that we don't all hire on the same cycle. The Board, for example, is large enough that they have a fall and a spring hiring cycle every year. The 12 regional banks probably only do one cycle. In Chicago, we start early. New York starts early, I think San Francisco starts early, and I don't really have a good sense of how everybody else is lined up. But you can find out through Fed Econ Jobs if the postings are open. It can vary by bank. Unfortunately, there is no common app for the Federal Reserve System, so you could end up applying to the Board and all 12 regional banks if you really wanted to go to one of the Federal Reserve banks. All right.
Moderator: Well, thank you all so much. I think I'm going to start taking some questions the audience has been asking now, um, and I think this is one all four of you could speak to. When applying for pre-doc, what kind of recommendation letters help the most? Would a data scientist be a better fit than an economist? Where did you kind of go looking for that or where would you look to see recommendations from?
Thomas: I think the letter that works best is from the person that can speak to the skillset that we're looking for when you come join us, right? When I read letters of recommendation, I read what they say. Is it a data scientist or is it an economist? If they've observed you working with data, then that doesn't matter, right? If your letter of recommendation comes from somebody you worked for as a lifeguard at a public pool in the town where you grew up, that's gonna address different qualities, but it's not gonna speak to the skillset that is really core to a pre-doc or an RA position. But recommendation letters are very important in this process.
Rebecca: I would say first best is a faculty member or other recommender who can speak to your research potential. Examples of that would be a senior thesis advisor. If you have had the opportunity to do research assistance for a faculty member or other mentor, that person could be a recommender for you. I realize that not everyone has the opportunity to do those things before applying to a pre-doc. If not, that's no problem at all. I think the next thing I would go to is exactly what Thomas emphasized: someone who can speak to your skills in any dimension. So if you're an econ major, that could be your econometrics professor, for example. Which again is a good reason, when you're in those classes, to try to go to your professor's office hours and make sure they know who you are, because they'll be able to write a stronger, more specific letter if they know something about who you are as a student beyond just the final grade that you got in the course.
Rebecca: And then I just saw a related question in the Q&A: is a recommendation letter a super important factor? Different programs are going to put different emphasis on recommendation letters versus other aspects of your application. So I think there's no black and white answer to this. But in what I've seen, in a lot of pre-doc applications, instead of requesting a letter, the application will ask you to list two to three references. It's a little bit more like applying to a job. Usually when you are applying to a job, the reference check is the last thing that is done: after the employer evaluates your skills and decides that you're a top candidate they want to hire, the last thing they will do is check your references.
Rebecca: That happens a lot in pre-doc positions. Your potential employer will probably reach out to your references by sending an email saying, hey, can you just write me a quick email to let me know what you think of this candidate? Or, can we jump on a quick 15-minute Zoom? What you want is for your references to be able to very quickly and succinctly sum up for your potential employer why they should hire you. I think references are important, especially in the latter stages of getting hired as a pre-doc, for many positions.
Moderator: Price and Elena, how did you guys go about seeking recommendations generally for pre-doc applications?
Price: I had the opportunity to do a senior thesis. There was a faculty member who advised all the senior thesis people, and then there was another faculty member who helped me on a specific project. So I got to know those faculty members pretty well and they saw my work. I'd also done internships over the summer. I didn't work directly in econ research, but I had been working at policy think tanks. One of the people I had a good experience with saw my work, so I asked them to be a reference. I just chose people who I knew and had a good experience with. Even if it's not somebody you did research with, it helps if it's someone who's seen your work and can speak to your skills.
Elena: I asked two professors I worked with, one of whom I was a research assistant for. I think it's really important to ask people early so you give them enough time to write a letter, and I think that's more respectful. The worst thing you could do is ask a professor or anyone else you've worked with and say, oh, the deadline's in a week. So just make sure you give them enough time. At least a month is what I went by. But other than that, I agree with what everyone else has said.
Moderator: Another question coming in is regarding how many positions you would think about applying to. Obviously part of this might depend on timing: if you apply first in the fall cycle, you can apply for more in the spring cycle. Price and Elena, how did you approach that? And Rebecca and Thomas, what would you recommend?
Elena: I can go first. I applied early in the fall cycle and I only applied to five, which, I don't know if that's too few these days. But once you start getting involved in the process and have a better sense of whether it's going well or not, I guess that could influence your decision. I applied to five, I got two offers, and then I chose and I was done. So I didn't feel like I had to apply in the spring because I was happy with my offers. I don't know what Price did or what others would recommend, but that's what happened for me.
Price: I applied to a fair number. I don't know the exact number off the top of my head, but it's probably at least 15. I know it seems like a lot, and maybe I didn't approach this the right way, but I treated each of the first few applications not only as an expression of interest in the job but as a kind of preparation to do more. Once you get a few done, you know what the cover letter should look like, what your resume should look like, maybe what type of coding samples they want, and you can judge: okay, am I getting moved on to the next stage or not? You can judge your success and go from there. I just monitored those resources I talked about, and if I saw something that was interesting, I applied. Once you know what you need to send, you can start applying to more.
Thomas: It's interesting for me to hear Price's and Elena's perspective on that, because we look at this from the other side. I can't suggest an ideal number, but I can strongly suggest applying to more than one opening. Do not put all your eggs in one basket. Occasionally we come across a situation like that, and of course I find out afterwards: somebody accepted the job, and after he or she starts, we talk about it and it turns out that was the only place she applied or he applied. I find that surprising, but it happens probably more often than you would think. Choose the number that works for you.
Rebecca: I totally agree with what the rest of you just said. The reality is that there are a lot of people applying to pre-doc positions, and in our program we probably get about 500 applications for about 30 open positions, just to give you a sense of numbers. That is in no way to scare you off or deter you. I enthusiastically encourage you to apply to pre-doc positions, and many of them, because you're just not gonna know ex ante exactly what's gonna hit in terms of which potential supervisors are gonna be interested in you. And also, through the interview process you're gonna realize which positions are actually a good fit for you. Maybe something that was only mildly interesting to you when you read the position description turns out, when you're in the interview, to be work you're very interested in and a project you would really like to work on.
Rebecca: I would say your mental health is very important, and I think most of you on this call are in school or have jobs. It's very important for you to continue to do well in school and, hopefully, to have free time in your life. Of course you need to think carefully about how much time you're going to allocate to applying to pre-doc programs or other jobs. To the extent you have bandwidth to apply broadly, I would encourage you to do so, which will increase your chance of successfully getting a position, and getting a position that's going to be a really good fit for you, where you're going to thrive.
Moderator: It looks like we're running up on time now. I want to give a huge thank you to all four of you, Rebecca, Thomas, Price, and Elena. You've given some great advice on preparing for and applying to pre-docs. And with that we're going to get ready to hand things off to Rashmi for the next panel. Again, a huge thank you.
Data Task Review
Matthew: All right. Hello everyone. I'm Matthew. I am an RA at the Chicago Fed and I'll be going over the data task today. The data task starts with some resource-level output data from the Electric Reliability Council of Texas (ERCOT). The first thing that should be done is to import the ERCOT resource output CSV. There are five variables in here: the QSE, the timestamp, net output, resource status, and resource name. The first question asks how many unique values the variable resource name takes in the data, and the same for the variable QSE. In order to do that, I'll be using the Stata package unique. This is a very simple package that will allow us to answer all of questions one through three. Quite simply, we can use the command unique resource name. Yes, I can try and zoom in. Is that better? Right. You can use unique resource name and unique QSE, and we see that there are 1,121 unique resource names and 194 unique QSEs. I think in order to facilitate the rest of this I should share my entire window. Give me one second.
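For readers following along, the import and the two counts described here might look like the following in a Stata do-file. This is a sketch, not the presenter's actual code: the file name ercot_resource_output.csv and the variable names resource_name and qse are assumptions, and unique is a user-written command that has to be installed from SSC before first use.

```stata
* Sketch only: the file name and variable names are assumed, not taken from the task files.
ssc install unique                      // user-written command from SSC; needed once

* import delimited lowercases variable names by default
import delimited "ercot_resource_output.csv", varnames(1) clear
describe                                // confirm the five variables came in as expected

unique resource_name                    // number of distinct resource names
unique qse                              // number of distinct QSEs
```

Running describe right after the import is a cheap way to catch a header row that was read as data or a column that came in with an unexpected name.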
Matthew: Right. This should make things easier. Question number two asks: what is a QSE? Do a quick online search for this acronym and provide a brief definition for QSE as used in the ERCOT market for electricity. I think the most straightforward thing to do here is just a simple Google search. It came up as the first hit for me, right here: a QSE, or qualified scheduling entity, submits bids and offers on behalf of resource entities or load serving entities such as retail electric providers. So my understanding is that there's this market in Texas for electricity and these QSEs bid on behalf of resource providers.
Matthew: Now moving on to question number three: find the set of unique QSE and resource name pairs. Just like we did for question one, we can simply use unique resource name QSE and get the number of unique pairs. There are 1,127. Since that is more than the 1,121 unique resource names, it indicates that some resource names are paired with more than one QSE in the data. And since there are only 194 QSEs, it also means that a QSE generally corresponds to multiple resource names, so a QSE typically serves multiple different resource entities. To answer parts A and B, we can use the unique package again: unique resource name by QSE, and then generate a new variable called num resources. This will give us, for each QSE, the number of resource names that it's paired with.
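The pair-counting logic can be tried on a small made-up dataset before running it on the real file. The sketch below uses hypothetical QSE and resource codes and assumed variable names (qse, resource_name, num_resources, num_qses); it relies on the by() and generate() options of the SSC unique command, which, per its help file, store the count on the first record of each by-group and leave the rest missing.

```stata
* Toy data (hypothetical values) so the block runs on its own in a do-file.
clear
input str2 qse str2 resource_name
"QA" "R1"
"QA" "R1"
"QA" "R2"
"QB" "R3"
"QC" "R3"
"QB" "R4"
end

unique resource_name          // 4 distinct resource names
unique qse resource_name      // 5 distinct pairs: R3 appears under both QB and QC

* Part A: number of resource names per QSE, one value per QSE.
quietly unique resource_name, by(qse) generate(num_resources)
tabulate num_resources

* Part B: the same idea with the roles reversed.
quietly unique qse, by(resource_name) generate(num_qses)
tabulate num_qses
```

More pairs than resource names is the signal that at least one resource is listed under more than one QSE, which is the same comparison made with 1,127 pairs against 1,121 names.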
Matthew: We can tabulate this new variable, number of resources. We can see that there are 64 QSEs that are paired with only one resource. We have these 10 right here that are paired with from 24 to 173 resource names. These are the 10 largest QSEs in terms of the number of unique resource names that they're paired with. For part B, we're going to do what we did before, but just the reverse: by resource name. And we're going to generate a new variable, num QSEs. I'm doing this quietly because otherwise it would put a bunch of very non-helpful and annoying spam on the screen that I don't think anyone really wants to see. So now we can tab, and we see that the vast majority of the resources are paired with exactly one QSE and six are paired with two. Why might a single resource name pair with multiple QSEs in the data? It gives us a hint to look at how the pairs change over time. So we can simply sort, and now we can browse if the number of QSEs for a given resource is greater than one. This will only get these six resources with multiple QSEs. Now on the side of my screen, we've opened up the browse window and if we scroll down…
Matthew: So this is one of the resource names that is paired with multiple QSEs, and at some point in this data, they switch QSEs, right here. So they switched at some point in the data, and basically this is some sort of a market and they decided at this point in time that it would be more efficient for them to work with a different QSE. And just as good practice, save this output. Now on to question number four: how many unique non-missing values does resource type take? Using this new resource type CSV file, we can import it. I cleared the previous set of data, so there are only two variables here. We have resource name and resource type. We can use the codebook command on resource type and this will give us some examples. And it says that there are four missing values. We can browse…
Matthew: Resource type. This is going to let us just look at the four resource names that don't have a resource type. We see two solar companies and two wind companies. So it should be fairly straightforward to guess the corresponding resource type for each of these four resource names. But there is one thing we need to be a little clever about. There isn't an exact solar value. If we tab resource type, we see that solar is not listed as solar. In fact this PVGR, photovoltaic generation resource, is the solar value. You can find it right here, wind and solar and PVGR. So what we're going to do is replace, for Galloway Solar and Roseland Solar, the PVGR resource type, and for suite w N two and Aspero, the wind resource type.
Matthew: And we will save our data.
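A sketch of the question four steps in a do-file is below. The file names, the variable names, and the four resource names are placeholders rather than values from the task, and it assumes resource_type is read in as a string with wind coded as "WIND"; only the PVGR code for solar comes from the walkthrough.

```stata
* Sketch only: file name, variable names, and the four resource names are placeholders.
import delimited "ercot_resource_types.csv", varnames(1) clear

codebook resource_type                          // reports distinct values and the missing count
list resource_name if missing(resource_type)    // or: browse if missing(resource_type)
tabulate resource_type                          // shows how solar and wind are actually coded

* Fill the gaps by hand, using codes that already exist in the column.
replace resource_type = "PVGR" if inlist(resource_name, "SOLAR_NAME_1", "SOLAR_NAME_2")
replace resource_type = "WIND" if inlist(resource_name, "WIND_NAME_1", "WIND_NAME_2")

assert !missing(resource_type)                  // stops the do-file if anything is still blank
save "ercot_resource_types.dta", replace
```

With the placeholder names left in, the replace lines change nothing and the assert stops the run, which is the intended behavior: it forces the real resource names to be filled in before the file is saved.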
Matthew: For question number five, we are going to go back to the resource output data and merge on the resource types file we were just using, with respect to the variable resource name. We know that there are duplicates with respect to resource name in this data, and there are not in the other data. So this is a many-to-one merge. We will merge m:1 resource name using ERCOT resource types, which is the file we just saved. Unsurprisingly, everything matches correctly, and we drop the merge variable. So now we need to generate this fuel type variable using the definitions provided to us. I'm not going to type all of this out, I'm just going to copy-paste this code over. To make things a little less ugly, I'm using the inlist function. Then we tab fuel type. We have a large number of solar, wind, and natural gas providers, a small number of other providers and coal providers, and a very small fraction of nuclear providers.
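The merge and the recode might be sketched as follows. File names are assumed, and the task's actual fuel type definitions are not reproduced in the walkthrough, so every code other than PVGR and WIND below is a hypothetical stand-in that only illustrates the inlist() pattern.

```stata
* Sketch only: file names and all codes other than PVGR and WIND are hypothetical.
use "ercot_resource_output.dta", clear

* Many output rows per resource_name here, one row per resource_name in the types file.
merge m:1 resource_name using "ercot_resource_types.dta"
assert _merge == 3                      // every row should match in both directions
drop _merge

generate str12 fuel_type = "Other"
replace fuel_type = "Solar" if resource_type == "PVGR"
replace fuel_type = "Wind"  if resource_type == "WIND"
replace fuel_type = "Natural gas" if inlist(resource_type, "GAS_CODE_1", "GAS_CODE_2")

tabulate fuel_type
```

Writing m:1 rather than m:m makes Stata stop with an error if resource_name turns out not to be unique in the types file, so the merge doubles as a check on that assumption.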
Matthew: The next part of this exercise uses Stata's time series functionality. This is something that you kind of just have to put in the metaphorical blood, sweat, and tears to learn. It's not hugely intuitive when you're first starting out, but it's very easy to work with once you figure things out. So if you have any questions, please use the Q&A. This part is maybe a little more tricky than the previous parts. What we're going to be doing is plotting output summed by day, output summed by hour of day, and output summed by hour of day cross fuel type. In order to do that, we need to manipulate the date variable that is in our data right now, this SCED timestamp. What we're ultimately going to want is a variable that is just the day. Then we want another variable that is the day plus hour.
Matthew: First I'm going to generate this Stata date variable, and I'm going to use the date function with the SCED timestamp as input. Then I tell Stata the format of this date string variable. In this case, it's month, day, year. And these two pound signs, hashtags, tell Stata to ignore anything that comes after month, day, year. Right now we have these numbers, which are the number of days since January 1st, 1960. This is the way that Stata calculates its dates and times, with respect to its beginning of time in 1960. To make this legible to a human being as opposed to a computer, we'll use the %td format, and now we can see that this is actually readable by a human being. The next thing we want to do is to be able to sum this with respect to hour, and as you can see, we have observations every 15 minutes. So just to fix ideas…
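As a rough sketch of the string-to-date conversion just described: the variable names and the exact timestamp layout (month/day/year followed by hours and minutes) are assumptions, chosen so that two `#` placeholders in the mask skip the two time elements.

```stata
* Hypothetical 15-minute timestamps stored as strings
clear
input str16 sced_timestamp
"01/22/2023 00:15"
"01/22/2023 01:30"
"02/05/2023 13:45"
end

* "MDY" reads month, day, year; each "#" skips one further element
gen date_stata = date(sced_timestamp, "MDY##")
list                          // raw values: days since 01jan1960

format date_stata %td         // display as a readable date
list
```

The format only changes how the number is displayed; the stored value is still the day count, which is what makes later arithmetic on dates straightforward.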
Matthew: See, we have observations every 15 minutes. We want to do it every hour. So what we're going to do is take a substring of the SCED timestamp, and we're going to take the first 13 characters: ten for the date, one for the space, and two for the hour. To do that we use the substr function, starting at position one and taking the first 13 characters.
Matthew: This is going to give us our day plus hour in a single neat string. Now we are going to generate a second version of the timestamp that is this substring with colon zero zero afterwards. We can simply add these like we would add normal variables; Stata allows you to do that. So we generate the second timestamp as the current substring plus colon zero zero. Now we have everything, quote unquote, rounded to the hour, and we are going to generate this date hour variable. Very importantly, for most variables you don't need to specify what type of variable it is. This time we do need to generate it as a double, and I'll show why that's important in a second. We're going to use the clock function. Similar to before, we use month, day, year, space, hours, minutes to tell Stata how to read this string. And you'll see, again, very large numbers. I think this is the number of hours or the number of seconds since 1960. What we will do is format that date, and we'll use the %tc format. This gets us our date plus our time. To show you why this double is important, I'll remove it, I'll call this new variable date hour Stata bad, and we will format date hour Stata bad.
Matthew: As you can see, this isn't exactly what we want. There's something that Stata does wrong when you don't specify double: it ends up adding seconds where seconds don't need to be added. For example, right here, we want this to be 1:00 AM; instead it's 12:59 and 56 seconds. I don't know exactly why this happens. For future reference, I put underscore bad there. So we'll drop this date hour Stata bad variable. One last thing we need to do is extract the hour from this date hour Stata variable. We can do that very easily using the hh function. You can do the same with day, year, month, minutes, and, if for whatever reason you need them, seconds as well.
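The truncate-to-the-hour steps can be sketched as below, with hypothetical variable names and toy timestamps. On the `double` point: Stata's `%tc` values count milliseconds since 01jan1960, which for 2023 are on the order of 10^12, and the default `float` storage type cannot hold numbers that large exactly. That rounding is the likely source of the stray seconds.

```stata
clear
input str16 sced_timestamp
"01/22/2023 00:15"
"01/22/2023 01:30"
"02/05/2023 13:45"
end

* Keep the date, the space, and the two-digit hour: 13 characters
gen hour_str = substr(sced_timestamp, 1, 13)
* String concatenation with "+": append ":00" to mark the top of the hour
gen timestamp2 = hour_str + ":00"

* Store as double so the millisecond count is held exactly
gen double datehour_stata = clock(timestamp2, "MDY hm")
format datehour_stata %tc

* Same call without double: stored as float, so the value is rounded
gen datehour_stata_bad = clock(timestamp2, "MDY hm")
format datehour_stata_bad %tc
list datehour_stata datehour_stata_bad
drop datehour_stata_bad

* Pull the hour of day (0 to 23) out of the datetime
gen hour = hh(datehour_stata)
list sced_timestamp datehour_stata hour
```

Stata raises no warning when a datetime is stored as `float`, so listing the formatted values is the practical way to catch the problem.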
Matthew: As you can see, we have hours. This is military time, from zero to 23. Before we get on to the bulk of question six, are there any questions about how I manipulated the string to get Stata-formatted dates and times? If not, I will keep on going. What we need to do next is sum the output by day and plot it. The easiest way to do that, in my opinion, is to preserve the data. This locks a version of the data in place as it looks right now. Now we can use the collapse command, telling Stata to sum, instead of take a mean of, this telemetered net output by date Stata. So what this says is to sum the telemetered net output variable across date. This gets us the total amount of output across all resource types and QSEs each day.
Matthew: As you can see, if we bring the data editor back, we have a nice collection of 32 different days, and we have the total output, I guess in watts. To plot it, we can simply use twoway line. I'm just going to copy and paste it, and we have it right here. There isn't an obvious upward or downward trend here. There's obviously a lot of cyclicality. It looks like there's a massive amount of output used right here. Someone can correct me if I am wrong, but I think there was some sort of snowstorm or cold snap here, which caused the electricity output to be much higher than in this week. But there isn't an obvious upward or downward trend beyond that.
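A minimal sketch of the preserve, collapse, plot, restore pattern, using simulated data; the variable names and the export file name are hypothetical.

```stata
* Toy data: 3 days of 15-minute readings (96 per day)
clear
set obs 288
gen date_stata = mdy(1, 22, 2023) + floor((_n - 1) / 96)
format date_stata %td
gen telemetered_net_output = 100 + mod(_n, 7)

preserve                                   // snapshot the full data
collapse (sum) telemetered_net_output, by(date_stata)
list                                       // one row per day, summed output

twoway line telemetered_net_output date_stata
graph export "output_by_day.png", replace  // hypothetical file name
restore                                    // back to the 15-minute data
```

Without `(sum)`, `collapse` would compute means, so the statistic has to be named explicitly here.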
Matthew: Now we can save it using the graph export command, because I think we will need it later. Okay. And we can restore. Question 6b: we're going to take the output summed by hour of day. That basically means looking at the patterns across time of day. What we'll do is preserve and, similar to before, collapse, this time with respect to hour instead of with respect to day. If we go into the data editor, we have each hour, and we have the total amount of output in that hour across the entire sample.
Matthew: Now we can plot the same idea as before using the twoway command. Somewhat unsurprisingly, we see very little electricity usage during the middle of the night. It peaks in mid to late morning, locally peaks again in the early evening, and then falls off again once people go to sleep and once the sun goes down. Just for consistency, we save and export this graph, and then we can go back to our original data by using restore. 6c, I think, is maybe a little trickier because of how our data is formatted, so we'll spend a little more time there. We're going to preserve, then collapse with respect to two variables this time, hour and fuel type. So for each of the six fuel types right here, natural gas, wind, other, solar, nuclear, and coal, we are going to sum across time of day. And just to make things easier for us later, because it's easier to work with numbers than strings, we're going to encode the fuel type variable, generating a new numeric fuel type variable. For now, we can drop the string fuel type.
Matthew: Now, let's open up our data editor. What we ultimately want to plot is, for each one of these fuel types, the amount of output from hour zero to 23. Right now our data isn't particularly conducive to that, so we're going to have to reshape it. Our data is in what Stata calls a long format, and we need to reshape it wide. Basically we want telemetered net output one for coal, two for natural gas, three for nuclear, four for other, five for solar, six for wind. We're going to want 24 rows instead of 144 rows. In order to do that, we can use Stata's reshape command.
Matthew: Our data is currently long, and we want it wide, so: reshape wide telemetered net output. Now we need our i variable and our j variable. Our i variable is the variable we want to be the row identifier. In this case that's hour. Our j variable is the variable we want to be our column identifier, and that is the numeric fuel type. This is going to give us exactly what we want. Now we have 24 observations, with one column for each type, and we can label them. We're labeling one as coal, two as natural gas, three as nuclear, four as other, five as solar, and six as wind. Now we can plot these very easily. With the twoway command, if you include, in parentheses, type of plot, y variable, x variable, you can include as many different lines as you want on a single plot. I'm just going to copy this over and then I will discuss.
Matthew: Now we have output by fuel type and time of day, and we have some identifiers to make it easier to read for people who maybe print it out and aren't using a color printer. We have a very interesting pattern with solar. It peaks when the sun's out and then falls to basically nothing when the sun's not out. Natural gas and, somewhat surprisingly, wind have a lot of production, coal and nuclear are somewhere in the middle, and other has very little production. Just to look at the code very briefly, I used the twoway command with line, y variable, x variable, and then I told it to recast it as connected. We have all these dots and they're connected, and I told it which symbol to use to make it a little more aesthetically pleasing or easy to read. We'll save it again and restore.
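The long-to-wide reshape and the multi-line plot might be sketched like this, on simulated data. The variable names (`hour`, `fuel_type_num`, `telemetered_net_output`) are assumptions, and only three of the six series are plotted to keep the example short.

```stata
* Toy long data: 6 fuel types x 24 hours = 144 rows
clear
set obs 144
gen hour = mod(_n - 1, 24)
gen fuel_type_num = ceil(_n / 24)
gen telemetered_net_output = 10 * fuel_type_num + hour

* i() identifies rows; j() supplies the suffix of each new column
reshape wide telemetered_net_output, i(hour) j(fuel_type_num)

label variable telemetered_net_output1 "Coal"
label variable telemetered_net_output2 "Natural gas"
label variable telemetered_net_output3 "Nuclear"
label variable telemetered_net_output4 "Other"
label variable telemetered_net_output5 "Solar"
label variable telemetered_net_output6 "Wind"

* One parenthesized plot per series; markers help on black-and-white printouts
twoway (line telemetered_net_output1 hour, recast(connected) msymbol(O)) ///
       (line telemetered_net_output5 hour, recast(connected) msymbol(T)) ///
       (line telemetered_net_output6 hour, recast(connected) msymbol(S)), ///
       xtitle("Hour of day") ytitle("Total output")
```

After the reshape the data has 24 rows, one per hour, and six output columns whose variable labels feed the legend.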
Matthew: Now we are on to question number seven. Looking at the plot from 6a, does this data look stationary? Using the data summed at the daily level, test for a unit root and interpret the result. Calculate its first difference and plot it. Does that look stationary? So we'll open up the graph from part 6a. Stationary: is there any sort of consistent trend across the data? Is it going up? Is it going down? Not really. There is a lot of seasonality here. On weekends it looks like it peaks, and there's a trough afterwards.
Matthew: To test for a unit root, we can use the Dickey-Fuller test. In order to use that, we need to once again use preserve and restore and collapse with respect to date. Now we are back to this data, and since we're actually going to be doing things in the time series dimension beyond plotting, we can use the tsset command with date Stata, and it automatically recognizes that there is a one-day gap between each of the observations. The Dickey-Fuller test is the standard way of testing whether or not there's a unit root.
Matthew: We can see that we fail to reject the null hypothesis that there's a unit root. There is reason to be concerned that we have a unit root in this data, and therefore we can't run our standard autoregressive or moving average models without being concerned that there will be issues with the standard errors and/or coefficient estimates. The traditional way to fix this problem is to take a first difference. Stata is very good at this. There are other ways to do it using _n and _n minus one, but that's a lot uglier than just using this D1, or first difference, operator with respect to telemetered net output.
Matthew: As you can see, there's nothing here at time one, but there is a difference for every other period in time. We can run the Dickey-Fuller test one more time and see that at the 5% level, and almost at the 1% level, we can reject the null hypothesis that there's a unit root. That means we can start to do our more serious econometric analysis. To plot it, we're going to use the tsline command. Since we have time series data, it's a little simpler than twoway line. We only have to give it the y variable; Stata knows that the date is the x variable.
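The unit root workflow can be sketched on a simulated random walk, where a unit root is present by construction. The variable names and the simulated series are illustrative only; test results on the real ERCOT data will differ.

```stata
* Toy daily series: a random walk, so the level has a unit root
clear
set seed 12345
set obs 32
gen date_stata = mdy(1, 22, 2023) + _n - 1
format date_stata %td
gen telemetered_net_output = 1000 + sum(rnormal(0, 50))

tsset date_stata, daily              // declare the time variable
dfuller telemetered_net_output       // null hypothesis: unit root

* D1. is the first-difference operator; period one has no predecessor
gen d_output = D1.telemetered_net_output
list date_stata telemetered_net_output d_output in 1/3

dfuller d_output                     // re-test the differenced series
tsline d_output                      // time variable is taken from tsset
```

The time-series operators only work after `tsset`, which is also what lets `tsline` find the x variable on its own.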
Matthew: If we compare, this looks stationary and this doesn't. So this data is much safer to work with without being worried about standard errors. We'll export it and then restore. Okay, so now we want to work with the data at the hourly level. That means our unit of observation is going to be a day plus hour, so each hour in our data is going to be observed once. This is why we generated that date hour variable. We preserve, and we're going to collapse and sum with respect to this date hour Stata variable that we generated before.
Matthew: So we have one observation for each hour of each day, and we have the total amount of output. We're going to use the tsset command again, this time telling it that there is a change of one hour, just to make sure nothing weird happens. The first thing we should probably do is plot the data just to look at it. Just good practice. This data looks mostly stationary. It doesn't look like there are any serious trends. It does look like an autoregressive model would be a good fit. To test this more formally, we can use something called a partial autocorrelation function, and very nicely Stata can generate that for us, using the pac command: pac telemetered net output. The partial autocorrelation function tells us how many lags we want to include in an AR model. As we can see, we have one, two, three lags that are statistically significant at the 5% level, which means that we definitely want to include these lags. Furthermore, we see some additional significant lags at 12 hours and 24 hours, which indicates there might be some amount of seasonality that we want to correct for or incorporate into our model by using some sort of seasonal autoregressive model.
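A sketch of declaring hourly data and drawing the partial autocorrelation function, on a simulated AR(1) series. The names and the simulated process are assumptions, so the significant lags will not match those in the walkthrough.

```stata
* Toy hourly series: 10 days of a simulated AR(1) process
clear
set seed 2023
set obs 240
gen double datehour_stata = clock("01/22/2023 00:00", "MDY hm") ///
    + (_n - 1) * msofhours(1)
format datehour_stata %tc
gen telemetered_net_output = rnormal() in 1
replace telemetered_net_output = ///
    0.7 * telemetered_net_output[_n - 1] + rnormal() in 2/l

* %tc values are in milliseconds, so state that one period is one hour
tsset datehour_stata, delta(1 hour)

tsline telemetered_net_output    // look at the series first
pac telemetered_net_output       // spikes outside the band suggest AR lags
dfuller telemetered_net_output   // null hypothesis: unit root
```

Without the `delta()` option, Stata would treat consecutive hours as 3.6 million periods apart, and lag operators would return missing values.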
Matthew: Just as good practice, I'll export this graph as well, and we will run the Dickey-Fuller test. Once again, we can reject the hypothesis that there's a unit root at the 5% level. So now we're going to run this AR(3) model in Stata. In order to output this in a neat way, without having to copy and paste or manually type coefficients into a table that we make ourselves, we're going to use the estout package. This is a pretty standard and very helpful package for saving coefficient estimates in Stata. As good practice, we clear any currently stored estimations. Then we store the output of the following regression: regress output on the first three lags of telemetered net output. Again, we could generate separate variables, but Stata is able to interpret this syntax: L meaning lag, one slash three meaning lags one to three. If we only wanted specific lags, we would write one space three. And we use robust standard errors, as is standard.
Matthew: What we see, unsurprisingly given what we saw in our partial autocorrelation function, is statistically significant coefficients at the epsilon level. We also have a constant. We could demean this data before running this analysis and we would have no constant. As you can see, we store estimate one, and now we are going to use esttab to output this to a LaTeX table. So esttab using this file name, and we're including standard errors, including R squared, we're doing it in LaTeX, and we have the normal single, double, and triple stars for significance at the 10%, 5%, and 1% levels, respectively. This is going to get written to a LaTeX file. We can copy this into any sort of TeX file. When an economist you're working with asks for your regression output, rather than copying it coefficient by coefficient into LaTeX or Word or Excel, we can use esttab instead. All right, I also include in this do-file, which I'm fine with sharing as long as the other hosts are fine with me sharing it, an ARIMA process to possibly account for this seasonality. I'm not going to run it right now. It takes about 30 seconds to run.
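The estimate-store-export steps might look roughly like the sketch below. `eststo` and `esttab` come from the user-written estout package, which needs to be installed once with `ssc install estout`. The data are simulated, and the variable names and output file name are hypothetical.

```stata
* Requires the user-written estout package:  ssc install estout
clear
set seed 2023
set obs 240
gen double datehour_stata = clock("01/22/2023 00:00", "MDY hm") ///
    + (_n - 1) * msofhours(1)
format datehour_stata %tc
gen telemetered_net_output = rnormal() in 1
replace telemetered_net_output = ///
    0.7 * telemetered_net_output[_n - 1] + rnormal() in 2/l
tsset datehour_stata, delta(1 hour)

eststo clear                         // drop previously stored estimates

* L(1/3). expands to lags 1, 2, and 3; L(1 3). would be lags 1 and 3 only
eststo: regress telemetered_net_output ///
    L(1/3).telemetered_net_output, vce(robust)

* Write a LaTeX table with standard errors, R-squared, and 10/5/1% stars
esttab using "ar3_output.tex", se r2 tex replace ///
    star(* 0.10 ** 0.05 *** 0.01)
```

The resulting `.tex` file can be pulled into a paper with `\input`, so the table updates whenever the regression is rerun.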
Matthew: It's not necessary for this task, but this model does account for the seasonality. So, moving on to question number nine, the final question. We're going to restore, and we get back to our master data set. Run the following dummy variable regressions and interpret the coefficients: regress output on a set of indicator variables for each fuel type, each day of the week, and each week in the data. And it asks us to discuss a little bit about the coefficients that we find. My interpretation of this is to not sum over anything and just run these regressions with the data as is. Since these models basically have only intercept terms for each dummy variable in our data, the coefficients we get out are basically just going to give the mean of each category.
Matthew: Once again, I'm going to encode this fuel type, generating a numeric fuel type variable. This is going to let us generate dummy variables when we run our regression. Next I'm going to use the dow function to generate the day of week. Another way to do this would be to use some sort of modular arithmetic, where you divide the number of days by seven and take the remainder, and that's an indicator for the day of the week. Stata can do this itself using the date Stata variable generated earlier, and we do the same thing with week. So again, this is why we use the Stata date-time functionality instead of using our own rougher dates and times. Syntactically it was a little annoying earlier, but it makes manipulating these dates and times later on much easier and is generally worth the trade-off. If we look at our data very quickly, zero is Sunday, one is Monday, et cetera, all the way through six, which is Saturday. Week is the week of the year. If we tab week, we have weeks 4, 5, 6, 7, and 8 in the year 2023.
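A small sketch of the calendar functions, with the hand-rolled modular arithmetic alongside for comparison. The variable names are hypothetical. Because 01jan1960 (day zero in Stata) was a Friday, the offset in the manual version is 5.

```stata
clear
input str10 d
"01/22/2023"
"01/23/2023"
"02/19/2023"
end
gen date_stata = date(d, "MDY")
format date_stata %td

gen day_of_week = dow(date_stata)    // 0 = Sunday, ..., 6 = Saturday
gen week = week(date_stata)          // week of the year, starting at 1

* Same day-of-week by hand: day 0 (01jan1960) was a Friday, i.e. 5
gen dow_by_hand = mod(date_stata + 5, 7)

list
```

January 22, 2023 is a Sunday in week 4 of the year, and February 19 falls in week 8, which lines up with the weeks 4 through 8 mentioned above.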
Matthew: Right. So I'm going to use a for loop to generate these last three regressions. Let me move this and make it a little bigger so I can discuss.
Matthew: All right, can everyone see this loop? Okay, so for each variable in this list of variables to indicate, so fuel type, day of week, and week, we're going to clear our stored regressions and regress output on i. each one of these three variables. The i. prefix means we generate an indicator variable for n minus one of the n realizations of day of week, week, and fuel type, in order to avoid the dummy variable trap, as is standard. Again, we could generate these ourselves, but it's unnecessary. Stata can do it for us, so why not take advantage of that? Then we're going to use esttab to put it in a LaTeX file, to make putting it somewhere else later on a little easier, and we'll just...
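A loop of this kind might be sketched as follows, on simulated data. The variable names and output file names are hypothetical, and `eststo`/`esttab` again require the user-written estout package.

```stata
* Requires the user-written estout package:  ssc install estout
clear
set seed 1
set obs 600
gen fuel_type_num = ceil(6 * runiform())
gen date_stata = mdy(1, 22, 2023) + floor(32 * runiform())
gen day_of_week = dow(date_stata)
gen week = week(date_stata)
gen telemetered_net_output = 10 * fuel_type_num + rnormal()

foreach var of varlist fuel_type_num day_of_week week {
    eststo clear
    * i.`var' creates indicators for all but the base category
    eststo: regress telemetered_net_output i.`var'
    esttab using "dummy_`var'.tex", se r2 tex replace
}
```

Each pass writes its own table, so the three regressions end up in three separate `.tex` files.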
Matthew: Okay. As we can see, we have natural gas, nuclear, other, solar, wind, and the constant. That means coal is our baseline. So this constant right here should be the mean of coal, and to verify, we can run sum on telemetered net output, restricting to that fuel type.
Matthew: And 223, 223. So each of these coefficients is therefore the mean relative to the mean of coal. That means the mean of wind is around 40, the mean of solar around 12, the mean of other around two, the mean of nuclear around 600, and the mean of natural gas around 70. This might be a little bit surprising if we compare to the graph we made earlier. Nuclear had a medium amount of output there, and here it has the greatest mean. What that indicates is that there are very few nuclear providers, and each provides a very large amount of electricity to Texas. Taking something like natural gas, there is a very large number of providers, and they each provide a relatively small amount of electricity.
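The claim that the constant equals the mean of the base category, and that each coefficient is a difference from that mean, can be checked on toy data. The names and values below are illustrative, not the ERCOT figures.

```stata
clear
set seed 7
set obs 300
gen fuel_type_num = 1 + mod(_n, 3)
gen telemetered_net_output = 100 * fuel_type_num + rnormal()

regress telemetered_net_output i.fuel_type_num

* The constant should match the mean of the omitted category (1)
summarize telemetered_net_output if fuel_type_num == 1
display "constant: " _b[_cons] "   base-category mean: " r(mean)

* Mean of category 2 = constant + its coefficient
display "implied mean of category 2: " _b[_cons] + _b[2.fuel_type_num]
```

To recover the mean of a non-base category, add its coefficient to the constant rather than reading the coefficient alone.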
Matthew: So, going on to day of week: zero is Sunday, so this should be the mean on Sunday, and we see significantly more output on non-Sunday days of the week. That's not hugely surprising. Maybe some of them take Sundays off. Some of them may produce less electricity because people are resting and businesses aren't open. And we see the largest increase during the middle of the week. Again, not hugely shocking; more people are in the office those days. And finally we have week. This is week four, and these are the weeks of January 22nd, 29th, February 5th, 12th, and 19th. What we see is more output used in this week five. This is the week I was mentioning earlier, where I think there was some sort of cold snap. So this is week 1, 2, 3, 4, 5. Again, with some sort of cold snap here, the mean went up pretty significantly. And we have a massive number of observations, and we're just calculating means, so every variable, kind of by definition, is going to be significant at the epsilon level. I think that is all I have. Are there any questions in the chat about anything I did that I can maybe elaborate on, or any general questions about this data task that I maybe didn't cover?
Matthew: Oh yeah, yes. So we'll go back to this collapse function, which I used quite a few times. It's something we can use to generate summary statistics. When we have this very long dataset with lots of observations, it allows us to very easily get summary statistics with respect to a certain categorical variable. In this case, we want a sum of this output variable with respect to, let's say, hour and fuel type. We could very easily not do that and instead use the default, which is mean.
Matthew: And so this should be the mean of each one of these outputs with respect to hour. This is something that's very useful for generating summary statistics. There are lots of other statistics we can use besides sum and the default, which is mean. I assume you can use standard deviation as well, and max and min. So it's very useful for summary statistics in an environment where we have lots of observations and we maybe want to do some sort of analysis with the collapsed data, because otherwise we could just use sum on the variable if some x, y, z condition is met. But if we want to do it with respect to a lot of different categories, and maybe do some time series analysis afterwards, it's best to use collapse. What resources would I recommend to learn or prepare for data cleaning and data tasks? I think the best thing you can do is go to a lot of workshops like these, where you have the opportunity to do data tasks. Hopefully you'll do some work in your econometrics or statistics classes using Stata and/or R. David, if you want to also chime in.
Matthew: Sure. Yeah, I mean, a lot of your classes should teach you how to do some of this through your assignments. There are sessions like these that are very helpful. Once you get to a level where you're more comfortable with basic things like what was done here, you can maybe try to work on replicating simple papers that you're interested in that have good replication packages. I think the AEA mandates replication packages for most of their papers. So if there's a relatively simple paper that you're interested in, you can work to replicate their data, try to rewrite their code, make sure their analysis is sound, etc. Hope that helps.