Coming Out of My Shell

I love the data science and programming techniques I’ve been learning because they feel like magic. There’s a task that I have to do over and over again. I write a few lines of code. I never have to do it again. And everything I learn ends up helping me somewhere, often in ways I can’t anticipate. It’s like learning a language when you’re living abroad. As soon as you know a new word, you hear it everywhere, and a little piece of the world is suddenly unlocked.

Bad Birthdays Are Killing Me!

Every time we do a round of placement tests at USC, we upload the results to our student information system (SIS) as a fixed-width text file. In the past, that file was created manually in Excel, so one of the first things I did when I arrived was create a script to automate that process. It’s definitely made things a lot quicker, but there’s still a problem: bad birthdays.

Aggregating Spanish Placement Exams

This week, I began exploring our backlog of language placement exams. I think the best way to talk about this is to walk you through the process of answering a sample question. For instance, how many students have taken our Spanish exam since we started collecting data?

GitHub Repository

I’ve created a GitHub repository for the scripts that I’m writing for my work in the Language Center. You can access it here. So far, there are two scripts: one for assembling test results into a fixed-width text file, and one for turning a fixed-width text file into a pandas DataFrame.

Importing Fixed-width Text Files

In my previous post, I discussed the process of converting the results from our language placement test into a fixed-width text file that’s compatible with our student information system. But what about going in the other direction? We have years’ worth of data in text files, and they’re ripe for analysis!
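
To give a feel for what "going in the other direction" involves, here is a minimal sketch that slices each line of a fixed-width file at known character positions. The column names and positions in COLUMNS are assumptions made up for illustration; the real layout is whatever the student information system requires.

```python
# Hypothetical layout: (column name, start index, end index), end exclusive.
# These names and positions are illustrative, not the real SIS format.
COLUMNS = [("student_id", 0, 10), ("language", 10, 20), ("score", 20, 23)]


def parse_fixed_width(lines, columns=COLUMNS):
    """Slice each line at fixed character positions and strip the padding."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        records.append(
            {name: line[start:end].strip() for name, start, end in columns}
        )
    return records
```

Passing an open file handle works, since iterating over a file yields its lines. If pandas is available, `pandas.read_fwf` takes a `colspecs` list of (start, end) pairs and returns a DataFrame directly, which is probably the more convenient route for analysis.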

Combining CSV Files with Glob

An important part of my job at the USC Language Center is administering placement tests and making the results available to students, advisors, and other administrators. Several times during the year, students take our tests using Scantron forms, and I end up with several CSV files, one for each of the languages we offer. I then need to make sure that all those results end up in a single, fixed-width text file that’s compatible with the university’s student information system. It’s one of those data management tasks that are perfect for automation with Python.
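
As a rough sketch of the idea, the snippet below gathers every CSV that matches a glob pattern and pads each row out to fixed-width fields. The column names, the widths in LAYOUT, and the convention that each file is named after its language are all assumptions for illustration, not the actual format the SIS expects.

```python
import csv
import glob
import os

# Hypothetical layout: (column name, width in characters).
LAYOUT = [("student_id", 10), ("language", 10), ("score", 3)]


def combine_csv_files(pattern):
    """Read every CSV matching the glob pattern into one list of row dicts."""
    rows = []
    # glob returns paths in arbitrary order, so sort for repeatable output
    for path in sorted(glob.glob(pattern)):
        # Assumed convention: the file name (e.g. spanish.csv) names the language
        language = os.path.splitext(os.path.basename(path))[0]
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                row["language"] = language
                rows.append(row)
    return rows


def to_fixed_width(row, layout=LAYOUT):
    """Truncate or pad each field to its width and join into one line."""
    return "".join(
        str(row[name])[:width].ljust(width) for name, width in layout
    )
```

Calling `combine_csv_files("results/*.csv")` and then `to_fixed_width` on each row produces the lines of the output file; `results/` is a placeholder directory name.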
