The birth of SpongeGuyParkFeld

I joined my college's data science club this quarter.  It's very fun so far, though I haven't done much more than pick a project and join a team.

My team's project (well, we're one big group but split into two teams for easier management) is natural-language processing.  Specifically, sentiment analysis of transcripts of TV shows.  The goal is to map character relationships, vocabulary shifts over time or due to new writers, and vocabulary richness based on target audiences.  And stuff like that.  We're planning on refining the goals more as we progress in the project, but that's the general direction.
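To make "vocabulary richness" a bit more concrete, here is a minimal sketch of one common measure, the type-token ratio (unique words divided by total words).  This is only an assumption about what we might end up using, not a metric the team has settled on.  The function name and the simple regex tokenizer are made up for illustration.

```python
import re


def type_token_ratio(text):
    """Return unique words / total words for a piece of text.

    Hypothetical helper: the tokenizer is a deliberately simple
    assumption (lowercase letters and apostrophes only).
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        # Avoid dividing by zero on empty or non-alphabetic input.
        return 0.0
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    print(type_token_ratio("the cat and the hat"))
```

One caveat: the ratio drops as texts get longer, so comparing shows fairly would probably mean sampling equal-sized chunks of dialogue.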

The shows we have chosen to analyze are Spongebob, Family Guy, South Park, and Seinfeld.  We sort of mashed all the names together to create a single word to represent our project.  Hence, the birth of SpongeGuyParkFeld!
It's a memorable name, and will definitely raise a few eyebrows.  Hopefully in a good way.

I'm excited to do some 'real' computational linguistics.  I've used Python and R before, as well as Beautiful Soup to scrape things, but not in this capacity, where it's mostly self-directed and open-ended.  And automating the scraping of an entire website rather than a single page will probably require some techniques I've never used before.

Another thing that's new to me is using git with multiple people.  I have my own GitHub, use it semi-regularly, and I'm comfortable with managing it with Sourcetree.  But I also mess up commits and organization.  So I hope that I won't mess up my entire group's project somehow.

I think I'll have to learn as I go, and ask a lot of questions.  Only two people in my group have used GitHub with multiple contributors before, though, so I'll be learning along with everyone else.  Hopefully they don't mess it up either!

The first part of the project is just going to be web scraping and cleaning datasets.  The transcripts we've picked for the target shows are the best ones we can find, but there are still some problems and inconsistencies with them.
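As a rough idea of what the cleaning step could look like, here is a small sketch that turns raw transcript text into (speaker, line) pairs.  It assumes a "SPEAKER: dialogue" layout with stage directions in brackets or parentheses, which is a guess on my part; the real transcripts may be formatted differently, and the function and pattern names are hypothetical.

```python
import re

# Assumed layout: "SPEAKER: dialogue", with optional spaces around the colon.
SPEAKER_LINE = re.compile(r"^\s*([A-Za-z][A-Za-z .'-]*?)\s*:\s*(.+)$")
# Assumed stage directions: anything in [brackets] or (parentheses).
STAGE_DIRECTION = re.compile(r"\[[^\]]*\]|\([^)]*\)")


def clean_transcript(raw):
    """Return a list of (speaker, dialogue) tuples from raw transcript text."""
    rows = []
    for line in raw.splitlines():
        # Drop stage directions before looking for a speaker label.
        line = STAGE_DIRECTION.sub("", line)
        match = SPEAKER_LINE.match(line)
        if not match:
            # Skip blank lines, scene headings, and anything without a speaker.
            continue
        # Normalize speaker casing so "JERRY" and "jerry" count as one person.
        speaker = match.group(1).strip().title()
        # Collapse runs of whitespace left behind by the removals.
        dialogue = " ".join(match.group(2).split())
        if dialogue:
            rows.append((speaker, dialogue))
    return rows


if __name__ == "__main__":
    sample = "JERRY: Hello,   Newman.\n[Newman enters]\nnewman : Hello, (smirking) Jerry."
    for speaker, dialogue in clean_transcript(sample):
        print(speaker, "->", dialogue)
```

Normalizing the speaker names up front matters for the relationship mapping later, since inconsistent capitalization would otherwise split one character into several.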

I don't know why I didn't join this club earlier, honestly.  It's great hands-on experience.  The club is also a great way to network with people from more technical majors and with different interests.  My group has a statistics major, two physics majors, a couple of computer science people, and me, a linguistics major.

I'm excited to start working on this project.  Every aspect of it will strengthen my skills and challenge me.  Especially the teamwork aspect of it.  People are sometimes more frustrating than code.
