2015 AAAS in San Jose

13 Feb 2015

I’m at the 2015 AAAS meeting in San Jose, California. This is definitely not my typical meeting: too big, too broad, and I hardly know anyone here. But here’s a quick (ha ha; yah sure) summary of the meeting so far.

Opening night

Gerald Fink gave the President’s Address last night. He’s the AAAS President, so I guess that’s appropriate. But after five minutes of really lame simplistic crap (for example, he said something like, “A single picture can destroy our known understanding of the universe,” like innovation and improving our understanding is a bad thing), I left.

Oh, and before that: the emcee of the evening, who introduced Janet Napolitano, totally couldn’t pronounce her last name. (Her remarks, particularly her comments in support of public universities, were quite powerful.) Old dude: practice such things! Your ineptness reveals that you haven’t paid proper attention to her.

Sightings

It’s a huge meeting, and I know next to no one here. But I ran into Sanjay Shete in the exhibit hall, where I attempted to get two of every tchotchke. (My kids will pitch a fit if one gets something, no matter how lame the thing, and the other doesn’t.) Sanjay was named an AAAS Fellow; that’s why he’s here.

I also ran into Steve Goodman (not the folk singer who died too young, but a singer, nevertheless). Gotta love Steve Goodman! He produced Behind the tan door.

Highlights

I went to a dozen talks. A half-dozen I really liked.

Alan Aspuru-Guzik talked about how to find (and visualize) useful organic molecules among the 10^60 (or 10^180?) possible. Cool high-throughput computing and interactive graphics to produce better solar panels (particularly for developing countries) and huge batteries to store wind- and solar-based power.

Russ Altman talked about how to search databases, web-search histories, and social media, to identify pairs of drugs that, together, give bad (or good) side effects that wouldn’t be predicted from their on-their-own side effects.

David Altshuler had a hilarious outline slide for his talk, but the rest was really awesome. A key point: to develop precision medicine will require hard work and there’s no magic bullet. And basic (not just translational) research is critical: we can’t make a medicine that gets to the precise cause (and that’s what precision medicine is about) if we don’t understand that basic biology.

I gave a talk myself, in a session on visualization of biomedical data, but it was definitely not the best talk in the session, nor the second best. Mine might have been the worst of the five talks in the session. But that’s okay; I think I did fine. It’s just that Sean Hanlon (brother of my UW–Madison colleague, Bret Hanlon) put together a superb, but thinly attended, session.

Miriah Meyer’s was my favorite talk of the day. She develops visualization tools to help scientists make sense of their data. And her approach is much like mine: specific solutions to specific data and questions. She talked about MulteeSum, PathLine, and MizBee. Favorite quote: “It’s amazing how much people like circles these days.”

Frederick Streitz from Lawrence Livermore National Lab talked about simulating and visualizing the electrophysiology of the human heart at super-high resolution using a frigging huge cluster, with 1.5 million cores. I loved his analogies: if you are painting your house, having a friend or two over to help will reduce the time by the expected factor, but having 1000 friends or 100k friends to help? In parallel computing, you need to rethink what you’ll use the computers for.

His second analogy: The DOE cluster at Livermore is 100k times as powerful as a desktop computer. That’s like the difference between PacMan (1980, 2.1 megaFLOPS) and Assassin’s Creed (2011, 260 gigaFLOPS). And their cluster is 100k times that.
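The "100k" in that analogy is easy to check from the two figures quoted. A quick sketch in R, using only the numbers above (the variable names are just for illustration):

```r
# FLOPS figures as quoted in the talk
pacman_flops <- 2.1e6           # PacMan, 1980: 2.1 megaFLOPS
assassins_creed_flops <- 260e9  # Assassin's Creed, 2011: 260 gigaFLOPS

# how many times faster is the 2011 figure?
ratio <- assassins_creed_flops / pacman_flops
round(ratio)   # 123810
```

So the ratio is roughly 124k, which is close enough to "100k" for an analogy.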

At the end of the day, Daphne Koller talked about Coursera. She’s awesome; Coursera’s awesome; I’m a crappy teacher. That’s my thinking at the moment, anyway. (A video of her talk is online. Have I mentioned how much I hate it when people screw up the aspect ratio? It seems like they screwed up the aspect ratio.) University faculty exist to help people, and with Coursera and other MOOCs, we can help a lot of people. Key lessons: the value of peer grading (for learning), not being constrained by the classroom or the 60-min format, ability to explore possible teaching innovations, and just having a hugely broad reach.

I don’t think I’d heard the quote that Daphne mentioned, attributed to Edwin Emery Slosson:

College is a place where a professor’s lecture notes go straight to the students’ lecture notes, without passing through the brains of either.

Still laughing!

Food

As I mentioned on twitter, I’ve eaten a lot of tacos. But I also had some donuts.

Boy, am I old

I seem to be staying at the same hotel as the American Junior Academy of Sciences (AJAS). Are these high school or college students? Man, do I feel old.

My contribution to education, today: if all of the elevators going down are too packed to accept passengers, press the up button and ride it up and then down. (Later I learned, from one of the AJAS youth, that the “alarm will sound” sign at the bottom of the stairs is a lie. You can take the stairs.)

Initial steps towards reproducible research

4 Dec 2014

In anticipation of next week’s Reproducible Science Hackathon at NESCent, I was thinking about Christie Bahlai’s post on “Baby steps for the open-curious.”

Moving from Ye Olde Standard Computational Science Practice to a fully reproducible workflow seems a monumental task, but partially reproducible is better than not-at-all reproducible, and it’d be good to give people some advice on how to get started – to encourage them to get started.

So, I spent some time today writing another of my minimal tutorials, on initial steps towards reproducible research.

It’s a bit rough, and it could really use some examples, but it helped me to get my thoughts together for the Hackathon and hopefully will be useful to people (and something to build upon).

The value of thesis intro/discussion

3 Dec 2014

Last week, Kelly Weinersmith tweeted:

Is any task a more monumental waste of time than writing an introduction and discussion for a dissertation where the chapters are published?

I think many (or most?) of my colleagues would agree with her. The research and the papers are the important things, and theses are hardly read. Why spend time writing chapters that won’t be read?

My response was:

Intro & disc of thesis get the student to think about the broader context of their work.

I’d like to expand on that just a bit.

In the old days, a PhD dissertation was more of a monograph. The new style is to have three or so papers (published or ready-to-submit) as chapters, sandwiched between introductory and discussion chapters. Those intro and discussion chapters are sometimes quite thin. I would prefer them to be more substantial.

The focus on papers is a good thing, as they will be easier to find and more widely read. But a thesis/dissertation is not just a research product, but also a vehicle to get a student to think more deeply and broadly.

The individual papers will include introductory and discussion sections, but journal articles tend to be aimed towards a relatively narrow and specialized audience. More substantive introductory and discussion chapters can help to make the work accessible to a broader audience. They also help to tie the separate papers together: what is the larger scientific context, and how do these pieces of work fit into that?

I don’t want students wasting time on “busy work,” and writing a thesis does seem like busy work. But I think a thesis deserves more than a ten-paragraph introduction. And the value of that introduction is not so much in demonstrating the student’s knowledge, but in being part of the development of that knowledge.

Car crash stats revisited: My measurement errors

3 Nov 2014

Last week, I created revised versions of graphs of car crash statistics by state (including an interactive version), from a post by Mona Chalabi at 538.

Since I was working on those at the last minute in the middle of the night, to be included as an example in a lecture on creating effective figures and tables, I just read the data off printed versions of the bar charts, using a ruler.

I later emailed Mona Chalabi, and she and Andrew Flowers quickly posted the data to github.com/fivethirtyeight/data. (That repository has a lot of interesting data, and if you see data at 538 that you’re interested in, just ask them!)

I was curious to look at how I’d done with my measurements and data entry. Here’s a plot of my percent errors:

Percent measurement errors in Karl's car crash stats

Not too bad, really. Here are the biggest problems:

  • Mississippi, non-distracted: off by 6%, but that corresponded to 0.5 mm.
  • Rhode Island and Ohio, speeding: off by 40 and 35%, respectively. I’d written down 8 and 9 mm rather than 13 and 14 mm.
  • Maine and Indiana, alcohol: wrote 15.5 and 14.5 mm, but typed 13.5 and 13 mm. In the former, I think I just misinterpreted my writing; in the latter, I think I wrote the number for the state below (Iowa).
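The two speeding errors in the list above can be sanity-checked with a little arithmetic. This is a minimal sketch, assuming percent error means 100 × (recorded − correct) / correct, and that the ruler readings in mm are proportional to the plotted values; the variable names are made up for illustration.

```r
# ruler readings (mm) for Rhode Island and Ohio, speeding
recorded <- c(RI = 8, OH = 9)     # what was written down
correct  <- c(RI = 13, OH = 14)   # what should have been written

# percent error relative to the correct reading (assumed definition)
pct_error <- 100 * (recorded - correct) / correct
round(pct_error, 1)   # RI -38.5, OH -35.7
```

Those come out at about -38% and -36%, in line with the roughly 40 and 35% quoted above.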

It’s also interesting to note that my “total” and “non-distracted” were almost entirely under-estimates: probably an error in the measurement of the overall width of the bar chart.

Also note: @brycem had recommended using WebPlotDigitizer for digitizing data from images.

Interactive plot of car crash stats

30 Oct 2014

I spent the afternoon making a D3-based interactive version of the graphs of car crash statistics by state that I’d discussed yesterday: my attempt to improve on the graphs in Mona Chalabi’s post at 538.

Screen shot of interactive graph of car crash statistics

See it in action here.

Code on github.

Scholarly Publishing Symposium at UW-Madison

30 Oct 2014

I’m at the Scholarly Publishing Symposium at UW-Madison today. There’s an interesting list of supplemental materials, but apparently only on paper:

Supplemental materials from UW-Madison Scholarly Publishing Symposium

So here they are electronically.

Improved graphs of car crash stats

29 Oct 2014

Last week, Mona Chalabi wrote an interesting post on car crash statistics by state, at fivethirtyeight.com.

I didn’t like the figures so much, though. There were a number of them like this:

chalabi-dearmona-drinking

I’m giving a talk today about data visualization [slides | github], and I thought this would make a good example, so I spent some time creating versions that I like better.

Error notifications from R

4 Sep 2014

I’m enthusiastic about having R notify me when my script is done.

But among my early uses of this, my script threw an error, and I never got a text or pushbullet about that. And really, I’m even more interested in being notified about such errors than anything else.

It’s relatively easy to get notified of errors. At the top of your script, include code like options(error = function() { } )

Fill in the function with your notification code. If there’s an error, the error message will be printed and then that function will be called. (And then the script will halt.)

You can use geterrmessage() to grab the error message to include in your notification.

For example, if you want to use RPushbullet for the notification, you could put, at the top of your script, something like this:

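options(error = function() {
    # load the package inside the handler, so it's only needed if something fails
    library(RPushbullet)
    # send a note titled "Error", with the error message as its body
    pbPost("note", "Error", geterrmessage())
})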

Then if the script gives an error, you’ll get a note with title “Error” and with the error message as the body of the note.

Update: I knew I’d heard about this sort of thing somewhere, but I couldn’t remember where. Duh; Rasmus mentioned it on twitter just a couple of days ago! Fortunately, he reminded me of that in the comments below.

Another update: Ian Kyle pointed out in the comments that the above function, if used in a script run with R CMD BATCH, won’t actually halt the script. The simplest solution is to add a call to stop(geterrmessage()) when R isn’t running interactively, like this:

options(error = function() {
    library(RPushbullet)
    pbPost("note", "Error", geterrmessage())
    # in batch mode, raise the error again so the script actually halts
    if(!interactive()) stop(geterrmessage())
})
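The same pattern works with any notification method, not just RPushbullet. Here's a minimal sketch, assuming you'd be happy with a log file as the "notification"; make_error_handler, notify_by_log, and log_file are hypothetical names made up for this example, not part of any package.

```r
# Hypothetical helper: wrap any notification function in an error handler
make_error_handler <- function(notify) {
    function() {
        msg <- geterrmessage()   # text of the most recent error
        notify(msg)              # send it however you like
        # as in the code above: halt the script when run in batch mode
        if (!interactive()) stop(msg)
    }
}

# Example notifier (an assumption for illustration): append to a log file
log_file <- file.path(tempdir(), "script_errors.log")
notify_by_log <- function(msg) {
    cat(format(Sys.time()), msg, file = log_file, append = TRUE)
}

# at the top of your script
options(error = make_error_handler(notify_by_log))
```

To switch to texts or pushbullet, you'd replace notify_by_log with a function that takes the message and sends it.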

Notifications from R

3 Sep 2014

You just set a long R job running. How will you know when it’s done? Have it notify you by beeping, sending you a text, or sending you a notification via pushbullet.


The mustache photo

28 Aug 2014

A certain photo of me has been following me around for some time.

Karl with a mustache, 15 Nov 2002

The thing is sitting on my website, so I suppose I have only myself to blame. I actually quite like the photo. I look happy. I was happy. I’m not always happy.


