The Joint Statistical Meetings are in Chicago next week. I thought I’d write down the set of sessions that I plan to attend. Please let me know if you have further suggestions.
Posts Tagged ‘conference’
What do I want in a conference website? Not this.
- I want to be able to browse sessions to find the ones I’m interested in. That means being able to see the session title and time as well as the speakers and talk titles. A super-long web page is perfectly fine.
- If you can’t show me everything at once, at least let me click-to-expand: for the talk titles, and then for the abstracts. Otherwise I have to keep clicking and going back.
- I want to be able to search for people. And if I’m searching for Hao Wu, I don’t want to look at all of the Wus. Or all of the Haos. I just want the Hao Wus. If I can’t search on "Hao Wu", at least let me search on
- If my search returns nothing and I go back, bring me back to the same search form. Don’t make me have to click “Search for people” again.
- I’d like to be able to form a schedule of the sessions to attend. (JSM2015 does that okay, but it’s not what I’d call “secure” and you have to find the damned things, first.) Really, I want to pick particular talks: this one in that session and that one in the other. But yeah, that seems a bit much to ask.
The JSM 2015 site is so terrible for browsing, I was happy to get the pdf of the program. (Good luck finding it on the website on your own; ASA tweeted the link to me, due to my bitching and moaning.) You can browse the pdf. That’s the way I ended up finding the sessions I wanted to attend. It also had an ad for the JSM 2015 mobile app. Did you know there was one? Good luck finding a link to that on their website, either.
The pdf is useable, but much like the website, it fails to make use of the medium. I want:
- Bookmarks. I want to jump to where Monday’s sessions start without having to flip through the whole thing.
- Hyperlinks. If you don’t include the abstracts, with links from the talk titles to the abstracts, at least include links to the web page that has the abstract so I don’t have to search on the web.
- More hyperlinks. The pdf has an index, with people and page numbers. Why not link those page numbers to the corresponding page?
I helped organize a small meeting in 2013. The program on the web and the corresponding pdf illustrate much of what I want. (No scheduling feature, but that meeting had no simultaneous sessions.) I included gratuitous network graphs of the authors and abstracts. It’s 2015. No conference site is truly complete without interactive network graphs.
As Thomas Lumley commented below, if you search on “Wu” you get all of the “Wu”s but also there’s one “Wulfhorst”. And if you search on “Hao” you get only people whose last name is “Hao”.
He further pointed out that if you search for the affiliation “Auckland” the results don’t include “University of Auckland” but only “Auckland University of Technology”. And actually, if you search for “University of Auckland” you get nothing. You need to search for “The University of Auckland”.
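For what it’s worth, the search behavior I’m asking for isn’t hard to sketch. The following is a minimal illustration, not how the JSM site works: the `Person` record, the `search_people` function, and every entry in the sample list other than the names and affiliations quoted above are made up for the example. Name terms must each match a whole word of the person’s name (so “Wu” doesn’t pull in “Wulfhorst”, and “Hao” matches first names as well as last names), while the affiliation is matched as a case-insensitive substring (so “University of Auckland” finds “The University of Auckland”).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    # Hypothetical record type, invented for this sketch.
    first: str
    last: str
    affiliation: str


def _tokens(text: str) -> list[str]:
    """Lower-case the text and split it on whitespace."""
    return text.casefold().split()


def search_people(people, name_query="", affiliation_query=""):
    """Return people whose name contains every query word as a whole word
    and whose affiliation contains the affiliation query as a substring."""
    name_terms = _tokens(name_query)
    affil = " ".join(_tokens(affiliation_query))
    results = []
    for p in people:
        name_words = set(_tokens(p.first) + _tokens(p.last))
        if not all(term in name_words for term in name_terms):
            continue
        if affil not in " ".join(_tokens(p.affiliation)):
            continue
        results.append(p)
    return results


# Made-up sample data; the affiliations for the first two are placeholders.
people = [
    Person("Hao", "Wu", "Example University"),
    Person("Mei", "Wu", "Example University"),
    Person("Hao", "Chen", "Auckland University of Technology"),
    Person("Anna", "Wulfhorst", "The University of Auckland"),
]

hao_wus = search_people(people, name_query="Hao Wu")
wus = search_people(people, name_query="wu")
auckland = search_people(people, affiliation_query="University of Auckland")
```

Here `hao_wus` holds only the one Hao Wu, `wus` holds the two Wus without the Wulfhorst, and `auckland` finds the person at “The University of Auckland” without requiring the leading “The”.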
The Joint Statistical Meetings (JSM) are the big statistics meetings in North America, “joint” among the American Statistical Association, Institute of Mathematical Statistics, International Biometric Society (ENAR and WNAR), Statistical Society of Canada, and others.
I’m at the 2015 AAAS meeting in San Jose, California. This is definitely not my typical meeting: too big, too broad, and I hardly know anyone here. But here’s a quick (ha ha; yah sure) summary of the meeting so far.
Gerald Fink gave the President’s Address last night. He’s the AAAS President, so I guess that’s appropriate. But after five minutes of really lame simplistic crap (for example, he said something like, “A single picture can destroy our known understanding of the universe,” like innovation and improving our understanding is a bad thing), I left.
Oh, and before that: the emcee of the evening, who introduced Janet Napolitano, totally couldn’t pronounce her last name. (Her remarks, particularly her comments in support of public universities, were quite powerful.) Old dude: practice such things! Your ineptness reveals that you haven’t paid proper attention to her.
A huge meeting, but I know next to no one here. But I ran into Sanjay Shete in the exhibit hall, where I attempted to get two of every tchotchke. (My kids will pitch a fit if one gets something, no matter how lame the thing, and the other doesn’t.) Sanjay was named an AAAS Fellow; that’s why he’s here.
I went to a dozen talks. A half-dozen I really liked.
Alan Aspuru-Guzik talked about how to find (and visualize) useful organic molecules among the 10^60 (or 10^180?) possible. Cool high-throughput computing and interactive graphics to produce better solar panels (particularly for developing countries) and huge batteries to store wind- and solar-based power.
Russ Altman talked about how to search databases, web-search histories, and social media, to identify pairs of drugs that, together, give bad (or good) side effects that wouldn’t be predicted from their on-their-own side effects.
David Altshuler had a hilarious outline slide for his talk, but the rest was really awesome. A key point: to develop precision medicine will require hard work and there’s no magic bullet. And basic (not just translational) research is critical: we can’t make a medicine that gets to the precise cause (and that’s what precision medicine is about) if we don’t understand that basic biology.
I gave a talk myself, in a session on visualization of biomedical data, but it was definitely not the best talk in the session, nor the second best. Mine might have been the worst of the five talks in the session. But that’s okay; I think I did fine. It’s just that Sean Hanlon (brother of my UW–Madison colleague, Bret Hanlon) put together a superb, but thinly-attended, session.
Miriah Meyer’s was my favorite talk of the day. She develops visualization tools to help scientists make sense of their data. And her approach is much like mine: specific solutions to specific data and questions. She talked about MulteeSum, PathLine, and MizBee. Favorite quote: “It’s amazing how much people like circles these days.”
Frederick Streitz from Lawrence Livermore National Lab talked about simulating and visualizing the electrophysiology of the human heart at super-high resolution using a frigging huge cluster, with 1.5 million cores. I loved his analogies: if you are painting your house, having a friend or two over to help will reduce the time by the expected factor, but having 1000 friends or 100k friends to help? In parallel computing, you need to rethink what you’ll use the computers for.
His second analogy: The DOE cluster at Livermore is 100k times a desktop computer. That’s like the difference between PacMan (1980, 2.1 megaFLOPS) and Assassin’s Creed (2011, 260 gigaFLOPS). And their cluster is 100k times that.
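A quick check of the arithmetic in that analogy, using only the two figures quoted from the talk. The reading that “100k times that” means 100,000 times the 260 gigaFLOPS figure is my assumption, and the variable names are just for illustration.

```python
# Figures as quoted in the talk.
PACMAN_FLOPS = 2.1e6            # 1980, 2.1 megaFLOPS
ASSASSINS_CREED_FLOPS = 260e9   # 2011, 260 gigaFLOPS

# Assumption: "100k times that" refers to the 2011 desktop figure.
CLUSTER_FACTOR = 100_000

game_ratio = ASSASSINS_CREED_FLOPS / PACMAN_FLOPS
cluster_flops = CLUSTER_FACTOR * ASSASSINS_CREED_FLOPS

print(f"Assassin's Creed vs. PacMan: {game_ratio:,.0f}x")
print(f"Implied cluster speed: {cluster_flops / 1e15:.0f} petaFLOPS")
```

So the gap between the two games is a factor of about 120k, in line with the “100k” in the analogy, and under the stated assumption the cluster would come out in the tens of petaFLOPS.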
At the end of the day, Daphne Koller talked about Coursera. She’s awesome; Coursera’s awesome; I’m a crappy teacher. That’s my thinking at the moment, anyway. (A video of her talk is online. Have I mentioned how much I hate it when people screw up the aspect ratio? It seems like they screwed up the aspect ratio.) University faculty exist to help people, and with Coursera and other MOOCs, we can help a lot of people. Key lessons: the value of peer grading (for learning), not being constrained by the classroom or the 60-min format, ability to explore possible teaching innovations, and just having a hugely broad reach.
College is a place where a professor’s lecture notes go straight to the students’ lecture notes, without passing through the brains of either.
Boy, am I old
I seem to be staying at the same hotel as the American Junior Academy of Sciences (AJAS). Are these high school or college students? Man, do I feel old.
My contribution to education, today: if all of the elevators going down are too packed to accept passengers, press the up button and ride it up and then down. (Later I learned, from one of the AJAS youth, that the “alarm will sound” sign at the bottom of the stairs is a lie. You can take the stairs.)
Dirk Eddelbuettel on Rcpp
Dirk Eddelbuettel gave a keynote on Rcpp [slides]. The goal of Rcpp is to have “the speed of C++ with the ease and clarity of R.” He gave a series of examples that left me (who still uses .C() to access C code) thinking, “Holy crap this is so much easier than what I do!”
Dirk ended with a detailed discussion of Docker: a system for virtual machines as portable containers. I didn’t fully appreciate this part, but according to Dirk, Docker “changes how we build and test R….It’s like pushing to GitHub.”
After Dirk’s talk was the Sponsor’s Talk. But if I’m going to skip a session (and I strongly recommend that you skip at least some sessions at any conference), anything called “Sponsor’s Talk” is going to be high on my list to skip.
Lunch at Venice Beach
R and reproducibility
For your R-based project to be reproducible, the many packages that you’ve used need to be available. And future versions of those packages may not work the same way, so ideally you should keep copies of the particular versions that you used.
David Smith spoke about the R reproducibility toolkit (RRT). The focus was more on business analytics, and the need to maintain a group of versioned packages that are known to work together. CRAN runs checks on all packages so that they’re all known to work together. As I understand it, RRT manages snapshots of sets of packages from CRAN.
I’ve not thought much about this issue. packrat seems the best fit for my sort of work. I should start using it.
The second poster session was in a different location with more space. It was still a bit cramped, being in a hallway, but it was way better than the first day. There were a number of interesting posters, including Hilary’s on testdat, for testing CSV files; Sandy’s on using Shiny apps for teaching; and Mine Çetinkaya-Rundel and Andrew Bray’s poster on “Teaching data analysis in R through the lens of reproducibility” [pdf].
Met more folks
The main purpose of conferences is to meet people. I was glad to be able to chat with Dirk Eddelbuettel, Ramnath Vaidyanathan, and also Tim Triche. Also karaoke with Sandy, Karthik, Hilary, Rasmus, and Romain.
Wish I’d seen
I had a bit of a late night on Wednesday night, and then I was in a hurry to get down (via public transit!) to the Science Center to meet up with my family. So I’m sorry that I didn’t get to see Yihui Xie’s talk on Knitr Ninja.
Looking back through the program, there are a number of other talks I wish I’d seen:
- Jan de Leeuw on the Journal of Statistical Software [slides]
- Romain Francois on Rcpp11 [slides]
- Rasmus Bååth on Bayesian First Aid [slides]
- Jeroen Ooms on OpenCPU [slides]
- Nicholas Reich on statsTeachR.org
- Amelia McNamara on Teaching R to high school students (and teachers)
- Jeff Allen on the latest with R Markdown [slides]
In my comments below, I give short shrift to some speakers (largely by not having attended their talks), and I’m critical in some places about the conference organization. Having co-organized a small conference last year, I appreciate the difficulties. I think the organizers of this meeting have done a great job, but there are some ways in which it might have been better (e.g., no tiny rooms, a better time slot for the posters, and more space for the posters).
The Morgridge Institute for Research (MIR), a private research institute associated with UW-Madison, is looking to hire some computational folks working in biology. One position is joint with my department, Biostatistics & Medical Informatics.
Yet another symposium
Yesterday afternoon, they held a symposium on “Computation in Biology.” (Here’s the agenda.) Great speakers: Marc Suchard, Brian Shoichet, David Page, and Winston Hide. They were asked to speak broadly about computation in biology and on the key issues for the future, and there was plenty of time for discussion.
I’m not sure what MIR was hoping to get out of the symposium, but if they were looking for guidance regarding their hiring efforts, it wasn’t effective. At the beginning, the discussion was quite heated but not terribly constructive. In the middle, it became more like the usual sort of question/answer after a seminar. I must admit I didn’t stay to the end. Perhaps some important insights were gained after I left. But it seems unlikely that the symposium provided much guidance about hiring in computational biology.
The key issues
Here are what I think the key issues are, when a scientific organization is looking to hire some computational folks.
- Do you want a targeted search (for particular kinds of applications or approaches), or do you want to leave it general and just pick the best folks who apply? (In my experience, targeted searches don’t work as well; you miss out on some great people.)
- What service role is expected of the person? Do you want them to meet some particular scientists’ needs, or are you going to let them do whatever they want, with the hope that they form useful (to the organization) collaborations? (The expectations should be made explicit.)
- At promotion time, who is going to evaluate the person? If it’s a separate academic department (say, Statistics, or maybe Biochemistry), do they share your organization’s values? (Some of the best applied computational scientists may not be generating the NIH grants that a traditional biological sciences department might expect, nor the JASA or Biometrics papers that a Statistics department might expect.)
- Will the person have appropriate mentors? Is there anyone at the institution who really understands the nature of their position and what they need to do to succeed? (I worry particularly about computational scientists who are isolated from their computational peers.)
I don’t have any great answers, but those are the issues that I’m concerned about.
Update: The real point I’m trying to make: academia doesn’t do a good job of rewarding tool building (e.g., software tools). It’s often viewed as better to write papers with toy implementations than to make generally useful software. That’s a real problem for computational biology, and for folks seeking to hire truly useful computational biologists.
I had a great time, but I did come to the strong realization that what I view as important is distinctly different from what the typical ENAR attendee views as important. (Rafa said, incredulously, “You knew that already!”)
Let me tell you about the high- and lowlights, for me.
The annual ENAR meeting (the eastern North America portion of the International Biometric Society) was a few weeks ago in Washington, DC. It was great to see old friends, and I learned a number of things outside of the sessions, but mostly I was annoyed by the meeting.
Ways to annoy me
- Distorted aspect ratios: I’m often seeing LCD projectors set up in wide format, stretching a presentation that was developed in the old 4:3 style. Is it not obvious that preserving the aspect ratio is more important than filling the screen?
- Outlines for talks (especially 15 min talks): If you have only 15 min, don’t waste any time telling us what you’re going to be telling us; just tell us. And even if you’ve got 45 min, I find the “Background, Methods, Simulations, Application, Discussion” outline totally useless, and anything more complicated than that seldom makes sense until the speaker is part way through the talk. Why try to explain terminology before you’ve gotten to the background section?
- “I’m running out of time so I’m going to skip the real data analysis and go quickly through some asymptotic results.” Someone actually said that.
- Opening night poster session (and for three hours!): It actually seemed to be working, but I sure wouldn’t want to be standing next to a poster from 8-11pm. I would prefer:
- posters available to look at throughout the meeting
- multiple poster sessions (so presenters have some opportunity to talk to each other)
- nothing happening at the time of a poster session
- Dull talks by famous people: (I’m not talking about you, fine reader, but the other famous people.) Biology meetings will have a few invited speakers but the bulk of the talks will be chosen based on submitted abstracts. Statistics meetings seem more often arranged to have people submit proposals for full sessions, with a slate of pre-selected speakers, and it is those sessions that are reviewed. That seems the perfect design if you want crappy talks by famous people. Perhaps I’m annoyed only because I’m a non-famous person who can give a good talk.
I was at the useR! Conference at The University of Warwick in Coventry, UK, last week. My goal in going was to learn the latest things regarding (simple) dynamic graphics, (simple) web-based apps, parallel computing, and memory management (dealing with big data sets). I got just what I was hoping for and more. There are a lot of useful tools available that I want to adopt. I’ll summarize the high points below, with the particular areas of interest to me covered more exhaustively than just “highlights”.
I left feeling that my programming skills are crap. My biggest failing is in not making sufficient use of others’ packages, but rather just building what I need from scratch (with great effort) and skipping dynamic graphics completely.
There were 440 participants from 41 countries (342 Europe; 60 North America).
- There are now >3000 packages on CRAN, with 110 submissions per week (of which 80 are successful), basically all handled by Kurt Hornik.
- CRAN will throw out binaries of packages that are more than two years old.
- What’s within the base of R will shrink rather than grow.
- There have been a lot of improvements in the rendering of graphics.
- R is heavily dependent on a small number of altruistic developers, many of whom feel their contributions are not treated with respect.
- library() is to be replaced by
- There will soon be a parallel package for parallel computing.