kbroman.org/blog

30 Apr 2017

I’m starting a new blog site at kbroman.org/blog. I’m abandoning this WordPress site as it is; I don’t want to fuss with trying to move things.

I thought maybe I was being paranoid

30 Jan 2017

An odd thing happened to me on twitter today.

UW System had tweeted a statement from system president Ray Cross concerning Trump’s executive order banning refugees and immigrants from 7 countries. Cross’s statement is about as empty and useless as such a statement could be.

So I tweeted back, asking him to revise his statement and actually take a stand against Trump’s executive order.

[Image: the disappeared tweet]

Note that I mentioned not just @RayWCross and @UWSystem, but also @POTUS.

Twenty minutes later, the tweet had gotten a couple of ❤️’s, and then one RT, but for some reason the tweet itself had disappeared from my twitter feed (in the app I use, Tweetbot). I looked at my twitter account in a web browser, and sure enough the tweet wasn’t there.

Puzzled, and knowing I hadn’t deleted the tweet, I poked around and was able to find, somehow, the URL for the original tweet, and I could see the tweet in a browser. So it hadn’t been deleted, it had just been removed from my feed.

@Kerri_Gilbert then wrote that she could see it in her feed but not if she went to my timeline. So I wasn’t completely crazy.

@CFlensburg later wrote that he could see the tweet, but he’s in Australia.

Now my hypothesis was: someone who looks after @potus had somehow suppressed my tweet. Or was I being paranoid?

A quick google search (for “tweet disappeared from feed”) revealed a Washington Post article from 2015-10-30, “Tweets are disappearing on Twitter. Why?” This seems to explain what happened: my tweet was suppressed by some kind of “abuse filter.”

The article is an interesting read. It describes the experience of Paul Dietrich (@Paulmd199), who wrote an analysis of his situation: “Adventures in Twitter Censorship.” Also an interesting read.

And note that the tweet suppression is location-specific. It’s okay for Australians to read the tweet, but Americans need to be protected from it.

But why this tweet?

This tweet of mine was really a pretty bland tweet. I mean, I was directly criticizing my boss’s boss’s boss, and we just barely have tenure at Wisconsin anymore, so I certainly wasn’t going to be abusive. I’m not sure why I actually mentioned @potus, but I figured, “What the hell.”

But what sort of abuse detection algorithm would decide that this tweet was abusive? If there’s an abuse filter that would suppress this tweet but not all of the actual abuse rampant on twitter…well that’s bullshit.

Maybe it was that I wrote “Sad.” (I thought that was funny. And hardly abusive.)

I favor the theory that it wasn’t an algorithm but rather a person: that someone is assigned to read all mentions of @potus and if they deem something inappropriate, they flip a switch and the tweet gets suppressed.

So maybe I am paranoid.

(I forgot to mention: @NickFleisher had the best response to Ray Cross’s lame statement: “Revise and resubmit.”)

Halloween 2016 count

31 Oct 2016

Here’s a graph of the numbers of trick-or-treat-ers we saw this evening, by time. 10 of the 25 kids arrived in one big group. (Compare this to our 2011 experience.)

[Figure: Halloween 2016 count]

My JSM 2016 itinerary

27 Jul 2016

The Joint Statistical Meetings are in Chicago next week. I thought I’d write down the set of sessions that I plan to attend. Please let me know if you have further suggestions.

First things first: snacks. Search the program for “spotlight” or “while supplies last” for the free snacks being offered. Or go to the page with the full list.


Chris Walker at Faculty Senate

15 Apr 2016

Chris Walker’s powerful speech at the Faculty Senate on 4 Apr 2016 (see “Hateful shit at UW-Madison”) was recorded!

You must listen to it!

I am a data scientist

8 Apr 2016

Three years ago this week, I wrote a blog post, “Data science is statistics”. I was fiercely against the term at that time, as I felt that we already had a data science, and it was called Statistics.

It was a short post, so I might as well quote the whole thing:

When physicists do mathematics, they don’t say they’re doing “number science”. They’re doing math.

If you’re analyzing data, you’re doing statistics. You can call it data science or informatics or analytics or whatever, but it’s still statistics.

If you say that one kind of data analysis is statistics and another kind is not, you’re not allowing innovation. We need to define the field broadly.

You may not like what some statisticians do. You may feel they don’t share your values. They may embarrass you. But that shouldn’t lead us to abandon the term “statistics”.

I still sort of feel that way, but I must admit that my definition of “statistics” is rather different from most others’. In my view, a good statistician will consider all aspects of the data analysis process:

  • the broader context of a scientific question
  • study design
  • data handling, organization, and integration
  • data cleaning
  • data visualization
  • exploratory data analysis
  • formal inference methods
  • clear communication of results
  • development of useful and trustworthy software tools
  • actually answering real questions

I’m sure I missed some things there, but my main point is that most academic statisticians focus solely on developing “sophisticated” methods for formal inference, and while I agree that that is an important piece, in my experience as an applied statistician, the other aspects are often of vastly greater importance. In many cases, we don’t need to develop sophisticated new methods, and most of my effort is devoted to the other aspects, and these are generally treated as being unworthy of consideration by academic statisticians.

As I wrote in a later post, “Reform academic statistics”, we as a field appear satisfied with

  • Papers that report new methods with no usable software
  • Applications that focus on toy problems
  • Talks that skip the details of the scientific context of a problem
  • Data visualizations that are both ugly and ineffective

Discussions of Data Science generally recognize the full range of activities that are required for the analysis of data, and place greater value on such things as data visualization and software tools, which are obviously important but are not viewed that way by many statisticians.

And so I’ve come to embrace the term Data Science.

Data Science is also a much more straightforward and understandable label for what I do. I don’t think we should need a new term, and I think we should argue against misunderstandings of Statistics rather than slink off to a new “brand”. But in general, when I talk about Data Science, I feel I can better trust that folks will understand that I am talking about the broad set of activities required in good data analysis.

If people ask me what I do, I’ll continue to say that I’m a Statistician, even though I do tend to stumble over the word. But I am also a Data Scientist.

One last thing: I’ve also come to realize that computer science folks working in computational biology are really just like me. They have expertise in a somewhat different set of tools, but then that’s true for pretty much every statistician, too: they’re much like me but they have expertise in a somewhat different set of tools. And it’s nice to be able to say that we’re all data scientists.

It should be recognized, too, that academic computer science suffers from many of the same problems that academic statistics has suffered: an overemphasis on novelty, sophistication, and toy applications, and an under-appreciation for solving real problems, for data visualization, and for useful software tools.

Action items in response to hateful shit

5 Apr 2016

UW-Madison faculty got an email update from Vice Provost and Chief Diversity Officer Patrick Sims regarding the things we can do in response to the hate and bias incidents on campus.

Here are the things he had mentioned yesterday at the Faculty Senate meeting:

  • Address hate/bias incidents in your curriculum to ameliorate unacceptable occurrences in our campus community.
  • Look at “bullying” language as a way to address possible hate/bias incidents in the classroom.
  • Commit to engaging in ongoing cultural competency training. Learning Communities for Institutional Change & Excellence (LCICE) as an infrastructure already provides these services campus-wide.
  • Commit to experiencing the leadership institute and become a facilitator, carving out 10-15% of your time towards these efforts.
  • Support the request for additional staff.
  • Visit the Campus Climate website

An attached letter from the Hate & Bias incident team added:

  • Your school/college/department can host a bystander intervention workshop on hate and bias. This workshop will provide tools for UW-Madison community members on when and how to intervene. If you would like to host a workshop, please contact Joshua Moon Johnson.
  • Many incidents go unreported for a variety of reasons. We encourage students and campus community members to report incidents of hate and bias to ensure that campus can best support the victim and work to prevent future incidents. We encourage you to post the link to report on your school/college/department websites.
  • Oftentimes students do not report incidents because they are unaware of the reporting process. To increase awareness of the reporting process, we encourage you to share brochures and posters with information on how and why it is important to report. These will be distributed across campus in the next few weeks.
  • Students who are victims of hate and bias incidents may need immediate support. Please be sure to refer/provide students with appropriate resources such as mental health/counseling services through University Health Services (UHS). The Multicultural Student Center also has drop-in hours with UHS counselors as well as support and discussion groups for students of color.
  • Many students who are victims of hate and bias incidents identify with an underrepresented racial group, gender identity or sexual orientation, or religious group. We encourage you to specifically reach out to marginalized student groups to raise awareness of the bystander intervention workshop and reporting process.

I got a reasonably positive response to my email to my faculty colleagues suggesting that we all commit to cultural competency training. But the training from the LCICE mentioned above looks to be semester-long, Tuesdays 4:30-7:30pm. I think I’ll have a difficult time convincing my colleagues of that. We need something in between nothing and 45 hours.

Hateful shit at UW-Madison

4 Apr 2016

I’m a privileged white male university professor. As privileged as they come, really. My father was a professor of chemistry; my mother also has an advanced degree in chemistry. The jobs I’ve held have been more about personal fulfillment than money: dancer, dance teacher, secretary for intellectual property lawyers, research and teaching assistant, professor. People assume I know what I’m talking about, even if I’m in shorts and a t-shirt.

All that’s just to say that, when it comes to the ongoing hateful acts that have been happening at the University of Wisconsin-Madison, I’m really the last one that you should be listening to. You should instead listen to UW students, such as the United Council of UW Students, who have submitted a list of 5 reasonable demands, or Vice Provost and Chief Diversity Officer Patrick Sims, who made an important 8-min video in response to a recent hateful incident that you should now go away and watch (really, stop reading what I have to say and spend 8 minutes watching that video), or Chris Walker, Asst Prof in the dance department, who spoke movingly today at the UW-Madison Faculty Senate meeting about the shit that faculty and students of color have to put up with on campus.

Lots of crap has been happening in Wisconsin lately. My focus has been on what Scott Walker and company have been doing to the state and to the University of Wisconsin, most recently by making huge cuts to state support to the UW System and by weakening tenure and shared governance.

That’s all been an embarrassment, and depressing, but in comparison to the hateful racist shit that’s been happening on campus (Vice Provost Sims reported that there have been more than 30 hate or bias incidents on campus this year), tenure and funding just don’t seem that important.

Chris Walker’s speech at the Faculty Senate today really hammered this home. As a black man on campus, he’s experienced a lot of shit: worse shit than we’re seeing in the papers. And if we don’t fix this, our students can’t be successful. We must fix this.

What can a biostatistics professor do? I’m open to suggestions.

But for now, I’ll follow Patrick Sims’s suggestion and start with one of the United Council of UW Students’ demands:

We demand that the University of Wisconsin System creates and enforces comprehensive racial awareness and inclusion curriculum and trainings throughout all 26 UW Institution departments, mandatory for all students, faculty, staff, campus & system administration, and regents. This curriculum and training must be vetted, maintained, and overseen by a board comprised of students, staff, and faculty of color.

I’ve written an email to the faculty in my department, asking that we, as a department, volunteer to participate in such racial awareness training:

[Image: email to my department]

Correction: There’s an error in my email; Chris Walker is Associate Professor, and has been for a couple of years.

Update: Chris Walker’s speech at the 4 Apr 2016 Faculty Senate meeting was recorded! Must listen.

Write unit tests!

7 Dec 2015

Since 2000, I’ve been working on R/qtl, an R package for mapping the genetic loci (called quantitative trait loci, QTL) that contribute to variation in quantitative traits in experimental crosses. The Bioinformatics paper about it is my most cited; also see my 2014 JORS paper, “Fourteen years of R/qtl: Just barely sustainable.”

It’s a bit of a miracle that R/qtl works and gives the right answers, as it includes essentially no formal tests. The only regular tests are that the examples in the help files don’t produce any errors that halt the code.

I’ve recently been working on R/qtl2, a reimplementation of R/qtl to better handle high-dimensional data and more complex crosses, such as Diversity Outbred mice. In doing so, I’m trying to make use of the software engineering principles that I’ve learned over the last 15 years, which pretty much correspond to the ideas in “Best Practices for Scientific Computing” (Greg Wilson et al., PLOS Biology 12(1): e1001745, doi:10.1371/journal.pbio.1001745).

I’m still working on “Make names consistent, distinctive, and meaningful”, but I’m doing pretty well on writing shorter functions with less repeated code, and, most importantly, I’m writing extensive unit tests.
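
As a rough illustration of what a unit test looks like, here is a minimal sketch. It is written in Python rather than R, and it is not taken from the R/qtl2 test suite. The function names `haldane_cm` and `haldane_rf` are hypothetical, chosen for this example only. The functions implement the Haldane map function, which converts between recombination fractions and map distances in centimorgans. The tests check a known value, a small-distance approximation, and a round trip.

```python
import math


def haldane_cm(rec_frac):
    """Convert a recombination fraction to a map distance in cM (Haldane)."""
    if not 0.0 <= rec_frac < 0.5:
        raise ValueError("rec_frac must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * rec_frac)


def haldane_rf(dist_cm):
    """Convert a map distance in cM to a recombination fraction (Haldane)."""
    if dist_cm < 0.0:
        raise ValueError("dist_cm must be non-negative")
    return 0.5 * (1.0 - math.exp(-dist_cm / 50.0))


def test_zero_recombination_is_zero_distance():
    assert haldane_cm(0.0) == 0.0


def test_small_distances_are_nearly_linear():
    # For small r, distance in cM is approximately 100 * r.
    assert math.isclose(haldane_cm(0.001), 0.1, rel_tol=0.01)


def test_round_trip():
    # Converting to cM and back should recover the original value.
    for r in (0.01, 0.1, 0.25, 0.4):
        assert math.isclose(haldane_rf(haldane_cm(r)), r, rel_tol=1e-9)


def test_invalid_input_raises():
    try:
        haldane_cm(0.5)
    except ValueError:
        return
    raise AssertionError("expected ValueError for rec_frac = 0.5")


if __name__ == "__main__":
    test_zero_recombination_is_zero_distance()
    test_small_distances_are_nearly_linear()
    test_round_trip()
    test_invalid_input_raises()
    print("all tests passed")
```

Each test is small and checks one property, so a failure points directly at what broke. In R, the same idea would typically be expressed with a testing package rather than bare assertions.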

Fitting linear mixed models for QTL mapping

24 Nov 2015

Linear mixed models (LMMs) have become widely used for dealing with population structure in human GWAS, and they’re becoming increasingly important for QTL mapping in model organisms, particularly for the analysis of advanced intercross lines (AIL), which often exhibit variation in the relationships among individuals.

In my efforts on R/qtl2, a reimplementation of R/qtl to better handle high-dimensional data and more complex cross designs, it was clear that I’d need to figure out LMMs. But while papers explaining the fit of LMMs seem quite explicit and clear, I’d never quite turned the corner to actually seeing how I’d implement it. In both reading papers and studying code (e.g., lme4), I’d be going along fine and then get completely lost part-way through.

But I now finally understand LMMs, or at least a particular, simple LMM, and I’ve been able to write an implementation: the R package lmmlite.
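
For reference, one common way to write a simple LMM of this kind is sketched below. The symbols are assumptions chosen for illustration, and this is not necessarily the exact parameterization used in lmmlite. Here $y$ is the vector of phenotypes, $X$ a matrix of covariates, $K$ a known kinship matrix describing the relationships among individuals, and $h^2$ the heritability.

```latex
\begin{aligned}
y &= X\beta + g + \epsilon, \\
g &\sim \mathrm{N}(0,\ \sigma_g^2 K), \qquad
\epsilon \sim \mathrm{N}(0,\ \sigma_e^2 I), \\
\mathrm{Var}(y) &= \sigma_g^2 K + \sigma_e^2 I
  = \sigma^2 \left[ h^2 K + (1 - h^2) I \right], \\
\sigma^2 &= \sigma_g^2 + \sigma_e^2, \qquad
h^2 = \sigma_g^2 / \sigma^2 .
\end{aligned}
```

Fitting the model then amounts to estimating $\beta$, $\sigma^2$, and $h^2$, with $g$ and $\epsilon$ taken to be independent.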

It seemed worthwhile to write down some of the details.
