When physicists do mathematics, they don’t say they’re doing “number science”. They’re doing math.
If you’re analyzing data, you’re doing statistics. You can call it data science or informatics or analytics or whatever, but it’s still statistics.
If you say that one kind of data analysis is statistics and another kind is not, you’re not allowing innovation. We need to define the field broadly.
You may not like what some statisticians do. You may feel they don’t share your values. They may embarrass you. But that shouldn’t lead us to abandon the term “statistics”.
Tags: data analysis, data science
5 Apr 2013 at 9:02 am
Yes!
(Ken, statistician and proud of it.)
5 Apr 2013 at 9:51 am
Absolutely.
9 Apr 2013 at 12:56 am
OK but… as you point out, physicists do mathematics, but still call themselves physicists because that’s not all they do.
Whether or not you agree, data scientists would qualify themselves the same way — they do statistics, but also other things that statisticians generally do not do (product development, system administration, engineering, etc.).
9 Apr 2013 at 6:08 am
Yes, I’m not saying that a computer scientist should be called a statistician without his/her approval. (Though I might call a statistician who does only mathematics a mathematician.)
But there should be no “data science” that is not statistics. “Statistics” should swallow it up.
9 Apr 2013 at 8:28 am
I dunno, I don’t think we get to appropriate the term without adjusting our curriculum a bit so that we teach at least some of the engineering/computer science/business skills needed for many data science jobs.
Data science is also a completely catch-all term that is still very ill-defined. This was an interesting talk about it that I recently saw, sort of an empirical approach to defining data science: http://www.youtube.com/watch?v=aMDe5pODkB0
9 Apr 2013 at 12:07 pm
I was about to say, “Why not?”
But then, I do totally agree that bio/statistics training needs to be modernized to include the skills needed for visualizing, managing, and analyzing high-dimensional data, and for writing better software. So some (software) engineering and CS skills should be included. (I’m not sure about “business”, but I’ll let that go.)
If there’s a “data science”, it should be statistics. As needs change, the field should adapt.
9 Apr 2013 at 1:44 pm
Yes, agreed completely. The fact that this new term emerged is, I think, at least somewhat a reflection of our field not adapting fast enough to address current problems.
9 Apr 2013 at 7:40 am
“If you’re analyzing data, you’re doing statistics.”
Can I suggest you add several exclamation points to this sentence?
Disclaimer: I’m a physicist, and I work as a data scientist. Yes, I do some things that are not statistics, but they are also not directly related to data analysis – like installing statistical software, coding data interfaces to data sets stored in dozens of different ways, coding some new algorithm, etc. But all of these are just tools to execute my main purpose: to analyse data. And when I’m analyzing data, guess what? I’m doing statistics.
2 May 2013 at 3:07 pm
Frankly, I think a more apt analogy is Health Science vs. Medicine.
2 May 2013 at 8:53 pm
You make a good point, and I thank you for mentioning it, because I wouldn’t have thought about it otherwise. But I still think the scope of “statistics” should be expanded to encompass all of “data science.”
16 Feb 2014 at 2:54 am
lol.
8 Apr 2016 at 10:33 am
[…] years ago this week (5 April 2013), I wrote a blog post, “Data science is statistics”. I was fiercely against the term at that time, as I felt that we already had a data science, and […]
22 Nov 2016 at 2:55 am
[…] Karl Broman. Data science is statistics. https://kbroman.wordpress.com/2013/04/05/data-science-is-statistics/, […]
22 Nov 2016 at 4:36 pm
[…] this vague definition is that experts have variously claimed data science to be Statistics 2.0 [8, 11], Computer Science 2.0 [12] and Business Analytics 2.0 [8]. This is partly because of greater […]
8 May 2017 at 10:56 am
[…] already devoted to the analysis of data — a field called statistics — is alarming. I like what Karl Broman […]
7 Sep 2017 at 4:20 am
[…] Data science is statistics […]