The Predictive Analytics Team

I’ve been speaking at national TDWI conferences for more than three years. As part of that role, I’ve been interviewed a few times and have contributed to their blogs and publications. Some of this material is public and some is for TDWI members, but I would like to summarize some of this work and let you know where to find it. A recurring theme has been working in teams. A considerable amount of material has accumulated, and taken together it addresses the topic well.

When are you ready for a full-time Data Scientist?

The first is an interview exploring when an organization is ready to hire a full-time Data Scientist. I think that some do this too quickly. They do so because they’re not sure how to start, so they figure that the first step is to bring in an expert and go from there. The problem with this approach is that organizations take months to find the “right” hire and then lose them 12 months later. Turnover is a real problem, and folks simply aren’t thinking more than one or two chess moves out. You have to have a plan. You can’t expect a new hire to show up with one.

I state it this way in the interview:

“Companies seem to think that if they can manage to attract that person with the most experience, the longest resume, the most letters after their name, a Ph.D.-type person — if they just make that hire, everything else will fall into place.

But like most things in business transformation, it’s just not that easy.”

The interview is lengthy and covers a lot of ground. It is also linked to a webinar, “Your First Hire.” In the webinar I cover the composition of the team in considerable detail, focusing on the different team roles and how many of those roles should be filled with existing internal talent. I argue that a team created out of thin air, composed entirely of new hires, is a poor choice. I explain the roles one by one.

Finding and Hiring Data Scientists

In another interview, in the Business Intelligence Journal, I tackle the hiring itself. In the piece, entitled “High Demand Drives Up Interest in Data Scientists—and Salaries,” we talk about why the market is so hot. The interview took place a couple of years before I wrote this, but the market hasn’t changed much. Even the numbers haven’t changed much, as the market has stabilized a bit.

We really get into the details. For instance, I’m asked about salaries.

BI Journal: Speaking of salaries, what should companies anticipate spending for a data scientist? You just mentioned $150,000 as a round number for an experienced person.

Keith: You know, I look into the issue of data scientist salaries about once a year. I’ve read two salary surveys [recently], both by Burtch Works, who did a good job.

Burtch Works continues to be an amazing resource, and their latest survey can be found here. Make sure that you also seek out the webinar recordings that they put out about once a year.

The interviewer and I also get into the issue of Data Science graduate degrees and certificate programs, and my discussion of that is still relevant as well. These programs are new and are attracting a lot of students, but none of them has been around long enough for us to see what impact they will have on salaries, retention, or performance. Most of us who have been at this for a decade or two have degrees in statistics, computer science, or engineering.

Data Preparation is collaborative too

In another blog post I cover the challenges of collaboration during the data preparation phase. The challenge here is figuring out who should do what, and that is not as easy to sort out as you might think.

Many companies figure that they can just perform the data prep in advance for everyone’s benefit. That ultimately doesn’t work out, because the most interesting data is data that is accessed or combined in innovative ways that differ from routine reporting requirements.

For example, visits to the company website might not be routinely combined with inbound calls to a customer service center because the two might be managed by different departments. Combining them would seem to be a job for IT alone, but it must be a collaborative effort to be productive.
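To make the website and call center example a little more concrete, here is a minimal sketch in Python of what combining the two sources might look like. Everything in it is hypothetical: the records, the field names (`customer_id`, `page`, `reason`), and the assumption that both departments share a customer key at all. In practice, agreeing on that shared key is often where the collaboration is needed.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are assumptions for illustration.
web_visits = [
    {"customer_id": 101, "page": "/billing"},
    {"customer_id": 101, "page": "/cancel"},
    {"customer_id": 102, "page": "/pricing"},
]
inbound_calls = [
    {"customer_id": 101, "reason": "billing dispute"},
    {"customer_id": 103, "reason": "password reset"},
]


def combine_by_customer(visits, calls):
    """Group both sources under a shared customer_id (a full outer join)."""
    combined = defaultdict(lambda: {"pages": [], "call_reasons": []})
    for visit in visits:
        combined[visit["customer_id"]]["pages"].append(visit["page"])
    for call in calls:
        combined[call["customer_id"]]["call_reasons"].append(call["reason"])
    return dict(combined)


profiles = combine_by_customer(web_visits, inbound_calls)

# Customers who both browsed the website and called the service center.
both = sorted(
    cid for cid, p in profiles.items() if p["pages"] and p["call_reasons"]
)
```

The join itself is trivial. The hard part, which no code can settle, is deciding who owns each source and which questions the combined view is meant to answer.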