Still Grappling With Data Security

Today I was going through airport security with my wife.  I got randomly selected for a screening, which consisted of wiping my hands with a cottonish fabric and sending it through the scanner that detects explosives or something like that.  After the screening, I commented to my wife, “So don’t all the terrorists know not to go to the gun range or handle their explosives within 24 hours of going to the airport?  It seems to me that this particular screen is really not a deterrent.  Any half intelligent terrorist worth their salt has got to have investigated TSA, right?”  ((if I end up on some FBI watch list for this post, I’ll be both highly amused and highly irritated at the same time))

I’ve been trying to figure this out for ages.  You see, the problem is that even if you have stricter limits on access to fields and tables in your security setup, even if you limit the number of users to sensitive information, you should not assume that your data is any more secure from unauthorized sources.  All you have done is make it harder to access.  Now, I’m not saying that making it harder to access is not a worthwhile exercise.  It is.  But let’s be honest with ourselves.  Harder was not the goal.  Impossible was.

Pretty much every reporting engine in the world allows you or the user to somehow download the data.  Before we lay blame on the vendors, let’s realize that it’s our own fault – we placed it as a requirement in every single RFP, or we “ooh’d” and “aah’d” when they demo’d how easy it was to download to MS Excel.  Either way, we lose all control over data security once data is downloaded by the user.  Privacy controls are voided, confidentiality issues arise, and we have no idea where the data ends up.  Not that this is all our fault either.  People who have security access to compensation data for example should know better than to email that stuff around.

There are a couple of nice solutions, but I’m not sure how perfect anything is since at some point most of our organizations need to have data stored or downloaded.  We could of course disable downloading, and every manager, finance person and HR practitioner would just have to pull up a dashboard and view the data in real time.  Right…  At the same time, I’ve been advocating that all HR decisions be based on facts and data, and I can envision a world where meetings get really dull when we gather executives around the table but are not able to prepare decks full of analytics beforehand.

Here are a few things you can do to improve your reporting data security:

  • Make sure managers are certified and trained regarding their data responsibilities when they become managers and every year.
  • Review your security access periodically to make sure sensitive data is being accessed by the right roles – some roles may no longer need the permissions over time.
  • Build a prominent warning at the top of reports when data is loaded to remind users that dissemination of sensitive data is a breach of security.
  • Scrub your reports frequently – you may find old reports that are run with sensitive data that is not necessary based on the purpose of the report.
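The periodic access review in the list above can be partly automated.  Here is a minimal sketch in Python, assuming you can export each grant of access to a sensitive field along with the date it was last used.  The `AccessGrant` record, the role and field names, and the 180 day threshold are all made up for illustration and are not taken from any particular HR system.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class AccessGrant:
    role: str                   # hypothetical role name
    field_name: str             # hypothetical sensitive field
    last_used: Optional[date]   # None means the grant was never exercised


def stale_grants(grants, today, max_idle_days=180):
    """Return grants that were never used or not used within max_idle_days."""
    cutoff = today - timedelta(days=max_idle_days)
    return [g for g in grants if g.last_used is None or g.last_used < cutoff]


# Invented sample data
grants = [
    AccessGrant("HR Business Partner", "base_salary", date(2024, 5, 1)),
    AccessGrant("Finance Analyst", "base_salary", date(2023, 1, 15)),
    AccessGrant("Line Manager", "ethnicity", None),
]

for g in stale_grants(grants, today=date(2024, 6, 1)):
    print(f"Review: {g.role} -> {g.field_name}")
```

A list like this does not decide anything on its own.  It just gives the data owner a short set of permissions to question at each review.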

This is just one of those problems I keep grappling with.  We keep giving managers and non-HR functions access to more data – I do believe the business requires it.  We want everyone to be able to make decisions in real time, but we don’t trust our partners fully either.  I’m also completely uncomfortable giving up and going with the idea that some data is just going to slip through or saying that it’s just a change management problem.  Anyone have any thoughts about what they have done?  Please ping me.

Commonizing Meaning

I have some favorite phrases that I’ve been picking up for years.

  • “Eh, voila!” universal for “eh, voila!”
  • “Ah, asodeska” Japanese for “I understand”  (sp?)
  • “Bo ko dien” is Taiwanese for highly unlikely or that’s ridiculous. (sp?)
  • “Oh shiitake” (shitzu is also appropriate), is an imperfectly polite way of saying “oh &#!+”

Basically, these are phrases that I love, but at least the latter two are meaningless to most people I say them to. I could of course go to Japan and most people will know what I’m talking about when I tell them I understand them, but they will then look at me funny when I exclaim in the name of a mushroom in anger.

We face the same problems when we talk about data calculations in HR. The most common of these is the simple headcount calculation. “Simple?” you ask. I mean, how hard can it be to count a bunch of heads that are working in the organization on any particular day, right? The smart data guys out there are scoffing at me at this very moment.

First, we put on the finance hat. Exactly how many heads is a part time person? HR exclaims that is why we have headcount versus FTE. But finance does not really care, and they are going to run a headcount using a fraction either way.
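To make the headcount versus FTE distinction concrete, here is a small sketch.  The employee list and the 40 hour full time standard are assumptions for illustration only; your organization's definition may differ by country or contract type.

```python
# Each tuple: (employee id, scheduled weekly hours). All values are made up.
employees = [("E1", 40), ("E2", 40), ("E3", 20), ("E4", 10)]

FULL_TIME_HOURS = 40  # assumed full time standard

# Headcount: every person counts as one head.
headcount = len(employees)

# FTE: every person counts as a fraction of a full time schedule, capped at 1.
fte = sum(min(hours, FULL_TIME_HOURS) / FULL_TIME_HOURS for _, hours in employees)

print(f"Headcount: {headcount}, FTE: {fte}")
```

Same four people, two different answers, and both are correct as long as everyone knows which one they are looking at.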

Second, we put on our function and division hat. Every division seems to want to run the calc in a different way. And then there are realistic considerations to be made, such as the one country out there that outsources payroll and does not have a field to differentiate a PT versus FT person, or the country that has a mess of contractors on payroll and can’t sort them out.

Then you put on the analytics hat, and realize that when you integrated everything into your hypothetical data warehouse, the definitions for other fields have not been standardized around the organization, and you can’t get good head counts of specific populations like managers, executives, and diversity. I mean, is someone in management a director and above? Or is she just a people manager? How many people does she have to manage to be in management? Are we diverse as an organization simply because we have a headcount that says we are more than 50% people of color even though 2000 of those people are in Japan where the population is so homogeneous that any talk of non Japanese minorities is simply silly?

Then you put on your math hat and some statistician in the organization tells you that you can’t average an average, or some nonsense like that.
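The statistician has a point, by the way.  A quick sketch with invented numbers shows why averaging division averages goes wrong when the divisions are different sizes.  The division names and figures below are made up.

```python
# (division, headcount, average tenure in years) with made-up figures
divisions = [("Sales", 900, 2.0), ("Engineering", 100, 8.0)]

# Naive: average the two division averages, ignoring division size.
naive = sum(avg for _, _, avg in divisions) / len(divisions)

# Weighted: weight each division average by its headcount, which is
# the same as averaging over every individual employee.
total_heads = sum(n for _, n, _ in divisions)
weighted = sum(n * avg for _, n, avg in divisions) / total_heads

print(f"Average of averages: {naive:.1f} years")
print(f"Headcount weighted:  {weighted:.1f} years")
```

With these numbers the average of averages is almost double the weighted figure, because the small division gets the same vote as the large one.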

So the Board of Directors comes to HR and asks what the headcount of the organization is. You tell them that you have 100,000 employees, plus or minus 10%. Yep, that’s going to go over really well.

I’m not saying it’s an easy discussion, but all it really takes is getting everyone into the same room once (OK, maybe over the course of a couple of weeks) to get this figured out. I’ve rarely seen an organization that is so vested in their own headcount method that they can’t see the benefits of a standardized calculation. In fact, most of the segments within are usually clamoring for this and we just have not gotten around to it yet, or we think they are resistant. In the end, it’s really not so hard, and we should just get to it.


How to Read a Newspaper

So, when I get on a plane, I often have a newspaper with me.  Whether you are on a plane, train, or anywhere in close quarters with other people, there is a bit of etiquette involved, and a standard trick that frequent travellers are supposed to know about.  Adherence to this trick is unfortunately minimal, however.  The trick is as follows:  Take the paper as it was delivered, and unfold just the middle crease without opening the paper – you have only page 1 in front of you.  Fold the paper in half lengthwise and backwards; you should be able to see the left half of page 1.  Using this fold down the middle of the paper, you can read the entire paper without ever bothering the people sitting next to you.

When it comes to data, keeping everything in its place and not dispersing data into unwelcome areas is paramount.  HR data is probably the most sensitive data in the organization.  I’m not saying that other data that may contain trade secrets is not equally important, but HR knows stuff about our employees that they really don’t want released.  While openness about jobs and salaries has seemed to increase with the younger generations, there is still a great deal of sensitivity around many issues, and certainly a large amount of data that must be protected from a compliance perspective (such as diversity information and ER claims).  While we have tried to segregate data in such a way that prevents unauthorized access into the database, security and access rights to the systems of record are only the tip of the iceberg when it comes to unraveling the solution to this problem.  Like an email, once a report is generated or an interface is created, the owner of data simply loses control and can’t really ever be sure where that data is going to land.

There really aren’t any good solutions at this time.  You can restrict data so that it does not land in a data warehouse, or prevent integration to other systems, but at some point, there will be a hardcopy report floating on a desk, waiting to be whisked off by the wrong person’s hands.  I’m not really an advocate of putting huge amounts of controls on data.  I think that you appoint a system of record, data owners, access rights, and do your best in a well managed data environment.  I am curious about what others are doing out there to prevent unauthorized or unplanned dissemination of sensitive data other than simple data governance and data management measures.  Is there anything out there that can handle this yet?

HR’s Correlation to Business

When we talk about the impact of HR activities on our business’s operational production, we don’t usually think that there is a direct correlation.  In fact, some of our activities probably do have a relatively high correlation with business outcomes that we might be surprised about.  In defining correlation, we usually think about it on a –1 to +1 scale, with –1 being perfectly negatively correlated, 0 being no correlation, and +1 being perfectly positively correlated.  From an HR point of view, if we were able to show that there is a positive correlation from our activities to the business outcomes, that would be a pretty big win.
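For reference, the –1 to +1 scale is most commonly the Pearson correlation coefficient.  I’m assuming that is the measure intended here; other measures such as rank correlations use the same scale.  For n paired observations, with x as some HR activity measure and y as a business outcome (both placeholders), it is:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Here \bar{x} and \bar{y} are the averages of the two series.  When x and y tend to sit above or below their averages together, r moves toward +1; when one tends to be high while the other is low, r moves toward –1.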

Personally, I don’t have any metrics since I don’t work in your organizations with your data.  However, with modern business intelligence tools and statistical analysis, it’s certainly possible to discover how our HR activities are impacting business outcomes on a day to day basis.

Take a couple of examples.  We know that things like high employee engagement lead to increased productivity, but we don’t always have great metrics around it.  Sure, we can go to some industry survey that points to a #% increase for every point that the engagement surveys go up, but that is an industry survey, not our own numbers.  Especially in larger organizations, we should be able to continue this analysis and localize it to our own companies.  Similarly, we should be able to link succession planning efforts to actual mobility to actual results.  Hopefully we’d be showing that our efforts in promoting executives internally are resulting in better business leadership, but if we showed a negative correlation here, that means that our development activities are lagging the marketplace and we might be better served getting execs from the external market while we redefine our executive development programs.

I’ll take a more concrete example.  Let’s say we’re trying to measure manager productivity.  We might simplify an equation that looks something like this:

Manager Unit Productivity = High Talent Development Activity / (Low Recruiting Activity + Low Administrative Burden)

If this is true, we should be able to show a correlation between the amount of time a manager spends on development activities with her employees to increased productivity over time.  Also expressed in the equation, recruiting activity should also be negatively correlated to the manager’s team performance.  If the manager is spending less time recruiting, that means she is keeping employees longer, and spending more time developing those employees – therefore any time spent recruiting is bad for productivity.
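Here is a rough sketch of how you might check those two relationships.  Every number below is invented, one entry per manager, and the `pearson` helper is written out by hand so nothing beyond the standard library is assumed.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# One entry per manager; all figures are made up for illustration.
development_hours = [2, 4, 6, 8, 10]   # monthly hours spent developing the team
recruiting_hours = [12, 9, 8, 5, 3]    # monthly hours spent recruiting
team_output = [50, 58, 61, 70, 76]     # some unit productivity measure

print(f"Development vs output: {pearson(development_hours, team_output):+.2f}")
print(f"Recruiting vs output:  {pearson(recruiting_hours, team_output):+.2f}")
```

If the equation holds in your organization, the first number should come out positive and the second negative.  A strong correlation by itself does not tell you which way the cause runs, so treat it as the start of the investigation.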

I’m not saying that any of these things are the right measures or the right equations.  What I am saying is that we now have the tools to prove our impact on business outcomes, and we should not be wasting these analytical resources on the same old metrics and the newfangled dashboards.  Instead, we should be investing in real business intelligence, proving our case and our value, and understanding what we can do better.

Understanding HR Data Governance

We bought my wife a new bike this weekend.  She is not usually a bike rider, and the last time I took her out riding with me, she swore never to go riding with me ever again.  I suppose that one does not necessarily realize this if one does not have sufficient self awareness or awareness of the surroundings when in college, but it’s been about 15 years since my last bike ride with her.  This time around, rather than trying to take her out on (what I guess were) fairly advanced mountain biking trails, we decided that we would buy her a bike for cruising around town – a bike to have fun on, but certainly not to go fast on.  Today was my first bike ride with her, and we decided to head out to San Francisco’s new Chinatown over on Clement Street to buy some dim sum.  We then proceeded to Golden Gate Park where we sat by Stow Lake eating.

I know that we’re all sick of talking about Data Governance, but the reality is that most of us still have not implemented it.  In fact, I’m going to go as far as to suggest that most of us don’t really know what it is, even though we execute forms of data governance in our everyday lives.

Data Governance basically consists of three things:

  1. How we make decisions about data,
  2. How we define the data,
  3. How we execute processes that involve data.

To start, I really needed to make a decision about riding a bike around the town with my wife.  While I might enjoy the fast and aggressive weekend rides, spending time with my wife in a way that we both would enjoy was paramount.  The next step was to define the type of bike we’d get her – in this case, not a race worthy mountain or road bike, but a street type of a bike.  Finally, we needed to head out and ride at a pace and in a style that would keep her interested, in this case, food and the park.

The HR data governance structure begins with an organizational structure that allows us to make good decisions.  Usually this is a committee or set of groups that escalates what the needs are and how to deal with them. 

We then need to define the data.  It might be naive to think that we can easily define data as we reach across multiple HR systems in a variety of countries and regions.  Each system has its own definitions, and countries have widely varying approaches to data as well.  Without a common understanding, it’s next to impossible to produce a set of data outputs and outcomes that is reliable.
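One way to picture a common definition is a small lookup that translates each system’s local codes into the agreed ones.  This is only a sketch: the system names, the local codes, and the two canonical values are all invented for illustration.

```python
# Canonical codes agreed by the governance group (hypothetical values)
CANONICAL = {"FT": "Full time", "PT": "Part time"}

# Local codes per source system mapped to canonical codes (all names invented)
LOCAL_TO_CANONICAL = {
    "us_hris": {"F": "FT", "P": "PT"},
    "de_payroll": {"Vollzeit": "FT", "Teilzeit": "PT"},
}


def normalize(system, local_code):
    """Translate a local code to the canonical code, or fail loudly."""
    try:
        return LOCAL_TO_CANONICAL[system][local_code]
    except KeyError:
        raise ValueError(
            f"No agreed definition for {local_code!r} in {system!r}"
        ) from None


print(normalize("de_payroll", "Teilzeit"))             # canonical code
print(CANONICAL[normalize("us_hris", "F")])            # readable label
```

Raising an error on an unmapped code is a deliberate choice in this sketch.  A value nobody has defined should go back to the governance group instead of slipping quietly into a report.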

Lastly, we need to reformulate all of our data processes in a way that is consistent with our data definitions and maintains our quality standards.  Data governance and the definitional process are a prerequisite for any HR process, but without HR process, the purpose of data governance (data quality and access controls) is a promise that cannot be kept.

At the end of the day, HR data is an enabler, and we have all experienced HR data that is so messy that it no longer enables anything.  Data Governance is the solution to this problem, but it comes with multiple components, each of which must be implemented for an overall governance program to have any use.

Everything in its Place

I write this in the usual place – the airplane.  I’m in a window seat, so I’m only surrounded on three sides.  The guy in front of me has decided that all the stuff he does not want should not go in front of him, but that “out of sight, out of mind” is a good solution as he shoves the stuff under his seat at my legs.  I was about to start throwing stuff back at him over his head, but thought better of it.  The guy behind me decided the armrest was actually his footrest.  When his shoe was on my arm, he didn’t even have the sense to pull back a bit.  I had to push his foot off the thing.  The guy next to me is a good guy – he just has wide shoulders.  I can’t really blame anyone for that except his parents.  Worst thing yet, someone on this plane is flatulent.  So not only is my space being intruded upon on all three sides, my very airspace is also becoming a bit offensive.  (yes, I am the guy on the plane that will shout “No Farting!!” to everyone)

I was recently talking to someone about how Spain and Mexico use 2 last names for employees.  I don’t know if you watch tennis, but the only example I can think of is Arantxa Sanchez-Vicario (I think that’s how you spell it).  Apparently PeopleSoft only has a 30 character last name field.  I’m not sure why they have not expanded this yet, but honestly, 30 should be enough, especially if you bought the right country packs.  Anyway, the thinking was that they would put the mother’s last name in the middle name field, and the father’s last name in the last name field.

Fundamentally, I don’t usually have a problem with workarounds.  But as you know, I’m a data governance guy and I have a huge problem dumping a last name somewhere other than the last name field and taking the middle name field and dropping in a name that is not the middle name.  Beyond the basic issue of using fields for the wrong purpose, there are practical issues around interfaces and analytics.  How do you search if you don’t have a single last name field?  If you have an interface, do you write it with specific instructions to look for Spaniards and Mexicans and re-arrange the names?
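For contrast, here is a sketch of what keeping everything in its place could look like: both surnames get their own fields, and one derived last name is what searches and interfaces use.  The field names are my own invention and are not how PeopleSoft or any other product actually models names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonName:
    given_name: str
    middle_name: Optional[str]       # stays a real middle name, or empty
    paternal_surname: str
    maternal_surname: Optional[str] = None

    @property
    def last_name(self) -> str:
        """Single searchable last name built from the surname fields."""
        parts = [self.paternal_surname, self.maternal_surname]
        return " ".join(p for p in parts if p)


def find_by_last_name(people, query):
    """Case-insensitive match against the full derived last name."""
    q = query.casefold()
    return [p for p in people if q in p.last_name.casefold()]


people = [
    PersonName("Arantxa", None, "Sanchez", "Vicario"),
    PersonName("John", "Q", "Smith"),
]

print([p.given_name for p in find_by_last_name(people, "vicario")])
```

The interface question mostly goes away in this arrangement.  Every downstream system reads the same derived last name, and nobody has to write special rules for particular countries.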

There is a point to order and a point to “everything in its place.”  Especially in terms of systems and data governance, order is of utmost importance.  You start playing with order, and you wind up with what we call (in technical terms) “crappy data on a massive scale.”  You see, you mess with the wrong workarounds now, and there’s a pretty high probability that you’re going to pay for it later.  Later might be a couple of years.  Before you do the wrong workaround, do the right research, figure out the implications, and then do it the right way anyway.

Style and Strategy: Nadal versus Federer

I love tennis.  I prefer watching it to playing, as I’m not really very good and I seem to have a bum shoulder.  Each year, my wife and I record the coverage of every grand slam, and to be honest, we have missed the once constant presence of Rafael Nadal in all the finals.  Watching Nadal play was always a demonstration of power, command and grace.  The man can move around a tennis court like almost nobody else, and though he is only in his early 20’s I think he has 7 or so grand slams in his trophy case.

I also love Roger Federer.  Comparing his gracefulness against Nadal’s is difficult.  Certainly Roger has a different type of grace.  His is flowing and nuanced.  He commands his opponents not because he can always overpower them or even outplay them, but because he can always outsmart them.  You will sometimes (not often) see him losing the first set of the match, only to “figure out” his opponent in the next set.  The key here is style.  They are both winners, but Federer’s graceful fluidity has gotten him a record number of grand slam wins.  On the other hand, Nadal’s speed and hard hitting style have gotten him fewer wins (although he’s quite young), and he is now riddled with constant injuries.

I’m going to liken this to long term HR technology planning.  I’m constantly surprised how few organizations have a long term technology roadmap and have actually stuck to it.  Most, I find, are constantly redeveloping their technology suites based not on their strategic needs but on the technology du jour.  Many organizations get caught implementing too many things all at the same time, and tax their staffs unnecessarily.  They bring in new functionality that was not planned well, and integration suffers.

While short term implementation cycles might be executable, the long term data strategy is often what is really at risk.  When implementing off of the roadmap and going for the short term gain, the ability to make high quality connections between systems, the ability to seamlessly bring the new functionality into the employee experience, and the ability to govern high quality data all become more difficult.  An organization’s ability to plan and stick to the plan is usually also indicative of its ability to execute a single technology within the broader HR ecosystem.

When integrations and data governance fail, the effects are often not visible for months or even years down the road.  You can hammer something into place in your HR ecosystem, but the positive impacts of the new technology are often short lived.

Callout to Karen Beaman – who I still have not played tennis with yet.

Business Intelligence and Data Dispersion

Data Encryption with business intelligence and reports has always been a problem.  Users are constantly requesting reports, and once data is in someone’s hands, it’s almost impossible to control data dissemination and what I’ll call data diaspora.  One must admit, especially in large organizations, that trying to put controls from a procedural perspective is not particularly realistic.  With hundreds or thousands of managers out there, controlling the actions of each person is particularly difficult.

In the good old days of ad hoc report files, Excel spreadsheets, and PowerPoints, any person who got their hands on data could easily forward it to someone else.  The fact is that technology was sufficiently difficult to use that most organizations, even the very large ones, have used Excel as the easiest way to aggregate and analyze data from multiple sources.  Even for single source reports, Excel has long been the easiest way to communicate a data set.  Managers didn’t really have robust capabilities to tap into reports on their own, and even then, one of the selling points from software vendors has been the ability to export data into Excel where managers or practitioners could continue analysis.

HR technologists have been talking about dashboards and business intelligence for years, but it does seem that the lately emergent technologies are finding some adoption in larger organizations.  Perhaps this is just maturity of the technology, perhaps the prices have started coming down from the fully customized ERP BI software to more vanilla and off the shelf analytics tools, or perhaps spending was just down so far in the last 2 years that nobody was buying the stuff.  Whatever the reasons, the technology and the market seem to be ready now.

Certainly, increased controls are now much more prevalent with each manager going to their own dashboard to view data, and with the large number of analytics available in the HR and talent realms, ad hoc requests are hopefully going down.  All this just means that if you can deliver a set of analytics to the manager desktop as opposed to frequent ad hoc requests, your data is controlled by the application security layer upon delivery.  Since you have never sent an email with an Excel spreadsheet, there is no data to be forwarded.

You’ll argue with me that this technology has been around for years upon years – at least a decade.  I’ll absolutely agree that this is true, but I’m pretty sure that every single vendor out there (whether publicly or not) will agree with me that until recently the delivered reports were not sufficiently robust or comprehensive.  ERP vendors are now also delivering robust prebuilt analytics with sufficient drill downs and drill throughs.  The goal of the whole thing is to have enough data presented in a simple but detailed enough manner to eliminate most ad hoc needs.  If you can create an environment that does this, you utilize your application’s security as opposed to releasing your data to the winds of fate.
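To show what “utilize your application’s security” means in practice, here is a bare bones sketch of a dashboard query that only ever returns the rows and fields a viewer is entitled to.  The row layout, the manager ids, and the `can_see_salary` flag are assumptions for illustration; real BI tools implement this through their own role and row level security settings.

```python
# Invented sample data: one row per employee
ROWS = [
    {"employee": "E1", "manager": "M1", "salary": 90000},
    {"employee": "E2", "manager": "M1", "salary": 85000},
    {"employee": "E3", "manager": "M2", "salary": 99000},
]


def dashboard_rows(rows, viewer_id, can_see_salary):
    """Return only the viewer's direct reports, masking salary if not permitted."""
    visible = []
    for row in rows:
        if row["manager"] != viewer_id:
            continue                  # row level rule: own team only
        shown = dict(row)             # copy so the source data is untouched
        if not can_see_salary:
            shown["salary"] = None    # field level rule: mask the sensitive field
        visible.append(shown)
    return visible


for row in dashboard_rows(ROWS, viewer_id="M1", can_see_salary=False):
    print(row)
```

The point of the sketch is that the filtering happens every time the data is viewed.  Once the same rows are exported to a spreadsheet, none of these rules travel with them.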

System of Record: Everything in its Place

I’m sitting on a plane (delayed of course for 4 hours) thinking about the people around me.  I have the fabulous exit row seat on the A319 where there is no seat in front of me.  The guy in the middle next to me is great.  He’s not a talker, he’s slim, does not intrude on my space at all, and basically minds his own business and his own space.  (My policy on planes is that the guy in the middle seat gets both armrests unless s/he happens to be rude, in which case any “nice” policies go out the window.)  The guy sort of in front of me has decided that since there is no seat in front of me, that he will use my foot space as his trash bin.  He’s basically been dropping his garbage literally on top of my feet for the last couple hours.  I basically kick it back at him at which point he turns around and gives me a nasty glare.  I don’t know why, but I really like order.  Things should go in their appropriate place.  When things go elsewhere where they don’t belong, problems seem to start.

For some reason, this has me thinking about systems of record and why this is such a hard thing to implement well.  There seem to be lots of battles around system of record.  Should your employee address reside in your HR or payroll system?  Assuming they are actually different systems, some people will argue that all core employee indicative data resides in the HR system as the primary and gets interfaced to payroll and everywhere else.  In general however, if the address is not current in the HR system, the ramifications are relatively minor.  In the payroll system, local taxes can go awry, garnishments are not paid or are calculated incorrectly, and year end tax statements go to the wrong place.  Then there is the never ending argument that comes from Payroll departments.  HR just does not care as much about these things.  Let’s say things are still entered manually (god forbid).  HR departments might sit on an address change for a while, but Payroll departments are all over it.

I also think about competencies.  Do competencies belong with job data in core HR?  Or do they sit better with all the talent stuff in a talent system?  Wait, wait, you have multiple talent systems?  Which talent system?  Are the competencies designated with the job analysis?  And do we care where the competencies are designed if they are only utilized at the talent process level?

It really comes down to data governance (do we hate data governance yet?  We should, but we don’t because not enough of us are doing it well yet).  I was recently speaking to an organization that decided that the global employee addresses were owned by the legal department in the organization.  They decided it was not HR or Payroll simply because there were enough compliance issues from global safe harbors to payroll compliance and data privacy that it could only be owned by legal.  In turn, it would then be legal’s right to decide where the system of record would be.  When it comes down to competencies, who owns this thing?  Is it compensation?  More often than not it’s talent, but this is indeed one of those data elements that get defined in such a cross functional way that it’s hard to navigate the waters.
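Whatever the answer turns out to be, it helps to write it down somewhere a system can read it.  Below is a minimal sketch of a registry that records the owner and system of record for each data element and uses it to settle conflicts.  The specific assignments (payroll for addresses, a talent system for competencies) are placeholders I picked for illustration, not a recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    name: str
    owner: str              # function that decides on the element
    system_of_record: str   # system whose value wins on conflict


# Hypothetical assignments; your governance group decides the real ones.
REGISTRY = {
    e.name: e
    for e in [
        DataElement("employee_address", owner="Legal", system_of_record="payroll"),
        DataElement("competencies", owner="Talent", system_of_record="talent"),
    ]
}


def resolve(element, values_by_system):
    """Pick the value held by the element's system of record."""
    sor = REGISTRY[element].system_of_record
    return values_by_system[sor]


address = resolve(
    "employee_address",
    {"hr": "1 Old Street", "payroll": "2 New Avenue"},
)
print(address)
```

The code is trivial on purpose.  The hard part is the conversation that fills in the registry, and once that is done, every interface can apply the same answer.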

The hope is that with the continued evolution of real time APIs and middleware, integration of data elements keeps getting easier and the conflicts that arise due to systems of record ease.