How Experian is turning big data into big dollars
Source: Mike Freeman
At Experian DataLabs, a team of scientists is thwarting bad guys with math.
A top-five U.S. credit card issuer recently dumped about 6 billion transaction records on the San Diego outfit to see if its fancy machine learning mathematical formulas could do a better job of rooting out credit card fraud than the bank's existing system.
Experian scientists used neuro-embedding and natural language processing techniques to understand the "syntax" of the credit card data, said Honghao Shan, a Ph.D. computer scientist.
Transactions with odd syntax - words out of place if you will - popped up as potential fraud.
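The "words out of place" idea can be illustrated with a toy model. The sketch below is not Experian's method: the article describes neural embeddings, and this swaps in a much simpler bigram count model purely for illustration. The merchant-category tokens, function names and sample data are all hypothetical.

```python
import math
from collections import Counter

def train_bigrams(sequences):
    """Count context tokens and token pairs over historical sequences."""
    contexts, pairs = Counter(), Counter()
    vocab = set()
    for seq in sequences:
        padded = ["<s>"] + list(seq)  # "<s>" marks the start of a sequence
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            contexts[prev] += 1
            pairs[(prev, cur)] += 1
    return contexts, pairs, len(vocab)

def surprise(seq, model):
    """Average negative log-probability per token, with add-one smoothing.

    Higher values mean the sequence looks less like the training data.
    """
    contexts, pairs, vocab_size = model
    padded = ["<s>"] + list(seq)
    steps = list(zip(padded, padded[1:]))
    if not steps:
        return 0.0
    total = 0.0
    for prev, cur in steps:
        p = (pairs[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        total -= math.log(p)
    return total / len(steps)

# Hypothetical history: each list is one cardholder's ordered purchases.
history = [["grocery", "gas", "coffee"]] * 50 + [["coffee", "grocery", "gas"]] * 50
model = train_bigrams(history)

usual = surprise(["grocery", "gas", "coffee"], model)
odd = surprise(["gas", "jewelry", "wire_transfer"], model)
```

In this toy setup the unfamiliar sequence scores a higher surprise than the familiar one, which is the sense in which an out-of-place transaction would stand out for review.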
"We thought we had figured it out and went back to them," said Eric Haller, head of Experian DataLabs. "They said, how did you do that? You identified fraud that we can't identify ourselves. And it turns out we reduced their false positives by half."
Making sense of the tsunami of data from connected devices and other sources is a fast-growing technology field. Predictive analytics - the mathematical formulas that find patterns and draw conclusions from the data - is being applied in fields ranging from smart cities to medicine to cybersecurity.
"We are going now from the monitoring era to really anticipating what is about to happen at that moment, or what will happen in the next hour or two hours so we can plan for it," said Ilkay Altintas, the Ph.D. chief data science officer at the San Diego Supercomputer Center.
San Diego has a small cluster of data analytics firms - particularly in the sub-specialty of searching for credit card fraud.
ID Analytics, FICO, Opera Solutions, Global Analytics and Experian DataLabs are among the area companies applying math to big databases to uncover useful information.
Many of these San Diego analytics firms have their roots in HNC Software, which was purchased by FICO in 2002 for $810 million. Several HNC executives and scientists left after the FICO acquisition to start their own firms.
"There is this network from HNC that is still pervasive," said Haller, who formerly worked as chief marketing officer at HNC. "It is a very tight community. Almost everybody keeps in contact with each other."
Experian, a 17,000-employee global business services firm, is best known as one of the big three consumer credit bureaus. Haller was managing products for its credit services group about six years ago when he pitched the idea of creating an internal research lab focused on analytics.
"We needed to invest in R & D," said Haller. "It's all about the people. I knew the people. I worked with them all down in San Diego."
Experian gave Haller's DataLabs experiment three years. Though it started with a small team, it got off to a hot start. A major credit card firm hired the lab to build an analytics-based cross-sell marketing platform.
That led to other projects. The company built Extended View, which ranked the risk of lending to people with little or no credit history by looking at data ranging from gym memberships to rental history to magazine subscriptions.
It developed an analytics "sand box" where large banks, retailers, mobile network operators and others merge their data - with the identity information removed - with Experian's massive data sets.
At the three-year mark, Experian DataLabs was in the black, said Haller. The company opened additional labs in Brazil and London. Revenue is now in eight figures.
But that still is small for Experian, which generates $5 billion in annual revenue.
"The company said OK, can you turn the corner and make this a needle mover?" said Haller. "We want to invest more. What will happen if we double it or quadruple it? That is what we are wrestling through right now."
Experian DataLabs employs 40 data scientists and engineers in San Diego, many of them with Ph.D.s. It recently moved to a larger office, but Haller said it's too early to say how many more workers it may hire.
"The caliber and talent we have pulled in, we think we can go head to head with anybody," said Haller. "That is what the company expects from us - push the envelope around analytics, push the envelope around data."
Sophie Liu, a Ph.D. physicist and computational neuroscientist, did her post-doctorate study at the Salk Institute on how retina cells process information. She moved to FICO before joining Experian DataLabs.
Today, she is working to map mobile phone location data with consumer segmentation data - which lumps consumers into 20 categories such as Healthy Lifestyle, Bar Goers, On the Road, Families and so on.
"If a chain store is trying to find a location, they can learn from our study and try to find the best location that matches their customers and demographics," said Liu.
Mobile is getting a lot of attention these days. Nobody has figured out how to offer instant credit on mobile devices, said Haller.
Experian DataLabs is working on the problem now, particularly convenient identity confirmation "so I don't have to ask you 20 questions to confirm your authenticity," said Haller.
Experian has not been immune to the rash of recent data breaches. In September 2015, the company revealed that hackers had broken into an Experian computer server that contained personal information for about 15 million T-Mobile customers.
The server, however, was part of a different business unit than Experian DataLabs, said company spokesman Michael Troncale.
"The Lab itself never had access to any of that kind of data," said Troncale. "That is not in their purview."
The server contained names, addresses and encrypted Social Security numbers of T-Mobile customers who had their creditworthiness checked prior to signing up for wireless subscriptions or device financing. Experian and T-Mobile provided free credit monitoring to affected customers.
While there is more data available than ever, finding ways to make money off it remains a challenge. Most business models focus on better marketing, controlling risk and improving day-to-day operations.
But there are new areas where data analytics is being applied. San Diego's CyberFlow Analytics, which was acquired last month by Webroot, taps predictive algorithms for cybersecurity.
CyberFlow uses machine learning to determine what normal traffic looks like inside a corporate network. It then flags anomalies as potential cyber threats.
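The article does not describe CyberFlow's actual models. As a rough illustration of the "learn normal, flag the rest" idea, the sketch below uses a plain z-score test as a statistical stand-in for machine learning. The traffic figures, function names and threshold are assumptions made for the example.

```python
from statistics import mean, stdev

def fit_baseline(samples):
    """Summarize 'normal' traffic by its mean and sample standard deviation."""
    return mean(samples), stdev(samples)

def flag_anomalies(observations, baseline, threshold=3.0):
    """Return observations more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # No variation in the baseline: anything different is unusual.
        return [x for x in observations if x != mu]
    return [x for x in observations if abs(x - mu) / sigma > threshold]

# Hypothetical connections-per-minute readings from a quiet period.
normal_traffic = [100, 102, 98, 101, 99, 103, 97, 100]
baseline = fit_baseline(normal_traffic)

flagged = flag_anomalies([101, 250, 99], baseline)
```

A fixed baseline like this one is exactly what an adversary can learn to stay under, which is why production systems would need to keep retraining on fresh traffic.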
"It is adversarial analytics," said Tom Caldwell, co-founder of CyberFlow. "You have these bad guys - even in fraud - who once they think you are catching them they change their tactics. That is the game in adversarial analytics."
According to industry research firm Gartner, 25 percent of large global companies will use big data analytics for security or fraud detection applications this year, up from 8 percent a couple of years ago.
New data sources also have fueled demand for analytics, said Bruce Hansen, a former top executive at both HNC and ID Analytics.
"There is all that social data that didn't exist before. There is all this mobile cellphone data that didn't exist before," he said. "And now because of computing power, everything is just getting faster and faster."
Altintas, the Supercomputer Center data scientist, said biometrics and smart cities are areas where data sets are becoming large enough to become useful.
In Southern California's back country, for example, data from more than 170 connected weather stations and cameras is combined with other information to create a Fire Potential Index for specific areas over the upcoming seven days.
San Diego Gas & Electric owns the weather stations, which are connected to a high performance wireless network.
"In the old days, we would have information on fire weather the day before," said Altintas. "It's not enough time for a fire department to plan for staffing for the situation. Now, with what they are getting from SDG&E's fire weather index, they are able to staff" for the fire danger.