AI Experts

SoFi’s data science head: Opening the funnel to non-traditional borrowers with machine learning

Yan Wu of SoFi proposes that lenders can expand their business, thanks to the use of artificial intelligence and machine learning.

Founded just seven years ago by four Stanford graduate students, Social Finance (more commonly known as SoFi) is an online personal finance company, with lending, wealth management and deposit account products. The company has enjoyed impressive growth through an online-only approach, with over $30 billion in funded loans and over 500,000 members on its platform.

From his vantage point within this digitally empowered finance company, Yan Wu (SoFi’s Head of Analytics and Data Science) shared his insights with Thomson Reuters on a variety of topics. He touched on how machine learning can assist in lending decisions, the impact of the cloud on data collaboration and AI development, and how AI can help deliver personalized service to wealth management clients.


ANSWERS: What are some of the issues in machine learning that you are working to solve for right now?

WU: I think where machine learning plays the biggest role is in datasets that have extremely high numbers of dimensions, very low signal ratios and very sparsely populated values. For example, people in lending use data from the bureau. There are millions and millions of rows, there are thousands and thousands of columns. Each specific field has very little to no signal and each person has very few things that are actually populated. Those are opportunities where machine learning, particularly deep learning, has an extremely high potential.

ANSWERS: How can machine learning assist with lending decisions, and how does one keep bias from creeping into that?

WU: When making a decision on creditworthiness, machine learning can help lenders look at metrics beyond FICO and income. Whether it’s adding more information to traditional metrics or determining the creditworthiness of applicants without a full credit history, machine learning can drive tighter risk management while assessing borrowers’ creditworthiness where traditional models cannot.

With regard to bias, we must remember that the real world is a biased place, so the fact that models pick up on these biases should not be surprising. Whether it’s latent bias, selection bias, or confirmation bias, algorithms will often magnify these biases. Therefore, rather than seeing models as a threat, we should recognize this opportunity to use models to detect and to correct biases.
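One simple way to picture "using models to detect bias" is to compare a model's approval rates across groups. The Python sketch below is illustrative only and is not SoFi's method: the group labels, the decision data, and the 0.8 threshold are all assumptions made up for this example.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return the approval rate per group from (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += int(ok)
    return {g: approved[g] / totals[g] for g in totals}


def impact_ratio(rates):
    """Lowest group approval rate divided by the highest."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Hypothetical model decisions: (group label, approved?)
decisions = (
    [("A", True)] * 80 + [("A", False)] * 20
    + [("B", True)] * 50 + [("B", False)] * 50
)

rates = approval_rates(decisions)
ratio = impact_ratio(rates)
# Illustrative threshold (an assumption): flag the model for review
# when one group's approval rate falls well below another's.
flagged = ratio < 0.8
print(rates, ratio, flagged)
```

A flag like this only signals that the gap deserves a closer look. Deciding whether the gap reflects bias, and how to correct it, still requires understanding how the data was produced.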

ANSWERS: Do you think that biases uncover themselves once you get into pulling the datasets? Meaning, developers see the trends playing out and then that helps them realize that there might be a problem, that they need to go back and reconfigure the factors that they’re focusing in on?

WU: I think that if it’s too good to be true, there’s probably a problem. With any dataset that you’re working with, you have to understand how it was curated, how it was collected, how it was produced, and what type of segmenting occurred before you got your hands on it. That will tell you a lot about the general construct of what’s in that dataset, and it’ll help you figure out the likelihood of overfitting that dataset for the specific needs that you have. Overfitting, a lot of the time, comes from systemic problems within the data. I would just develop an incredibly focused view on how that data was produced.
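The "too good to be true" warning can be made concrete by comparing accuracy on training data with accuracy on held-out data. The Python sketch below is a toy under stated assumptions: the data is synthetic noise and the "model" is a deliberately overfit lookup table, neither of which comes from the interview.

```python
import random


def accuracy(model, rows):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(x) == y for x, y in rows) / len(rows)


def fit_memorizer(train_rows, default=0):
    """Deliberately overfit 'model': it memorizes every training row."""
    table = {x: y for x, y in train_rows}
    return lambda x: table.get(x, default)


rng = random.Random(0)
# Synthetic data (an assumption): unique ids as features, labels are noise,
# so there is no real signal for any model to learn.
rows = [(i, rng.randint(0, 1)) for i in range(1000)]
train, holdout = rows[:800], rows[800:]

model = fit_memorizer(train)
train_acc = accuracy(model, train)
holdout_acc = accuracy(model, holdout)
gap = train_acc - holdout_acc
print(f"train={train_acc:.2f} holdout={holdout_acc:.2f} gap={gap:.2f}")
```

A perfect training score next to a much weaker holdout score is the pattern to be suspicious of. In real work the same comparison would be run on a holdout set collected the same way as the data the model will see in production.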

ANSWERS: How do you see cloud technology complementing the development of AI, and what benefits can be derived?

WU: With any R&D effort, the more iterative testing, the better. The most important thing that cloud computing has done is make incredibly high-powered machines available for testing. What happens is cycle times become shorter and iterations become quicker. If you look at innovation as a function of the number of iterations, it just drives innovation time down incredibly. Therefore, the amount of innovation that’s happening due to cloud computing has been tremendous.

ANSWERS: Does the cloud enable more data collaborations, more data sharing partnerships? Do you see that it’s allowing you as an AI developer access to more data?

WU: With the cloud, it’s definitely much easier to share data and collaborate. When I started back in 2005-2006 in quantitative finance, we would get a DVD full of data from a vendor on a monthly basis. I would then take that to a database administrator to load it into the database internally. The whole process took a week between delivery and load.

What we’re able to do now is to have a shared cloud account whereby someone populates the table when the data is ready, and collaborators can access it immediately. In terms of development costs, hardware costs, and the sheer number of people needed to make it successful, it’s much more efficient.

Taking the example of quantum physics in academia, what you have is a very geographically disparate group of people from around the globe being able to iterate on ideas, collaborate on the process and share work in real time. Cloud technology enables them to collaborate with anybody, anywhere, with a real-time ability to share and without friction.

ANSWERS: In what ways is artificial intelligence development ahead of its projected curve and in what ways do you think it’s behind?

WU: That’s a tough question to answer. If you think about constrained, repeatable and scalable situations, it is tremendously ahead. If I said to you today that I’m going to take out my phone, push a button and five minutes later a stranger is going to show up with a car and I’m going to get in and without saying another word or doing another action I’m going to show up where I’m supposed to be, you wouldn’t be surprised. However, if I told you that ten years ago, you would not believe me. That’s where it’s probably miles ahead of where we expected.

Where we’re behind is in things that replicate the human brain. The more that we try to do things that require interpretive ability, the more we realize just how advanced our brains are. That’s probably how I would differentiate where we’re ahead and where we’re behind.

ANSWERS: What key business areas do you see artificial intelligence improving revenue performance for organizations in the near future?

WU: I believe it is through personalization of the customer experience. The better we’re able to tailor an experience to a particular person, based on what data we have on them and what we’re able to infer from that data, the more personable the experience we give people. I think that’s where you’re going to get all of the tails in the distribution to convert with your product. That’s where you’re going to see a non-linear growth in terms of your addressable populations and, therefore, your revenue. Almost every single company can benefit from personalization using AI.


Learn more

In our new series, AI Experts, we interview thought leaders from a variety of disciplines — including technology executives, academics, robotics experts and policymakers — on what we might expect as the days race forward towards our AI tomorrow.
