Mission


Data Lakes, Blockchain and Employee Experience with AWS Executive Rahul Pathak


Rahul Pathak’s experience in the tech industry dates back to the early days of the internet boom, and he has been riding the wave and monitoring trends ever since. That journey has kept Rahul on the cutting edge of new technology, including his current work with big data and blockchain. Rahul currently serves as the GM of Big Data, Data Lakes, and Blockchain at Amazon Web Services, and on this episode of IT Visionaries he discusses why those technologies are the future.

Best Advice: “Ask a lot of questions. And ask ‘why’ a lot.”

Key Takeaways:

  • What the blockchain offers 
  • Who are the stakeholders utilizing these technologies
  • A crash course in data lakes

Joining AWS 

Rahul joined AWS in 2011 and has worked on all things data-related ever since. He was curious and passionate about giving customers access to large-scale databases and computing in ways that had never been available before. Eventually blockchain started making the news, and Rahul saw an opportunity to work with the technology.

Why you should be excited about blockchain

There has been a lot of hype about blockchain, but Rahul was wary of jumping the gun. Rather than rush out a service to be part of the trend, Rahul and his team watched the space and spoke to customers about how they were using blockchain, or how they could use it to increase efficiency or productivity. Only once they had enough evidence of a real opportunity did they set about building a product on the technology.

Customers had two common use cases. The first was customers who wanted a ledger owned by a single entity, with a record of changes that could be tracked and could not be tampered with, but who did not need any distributed trust. Rahul gave the example of a DMV that wants to maintain a secure audit log of vehicle ownership and registration history. The second was customers who had a connected network of partners and wanted to independently audit the transactions taking place between them, without any one party controlling the record. Enterprise blockchain frameworks let participants agree on how transactions are validated and then record them on a ledger distributed to all of them.

In the past, you would have had to use databases and build intricate scaffolding around them to keep them secure and functioning, and even then they had weaknesses. Blockchain was able to address those weaknesses.

“We were just paying attention to what was happening in the industry around blockchain. And we really wanted to learn from our customers about what problems that really solved for them rather than just get involved because of all the hype and, as we learned more about what customers need, we realized that there was a real opportunity for some use cases that were uniquely served by blockchain.”

“Imagine something like a DMV that’s trying to track vehicle ownership and registration history. And they wanted the ability to track that in a way that couldn’t be modified or tampered with. And so that was a case where they needed a verifiable, tamper-proof record of what had happened, but they didn’t need any distributed trust. So that was really about centralized use of immutable records.” 

“The second use case was where we found the customers actually did have a connected network of partners and they were, they wanted to be able to independently verify and audit what had taken place in terms of the transactions between them. So in this scenario where you’ve got multiple parties participating in, you didn’t really want any one person to control the record of what had happened. That was really the place where we saw these enterprise blockchain frameworks playing a role. So those allow customers to agree upon how they decide that transactions are valid and then record them in a ledger that’s distributed to all of the participants.”
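The first use case above, a centrally owned record whose history cannot be quietly altered, can be illustrated with a hash chain. The following is a minimal sketch, not a description of how any AWS service is implemented: the `Ledger` class, the `entry_hash` helper, and the sample vehicle records are all hypothetical names chosen for illustration. Each entry stores the hash of the entry before it, so editing an old record breaks every hash that follows.

```python
import hashlib
import json


def entry_hash(record: dict, prev_hash: str) -> str:
    """Hash a record together with the hash of the previous entry."""
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class Ledger:
    """Hypothetical append-only ledger owned by a single party."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    def append(self, record: dict) -> str:
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        digest = entry_hash(record, prev)
        self.entries.append({"record": record, "prev": prev, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute every hash; any edited record breaks the chain."""
        prev = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev:
                return False
            if entry["hash"] != entry_hash(entry["record"], prev):
                return False
            prev = entry["hash"]
        return True


# Illustrative data only: a made-up vehicle changing hands.
ledger = Ledger()
ledger.append({"vin": "VIN-EXAMPLE-1", "owner": "Alice", "event": "registered"})
ledger.append({"vin": "VIN-EXAMPLE-1", "owner": "Bob", "event": "transferred"})
```

Calling `ledger.verify()` recomputes the chain from the start. A chain like this only makes tampering detectable: someone able to rewrite every entry could recompute all the hashes, which is why practical systems also keep the latest digest somewhere the ledger owner cannot silently change.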

Who are the stakeholders?

Rahul has a pool of millions of customers to tap into, some of whom are utilizing the blockchain service. But there are also net new customers who are only interested in the blockchain offerings now available in the market. No matter who the customer is, though, it’s critical to have an open feedback loop with them to constantly offer the right products and solutions tailored to their needs.

“I spend a significant portion of my time with customers or with partners who are implementing solutions just to really understand what they’re trying to accomplish and making sure that we have a tight loop of feedback between the customers and ourselves. That really helps us make better decisions about how to evolve our products and services. But also this is a relatively new category for everybody. And so learning about what’s working and where customers need help, that’s incredibly valuable.”

The world of data lakes

A data lake is a place where customers can bring all of their data, regardless of how it is formatted or where it came from, store it, and set rules for who can access and use it. Data lakes differ from databases: in a database, the data is structured in tables and you have to decide what to keep. A lake offers much more freedom to store anything and everything. In today’s world, where the amount of data is exploding and more of it is unstructured, the need to store all that data somewhere is a major pain point for many organizations.

Data lakes come with both benefits and challenges. For example, it is easier to get data in, but it can be harder to make sense of what you have. People want to understand what data they have and to give broad access to it to advance their company, so there are technologies on offer to make that possible. Data transformation is another perk of a data lake: you can automatically transform your data into the form you need, and you can eliminate duplicates to clean up your data set.
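To make the transformation and deduplication idea concrete, here is a small sketch under stated assumptions. The two raw sources, the `email` key used to spot duplicates, and the function names are invented for illustration and do not come from the episode. The sketch reads records that arrived in different formats, normalizes them into one shape, and drops repeats.

```python
import csv
import io
import json

# Illustrative raw data landing in the "lake" in two different formats.
raw_json = (
    '[{"Email": "Ana@Example.com ", "name": "Ana"},'
    ' {"Email": "bo@example.com", "name": "Bo"}]'
)
raw_csv = "email,name\nana@example.com,Ana\ncy@example.com,Cy\n"


def normalize(row: dict) -> dict:
    """Lower-case field names and trim whitespace from values."""
    return {key.strip().lower(): str(value).strip() for key, value in row.items()}


def load_all() -> list:
    """Read both sources and convert every record to the same shape."""
    rows = json.loads(raw_json)
    rows += list(csv.DictReader(io.StringIO(raw_csv)))
    return [normalize(row) for row in rows]


def deduplicate(rows: list, key: str = "email") -> list:
    """Keep the first record seen for each case-insensitive key value."""
    seen = set()
    unique = []
    for row in rows:
        marker = row[key].lower()
        if marker not in seen:
            seen.add(marker)
            unique.append(row)
    return unique


clean = deduplicate(load_all())
```

Here `clean` holds one record each for Ana, Bo, and Cy, because the two Ana entries share an email address once case and whitespace are ignored. Choosing which field identifies a duplicate is the real design decision, and it depends on the data set.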

Rahul believes that the use cases for data lakes are going to continue to explode, so there will be more movement toward the technology. There will also be an opportunity to bring more machine learning and automation into the fold. 

“I think the use cases are really only going to continue to explode. If you think about it on the data side, we’re continuing to generate more and more data. Today where we all have cell phones and smart devices, but things like smartwatches are now a huge category. You’re going to have sensors embedded in more and more things. And so that volume of data that’s coming is going to continue to grow. The other thing that we’re seeing is the need to protect data and to understand who accessed it is also going to continue to grow and evolve. And you can see that with the various regulations that are popping up around the world around privacy and around who has access to what. So what you have is data volume exploding. You have the need for more control of data, but I think you also see customers realizing that data is a key strategic asset for their businesses. And so they need to be able to operate with data and use it to help them improve their businesses and drive competitive advantage while still complying with everything that’s related to regulation and data protection. And so I think the use cases are many fold and they will grow dramatically.”

Employee experience

Rahul believes that this technology can help employees as much as it helps customers. Using ledgers and data lakes, employees can see how resources are being used and have easy ways to access and understand data. Applying these technologies internally is beneficial because it lets teams iterate faster and provide more value to customers.



Episode 114